Alright, let’s talk about my recent deep dive into that Oregon Santa Clara prediction thing. I’m no pro, just a guy who likes to tinker, so bear with me.

The Setup
First off, I was curious. Saw some buzz about folks trying to predict game outcomes. Figured, why not give it a shot? I started by grabbing historical data. Think game scores, team stats, all that jazz. Scraped it from a couple of sports websites – a bit tedious, I won’t lie. Used Python and BeautifulSoup, nothing fancy.
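A minimal sketch of the scraping step, assuming the results sit in a plain HTML table. The markup, the `results` id, the column order, and the team names below are all made up for illustration; a real sports site will look different, and you would fetch the page first instead of using an inline string.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment standing in for a real results page.
SAMPLE_HTML = """
<table id="results">
  <tr><th>Date</th><th>Home</th><th>Away</th><th>Home Pts</th><th>Away Pts</th></tr>
  <tr><td>2023-11-10</td><td>Team A</td><td>Team B</td><td>81</td><td>70</td></tr>
  <tr><td>2023-11-14</td><td>Team B</td><td>Team A</td><td>66</td><td>74</td></tr>
</table>
"""

def parse_games(html):
    """Turn each data row of the (assumed) results table into a dict."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="results")
    games = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) != 5:
            continue  # ignore rows that don't match the expected layout
        date, home, away, home_pts, away_pts = cells
        games.append({
            "date": date,
            "home": home,
            "away": away,
            "home_pts": int(home_pts),
            "away_pts": int(away_pts),
        })
    return games

games = parse_games(SAMPLE_HTML)
```

Skipping rows that don't have the expected number of cells is a cheap way to survive the odd ad row or subtotal line that real tables tend to contain.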
Data Cleaning (Ugh)
Next up, cleaning. Oh boy, the cleaning. Dates were all over the place, team names inconsistent, missing data…the works. Spent a good chunk of time wrangling it all into a usable format. The pandas library became my best friend. I ended up standardizing team names and filling in missing values with averages, which seemed reasonable enough.
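Here is a rough sketch of that kind of cleanup in pandas. The column names, the alias table, and the sample rows are invented for the example, so treat it as one possible shape for the cleaning step and not the exact code behind this project.

```python
import pandas as pd

# Hypothetical lookup from messy spellings to one canonical team name.
TEAM_ALIASES = {"team a": "Team A", "tm a": "Team A", "team b": "Team B"}

# Made-up rows showing the three problems: mixed dates, messy names, a gap.
raw = pd.DataFrame({
    "date": ["2023-11-10", "11/14/2023", "Nov 18, 2023"],
    "team": ["Team A ", "tm a", "TEAM B"],
    "points": [81.0, None, 70.0],
})

def clean_games(df):
    out = df.copy()
    # Parse each date on its own so mixed formats don't trip up the parser.
    out["date"] = out["date"].apply(pd.to_datetime)
    # Normalize whitespace and case, then map to the canonical name.
    out["team"] = out["team"].str.strip().str.lower().map(TEAM_ALIASES)
    # Fill missing points with the column average.
    out["points"] = out["points"].fillna(out["points"].mean())
    return out

clean = clean_games(raw)
```

Filling with the overall column average is the simplest option. A per-team average would probably be a fairer guess, at the cost of a `groupby`.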
Feature Engineering
Then came the fun part: figuring out what might actually predict the outcome. I played around with stuff like:
- Win percentage over the last 5 games
- Average points scored/allowed
- Home/away advantage
- Maybe even a simple ranking based on previous performance
I know, I know, pretty basic. But gotta start somewhere, right?
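The features listed above can be sketched with pandas roughly like this. It assumes one row per team per game, and the column names (`points_for`, `points_against`, `is_home`) and the numbers are placeholders. The one detail worth copying is `shift(1)`: each row only sees games played before it, so a game's own result never leaks into its features.

```python
import pandas as pd

# Made-up results for a single placeholder team, one row per game.
games = pd.DataFrame({
    "team": ["Team A"] * 6,
    "date": pd.date_range("2023-11-01", periods=6, freq="7D"),
    "points_for": [70, 65, 80, 60, 75, 90],
    "points_against": [60, 70, 75, 66, 70, 80],
    "is_home": [1, 0, 1, 0, 1, 0],  # home/away flag, already usable as a feature
})

def add_features(df, window=5):
    df = df.sort_values(["team", "date"]).reset_index(drop=True)
    df["won"] = (df["points_for"] > df["points_against"]).astype(int)
    by_team = df.groupby("team")
    # Win percentage over the previous `window` games (current game excluded).
    df["win_pct_last5"] = by_team["won"].transform(
        lambda s: s.shift(1).rolling(window, min_periods=1).mean()
    )
    # Running averages of points scored and allowed before each game.
    df["avg_pts_for"] = by_team["points_for"].transform(
        lambda s: s.shift(1).expanding().mean()
    )
    df["avg_pts_against"] = by_team["points_against"].transform(
        lambda s: s.shift(1).expanding().mean()
    )
    return df

features = add_features(games)
```

A team's first game has no history, so its feature values come out as NaN. Those rows need to be dropped or filled before training.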
Model Time

For the model, I went with something simple: logistic regression. Seemed like a good starting point for a binary outcome (win or lose). Split the data into training and testing sets, then trained the model with scikit-learn. Again, nothing too crazy. Did some hyperparameter tuning with GridSearchCV to find the best value of C (the inverse regularization strength).
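A minimal sketch of that training setup with scikit-learn. The data here is synthetic (random numbers with a win/lose label derived from them), and the grid of C values and the scaling step are assumptions, since the post doesn't say which ones were used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real feature table: 200 games, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit logistic regression.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Candidate values for C; smaller C means stronger regularization.
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10, 100]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_C = search.best_params_["logisticregression__C"]
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out games
```

Putting the scaler inside the pipeline means it is refit on each cross-validation fold, so the validation folds never influence the scaling.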
The Big Prediction
Finally, time to make the prediction. Fed the model the data for the Oregon Santa Clara game. Drumroll… it predicted Oregon would win.
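For a single upcoming game, the prediction step might look something like this. The tiny model is trained on synthetic data just so the example runs on its own, and the feature values for the upcoming game are placeholders, not the real Oregon or Santa Clara numbers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data so the sketch is self-contained.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 3))
y_train = (X_train[:, 0] > 0).astype(int)  # 1 = win, 0 = loss

model = LogisticRegression().fit(X_train, y_train)

# One row of placeholder features for the game to predict.
upcoming = np.array([[0.8, -0.2, 1.0]])

# predict_proba columns follow model.classes_, so look up the "win" column.
win_column = list(model.classes_).index(1)
win_prob = model.predict_proba(upcoming)[0, win_column]
predicted = model.predict(upcoming)[0]
```

Looking at `win_prob` and not just the predicted label is useful: a 51% pick and a 95% pick both come out as "win", but they deserve very different levels of trust.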
Did it work?
Well… Oregon DID win! Beginner’s luck? Probably. One correct pick doesn’t say much about how good the model really is. But hey, it was still pretty cool.
What I learned
This whole thing was more about the process than the result. I learned a ton about:
- Data scraping and cleaning: Seriously, data is messy.
- Feature engineering: Thinking about what actually matters.
- Basic machine learning models: Logistic Regression is pretty neat.
Next Steps

Definitely not stopping here. I’m thinking of trying more advanced models, maybe a Random Forest or something. Also want to dig deeper into feature engineering – things like player stats, coaching changes, all that could play a role. This was a fun little project. Maybe I’ll actually become a decent predictor someday. Who knows?