In my first attempt at a how-to guide, I delve into linear regression – a tried and tested method of predictive analysis. It’s a light weight method with a fast implementation time; making it ideal to use as a benchmark for more complex learning algorithms. The analysis is univariate – single input to prediction. Scenarios like this aren’t very realistic but they make it easy to graphically visualise the process. The steps followed in this guide form the backbone, and starting point, of a machine learning pipeline.

The model tries to predict user rating for any video game, given the critic score. It’s trained using the Kaggle “Video Game Sales with Ratings” data set using the python numerical computation library NumPy. The entire process – including descriptions, diagrams, code, and outputs – was documented in a Jupyter notebook. An HTML export of the notebook can be found at the link below.

Univariate Linear Regression

Share this article