The least squares method
In the early nineteenth century the mathematician Carl Friedrich Gauss was asked by astronomers for help with a calculation: the newly discovered asteroid Ceres had been lost from sight for a few weeks, and Gauss was asked to estimate its position. His method was to fit a trend to the observed positions and to extrapolate it in order to predict the actual position of the asteroid.
In fig. 2 we see how this method works. The red dots are the observed data. The problem is to find a line (in blue) such that the distances (in green) are minimal. The calculation consists of summing the squares of these distances (squaring because the square of a negative number is positive, and some of these distances, taken as differences between observed and predicted values, are negative). The parameters of the blue line are then calculated such that the total sum of these squares is minimal. This means that for any other line this sum is always greater. In the course of this web page we will use the term "deviation" for this sum of squares divided by the number of data points. In other words, the deviation can be seen as the mean squared distance of the data points to the optimal regression line. In the extreme case where this number is zero, all points would lie exactly on the line.
Fig. 2 The distances (in green) of the data points (in red) to the estimated line (in blue)