Re: [Offtopic] Line fitting [was Re: Numpy outlier removal]

Terry Reedy Tue, 08 Jan 2013 01:11:33 -0800

On 1/7/2013 8:23 PM, Steven D'Aprano wrote:

On Mon, 07 Jan 2013 22:32:54 +0000, Oscar Benjamin wrote:

An example: Earlier today I was looking at some experimental data. A
simple model of the process underlying the experiment suggests that two
variables x and y will vary in direct proportion to one another and the
data broadly reflects this. However, at this stage there is some
non-normal variability in the data, caused by experimental difficulties.
A subset of the data appears to closely follow a well defined linear
pattern but there are outliers and the pattern breaks down in an
asymmetric way at larger x and y values. At some later time either the
sources of experimental variation will be reduced, or they will be
better understood but for now it is still useful to estimate the
constant of proportionality in order to check whether it seems
consistent with the observed values of z. With this particular dataset I
would have wasted a lot of time if I had tried to find a computational
method to match the line that to me was very visible so I chose the line
visually.



If you mean:

"I looked at the data, identified that the range a < x < b looks linear
and the range x > b does not, then used least squares (or some other
recognised, objective technique for fitting a line) to the data in that
linear range"

then I'm completely cool with that.

If both x and y are measured values, then regressing x on y and y on x
will give different answers, and both will be wrong in that *neither*
will be the best answer for the relationship between them. Oscar did not
specify whether either was an experimentally set input variable.
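To make the point about the two regressions concrete, here is a minimal
sketch in plain Python. The data values and the function names are made
up for illustration; they are not Oscar's data. It fits y on x
(minimizing vertical deviations) and x on y (minimizing horizontal
deviations), then expresses both results as a slope dy/dx so they can be
compared.

```python
def slope_y_on_x(xs, ys):
    """Slope b of the least-squares line y = a + b*x (vertical residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def slope_x_on_y(xs, ys):
    """Regress x on y (horizontal residuals), reported as dy/dx."""
    # Swapping the arguments gives dx/dy; invert it to compare with dy/dx.
    return 1.0 / slope_y_on_x(ys, xs)


# Hypothetical noisy measurements of two roughly proportional quantities.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.4, 3.6, 5.3]

b_yx = slope_y_on_x(xs, ys)
b_xy = slope_x_on_y(xs, ys)
print("y on x:", b_yx)
print("x on y:", b_xy)
```

The two slopes agree only when the points lie exactly on a line. With
any scatter and a positive correlation, the y-on-x slope comes out
smaller than the x-on-y slope.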

But that is not fitting a line by eye, which is what I am talking about.

With the line constrained to go through 0,0, a line eyeballed with a
clear ruler could easily be better than either regression line, as a
human will tend to minimize the deviations *perpendicular to the line*,
which is the proper thing to do (assuming both variables are measured in
the same units).
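For anyone who wants to compute what the ruler approximates, here is a
rough sketch of a perpendicular (orthogonal) fit for a line through the
origin, next to the two ordinary regressions through the origin. The
data and the function names are invented for illustration. It assumes
both variables are in the same units with comparable measurement error,
and that the best line is not vertical.

```python
import math


def slopes_through_origin(xs, ys):
    """Return (y-on-x, x-on-y, perpendicular) slopes for y = b*x."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b_yx = sxy / sxx  # minimizes vertical deviations
    b_xy = syy / sxy  # minimizes horizontal deviations, as dy/dx
    # The perpendicular fit points along the direction that maximizes the
    # sum of squared projections; its angle has this closed form.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    b_perp = math.tan(theta)
    return b_yx, b_xy, b_perp


def perp_sum_sq(b, xs, ys):
    """Sum of squared perpendicular distances from the points to y = b*x."""
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / (1.0 + b * b)


# Hypothetical noisy measurements of two roughly proportional quantities.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.4, 3.6, 5.3]

b_yx, b_xy, b_perp = slopes_through_origin(xs, ys)
for name, b in [("y on x", b_yx), ("x on y", b_xy), ("perpendicular", b_perp)]:
    print(name, b, perp_sum_sq(b, xs, ys))
```

For this data the perpendicular slope falls between the two regression
slopes and gives the smallest sum of squared perpendicular distances of
the three.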


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
