On 10/12/06, Charles R Harris <[EMAIL PROTECTED]> wrote:


On 10/12/06, Charles R Harris < [EMAIL PROTECTED]> wrote:


On 10/12/06, Greg Willden < [EMAIL PROTECTED]> wrote:
On 10/12/06, Charles R Harris <[EMAIL PROTECTED] > wrote:
I'm guessing that the rcond number in the lstsq version (default 1e-10) is the difference. Generally the lstsq version should work better than the MPL version because at*a is not as well conditioned and vandermonde matrices are notoriously ill conditioned anyway for higher degree fits. It would help if you attached the data files in ascii form unless they happen to contain thousands of data items. Just the x will suffice, zip it up if you have to.


Here are the files.

Since the two algorithms behave differently and each has it place then can both be included in numpy?
i.e. numpy.polyfit(x,y,N, mode='alg1')
numpy.polyfit (x,y,N, mode='alg2')

replacing alg1 and alg2 with meaningful names.

The polyfit function looks seriously busted. If I do the fits by hand I get the same results using the (not so hot) MPL version or lstsq. I don't know what the problem is. The docstring is also incorrect for the method. Hmmm...

Polyfit seems overly conservative in its choice of rcond.

In [101]: lin.lstsq(v,y,1e-10)[0]
Out[101]:
array([  5.84304475e-07,  -5.51513630e-03,   1.51465472e+01 ,
         3.05631361e-02])
In [107]: polyfit(x,y,3)
Out[108]:
array([  5.84304475e-07,  -5.51513630e-03,   1.51465472e+01,
         3.05631361e-02])

Compare

In [112]: lin.lstsq(v,y,1e-12)[0]
Out[112]:
array([ -5.42970700e-07,   1.61425067e-03,   1.99260667e+00,
         6.51889107e+03])

In [113]: dot(lin.inv(vtv),dot(v.T,y))
Out[113]:
array([ -5.42970700e-07,   1.61425067e-03,   1.99260667e+00,
         6.51889107e+03])
 
So the default needs to be changed somewhere. Probably polyfit shoud accept rcond as a keyword. Where the actual problem lies is a bit obscure as the normal rcond default for lin.lstsq is 1e-12. Maybe some sort of import error somewhere down the line.

And here is the location of the problem in numpy/linalg/linalg.py :

def lstsq(a, b, rcond=1.e-10):
 
The 1e-10 is a bit conservative. On the other hand, I will note that the condition number of the dot(V^T ,V) matrix is somewhere around 1e22, which means in general terms that you need around 22 digits of accuracy. Inverting it only works sorta by accident in the current case. Generally, using Vandermonde matrices and polynomial fits it a bad idea when the dynamic range of the interval gets large and the degree gets up around 4-5 as it leads to ill conditioned sets of equations. When you really need the best start with chebychev polynomials or, bestest, compute a set of polynomials orthogonal over the sample points. Anyway, I think rcond should be something like 1e-12 or 1e-13 by default and be available as a keyword in the polyfit function. If no one complains I will make this change, although it is just a bandaid and things will fall apart again as soon as you call polyfit(x,y,4).

Chuck


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion

Reply via email to