As of NumPy v1.7, numpy.polyfit includes an option for providing weighting
to data to be fit.  It's a welcome addition, but the implementation seems a
bit non-standard, perhaps even wrong, and I wonder if someone can enlighten
me.

1. The documentation does not specify what the weighting array "w" is
supposed to be.  After some fooling around I figured out that it is
1/sigma, where sigma is the standard deviation uncertainty "sigma" in the y
data.  A better implementation, which would be consistent with how
weighting is done in scipy.optimize.curve_fit, would be to specify "sigma"
directly, rather than "w".  This is typically what the user has at hand,
this syntax is more transparent, and it would make polyfit consistent with
the nonlinear fitting routine scipy.optimize.curve_fit.

2. The definition of the covariance matrix in polyfit is peculiar (or maybe
wrong, I'm not sure).  Towards the end of the code for polyfit, the
standard covariance matrix is calculated and given the variable name
"Vbase".  However, instead of returning Vbase as the covariance matrix,
polyfit returns the quantity Vbase * fac, where fac = resids / (len(x) -
order - 2.0).  fac scales as N, the number of data points, so the
covariance matrix is about N times larger than it should be for large
values of N; the increase is smaller for small N (of order 10).  Perhaps
the authors of polyfit want to use a more conservative error estimate, but
the references given in the poyfit code make no mention of it.  I think it
would be much better to return the standard definition of the covariance
matrix (see, for example, the book Numerical Recipes).

David Pine
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to