Hi, I hope this is the appropriate place to ask something like this; otherwise please let me know (or feel free to ignore this). I also hope that I did not misunderstand something or make some silly mistake. If so, please let me know as well!
TL;DR: When scaling the covariance matrix based on the residuals, scipy.optimize.curve_fit uses a factor of chisq(popt)/(M-N) (with M = number of points, N = number of parameters), while numpy.polyfit uses chisq(popt)/(M-N-2). I am wondering which is correct.

I am somewhat confused about the different results I am getting for the covariance matrix of a simple linear fit when comparing `scipy.optimize.curve_fit` and `numpy.polyfit`. I am aware that `curve_fit` solves the more general non-linear problem numerically, while `polyfit` finds an analytical solution to the linear problem. However, both converge to the same solution, so I suspect that this difference is not important here. The difference I am curious about is not in the returned parameters but in the estimate of the corresponding covariance matrix.

As I understand it, there are two different ways to estimate this matrix: either based on the absolute values of the provided uncertainties, or by interpreting those only as weights and then scaling the matrix so that the reduced chisq comes out as unity. To that end, `curve_fit` has the parameter `absolute_sigma` (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html):

"absolute_sigma : bool, optional
If True, sigma is used in an absolute sense and the estimated parameter covariance pcov reflects these absolute values. If False, only the relative magnitudes of the sigma values matter. The returned parameter covariance matrix pcov is based on scaling sigma by a constant factor. This constant is set by demanding that the reduced chisq for the optimal parameters popt when using the scaled sigma equals unity. In other words, sigma is scaled to match the sample variance of the residuals after the fit. Mathematically, pcov(absolute_sigma=False) = pcov(absolute_sigma=True) * chisq(popt)/(M-N)"

The `polyfit` documentation (https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html), on the other hand, does not say anything about how the covariance matrix is estimated. To my understanding, its default behavior should correspond to `absolute_sigma=False` for `curve_fit`. As `polyfit` takes a weight parameter instead of an uncertainty parameter, I guess the difference in default behavior is not that surprising.

However, even when specifying `absolute_sigma=False`, `curve_fit` and `polyfit` produce different covariance matrices, as the applied scaling factors are chisq(popt)/(M-N-2) for `polyfit`
(https://github.com/numpy/numpy/blob/6a58e25703cbecb6786faa09a04ae2ec8221348b/numpy/lib/polynomial.py#L598-L605)
and chisq(popt)/(M-N) for `curve_fit`
(https://github.com/scipy/scipy/blob/607a21e07dad234f8e63fcf03b7994137a3ccd5b/scipy/optimize/minpack.py#L781-L782).

The argument given in a comment on the scaling in `polyfit` is:

"Some literature ignores the extra -2.0 factor in the denominator, but it is included here because the covariance of Multivariate Student-T (which is implied by a Bayesian uncertainty analysis) includes it. Plus, it gives a slightly more conservative estimate of uncertainty."

Honestly, though, in a quick search I was not able to find any literature that does not ignore this extra "factor". But obviously, I could very well be misunderstanding something. Nonetheless, as `curve_fit` ignores it as well, I was wondering whether those two shouldn't give consistent results, and if so, which would be the correct one.

Best,
Jonathan
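P.S.: Here is a minimal, self-contained script that should reproduce the discrepancy. It is a sketch of mine, not taken from either library; the model, data, and seed are arbitrary, and the exact factors of course assume the NumPy/SciPy revisions linked above (later releases may change the scaling).

    import numpy as np
    from scipy.optimize import curve_fit

    # Synthetic data for a straight-line fit; the numbers are arbitrary
    # and only serve to compare the two covariance estimates.
    rng = np.random.RandomState(0)
    x = np.linspace(0.0, 10.0, 20)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

    M = x.size  # number of points
    N = 2       # number of parameters of a straight line

    # curve_fit with the default absolute_sigma=False:
    # pcov is scaled by chisq(popt) / (M - N).
    popt_cf, pcov_cf = curve_fit(lambda x, a, b: a * x + b, x, y)

    # polyfit with cov=True: at the revision linked above, the
    # covariance is instead scaled by chisq(popt) / (M - N - 2).
    popt_pf, pcov_pf = np.polyfit(x, y, 1, cov=True)

    # Both agree on the fitted parameters [slope, intercept] ...
    print(popt_cf, popt_pf)

    # ... but the covariance matrices differ elementwise by the
    # constant factor (M - N) / (M - N - 2), here 18/16 = 1.125.
    print(pcov_pf / pcov_cf)
    print((M - N) / (M - N - 2.0))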