On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote: > All of the statistical packages that I am currently using and have used in > the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the > sqrt(1/(n-1)) normalization, which gives a result that is unbiased when > sampling from a normally-distributed population. NumPy uses the sqrt(1/n) > normalization. I'm currently using the following code to calculate standard > deviations, but would much prefer if this could be fixed in NumPy itself: > > def mystd(x=numpy.array([]), axis=None): > """This function calculates the standard deviation of the input using the > definition of standard deviation that gives an unbiased result for > samples > from a normally-distributed population.""" > > xd= x - x.mean(axis=axis) > return sqrt( (xd*xd).sum(axis=axis) / (numpy.size(x,axis=axis)-1.0) ) > Anne Archibald has suggested a work-around. Perhaps ddof could be set, by default to 1 as other values are rarely required.
Where the distribution of a variate is not known a priori, then I believe that it can be shown that the n-1 divisor provides the best estimate of the variance. Colin W. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion