Re: [Numpy-discussion] non-standard standard deviation

Bruce Southey Fri, 04 Dec 2009 07:55:17 -0800

On 12/04/2009 06:18 AM, yogesh karpate wrote:

@ Pauli and @ Colin:
Sorry for the late reply. I was busy with some other assignments.
# As far as normalization by (n) is concerned, the common assumption is that the population is normally distributed and that the population size is large enough to fit the normal distribution. But this standard deviation, when applied to a small population, tends to be too low, and is therefore called biased.
# The correction known as Bessel's correction exists for the standard deviation of small samples, i.e. normalization by (n-1).
# In "Electrical and Electronic Measurements and Instrumentation" by A.K. Sawhney, the first chapter, "Fundamentals of Measurements", shows that for N=16 the standard deviation was normalized by (n-1)=15.
# While I was learning statistics in my course, the instructor would advise taking n=20 for normalization by (n-1).
# Probability and Statistics in the Schaum's Outline series is good reading.
Regards
~ymk
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Hi,

Basically, all that I see with these arbitrary values is that you are relying on the 'central limit theorem' (http://en.wikipedia.org/wiki/Central_limit_theorem). Really, the issue in using these values is how much statistical bias you will tolerate, especially in its impact on how the estimate is used, because uses of the variance (such as in statistical tests) tend to be more influenced by bias than the estimate of variance itself. (Of course, many features rely on asymptotic properties, so bias concerns are less apparent in large sample sizes.)
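
To make the size of that bias concrete, here is a small simulation sketch. It assumes normally distributed data with a true variance of 1; the function name average_variance_estimates and the sample sizes are only illustrative choices (16 and 20 echo the values quoted above). It uses only the Python standard library and averages both estimators over many repeated samples.

```python
import random


def average_variance_estimates(n, trials=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates
    over many samples of size n drawn from a standard normal."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        total_n += ss / n          # denominator N
        total_n1 += ss / (n - 1)   # denominator N-1 (Bessel's correction)
    return total_n / trials, total_n1 / trials


for n in (5, 16, 20):
    by_n, by_n1 = average_variance_estimates(n)
    print(f"n={n:2d}  divide by n: {by_n:.3f}  divide by n-1: {by_n1:.3f}")
```

On average the divide-by-n estimate should come out near (n-1)/n times the true variance, so the gap shrinks as n grows, which is why the choice of a cutoff such as 16 or 20 is a matter of how much bias one tolerates.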

Obviously the default relies on the developers' background and requirements. There are multiple valid variance estimators in statistics with different denominators, like N (maximum likelihood estimator), N-1 (restricted maximum likelihood estimator and certain Bayesian estimators) and Stein's (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the current default behavior is valid and documented. Consequently you cannot just have one option, or different functions (like certain programs), and Numpy's implementation actually allows you to do all of these in a single function. So I also see no reason to change, even if I have to add the ddof=1 argument; after all, 'Explicit is better than implicit' :-).
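
For reference, a minimal sketch of how the single-function approach looks in practice. The data values are made up for illustration; np.var and np.std divide by N - ddof, with ddof=0 as the default.

```python
import numpy as np

# Illustrative data only (mean 5, sum of squared deviations 32).
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

var_n = np.var(x)            # default ddof=0: denominator N
var_n1 = np.var(x, ddof=1)   # ddof=1: denominator N-1
std_n = np.std(x)            # square root of var_n
std_n1 = np.std(x, ddof=1)   # square root of var_n1

print("denominator N:  ", var_n, std_n)
print("denominator N-1:", var_n1, std_n1)
```

Here the N version gives a variance of 32/8 and the N-1 version 32/7, so the choice of estimator is stated explicitly at the call site.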


Bruce
