jiho wrote:
> On 2007-June-01, at 01:03, Tom La Bone wrote:
>> The function wtd.var(x,w) in Hmisc calculates the weighted variance of x
>> where w are the weights. It appears to me that wtd.var(x,w) should equal
>> var(x) if all of the weights are equal, but this does not appear to be
>> the case. Can someone point out to me where I am going wrong here? Thanks.
>
> The true formula of weighted variance is this one:
> http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weighvar.pdf
> But for computation purposes, wtd.var uses another definition which
> considers the weights as repeats instead of true weights. However, if
> the weights are normalized (sum to one) the two formulas are equal. If
> you consider weights as real weights instead of repeats, I would
> recommend using this option.
> With normwt=T, your issue is solved:
>
> > a=1:10
> > b=a
> > b[]=2
> > b
> [1] 2 2 2 2 2 2 2 2 2 2
> > wtd.var(a,b)
> [1] 8.68421
> # all weights equal 2 <=> there are two repeats of each element of a
> > var(c(a,a))
> [1] 8.68421
> > wtd.var(a,b,normwt=T)
> [1] 9.166667
> > var(a)
> [1] 9.166667
>
> Cheers,
>
> JiHO

The issue is what is being assumed for N in the denominator of the variance
formula, since the unbiased estimator subtracts one. Using normwt=TRUE means
you are in effect assuming N is the number of elements in the data vector,
ignoring the weights.

Frank Harrell

> ---
> http://jo.irisson.free.fr/
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University
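
To make the point about N concrete, here is a minimal base-R sketch of the two behaviours discussed in this thread. It is not the Hmisc source: weighted_var_sketch and its normalize argument are hypothetical names used only for illustration. The sketch assumes the denominator is sum(w) - 1, and that normalizing rescales the weights to sum to length(x), as the Hmisc documentation describes for normwt=TRUE.

```r
# Hypothetical helper, not the Hmisc implementation: a base-R sketch of the
# "weights as repeats" variance with an optional rescaling step.
weighted_var_sketch <- function(x, w, normalize = FALSE) {
  if (normalize) {
    # Assumption: rescale so the weights sum to length(x), which fixes N
    # at the number of elements in x regardless of the original weights.
    w <- w * length(x) / sum(w)
  }
  xbar <- sum(w * x) / sum(w)            # weighted mean
  sum(w * (x - xbar)^2) / (sum(w) - 1)   # unbiased form: N - 1 with N = sum(w)
}

a <- 1:10
b <- rep(2, 10)

v_repeats    <- weighted_var_sketch(a, b)                    # N = sum(b) = 20
v_normalized <- weighted_var_sketch(a, b, normalize = TRUE)  # N = length(a) = 10

print(all.equal(v_repeats, var(c(a, a))))   # TRUE
print(all.equal(v_normalized, var(a)))      # TRUE
```

With equal weights of 2, the weighted sum of squares is 165 in the first case and is divided by 19; after rescaling it is 82.5 and is divided by 9. That is the whole difference between 8.68421 and 9.166667 in the session quoted above.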