Jason Turner <[EMAIL PROTECTED]> writes:

> [EMAIL PROTECTED] wrote:
>
> > How do I go about generating a WEIGHTED mean (and standard error) of a
> > variable (e.g., expenditures) for each level of a categorical variable
> > (e.g., geographic region)? I'm looking for something comparable to PROC
> > MEANS in SAS with both a class and weight statement.
>
> That's two questions.
> 1) to apply a weighted mean to a vector, see ?weighted.mean
> 2) to apply a function to data grouped by categorical variable, you
>    probably need "by" or "tapply". See the help pages and examples for
>    both.

Three, actually. No one seems to have answered how to get the SD, and
that's a little more tricky. The simplest (well, the quickest) way to
get the weighted SD is to do a weighted regression analysis with just
an intercept term:

    x <- c(3,4,5); w <- c(2,5,7)   # just for testing
    summary(lm(x ~ 1, weights = w))$sigma

    # this is the square root of the weighted sum of squares on N-1 DF
    m <- weighted.mean(x, w)
    wss <- sum((x - m)^2 * w)
    sqrt(wss/2)

Notice, however, that SAS also does frequency weighting, where
(x=2.7, w=5) means that there are five observations of 2.7. In that
case, the brute-force approach is

    sd(rep(x, w))
    # which is the same as
    sqrt(wss/13)   # sum(w)-1 DF

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
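
For completeness, here is a minimal sketch of how the pieces in this
thread might be combined to get a weighted mean and both kinds of
weighted SD for each level of a grouping variable. The data frame, its
column names (region, expend, w) and the helper wstats() are all
hypothetical, made up for illustration. It uses split() and sapply()
instead of by() or tapply() only because that keeps the weights and the
values together easily; it does not attempt the standard error, and it
is not claimed to reproduce SAS output.

```r
# Hypothetical example data; names and values are illustrative only.
d <- data.frame(
  region = c("N", "N", "N", "S", "S", "S"),
  expend = c(3, 4, 5, 10, 12, 20),
  w      = c(2, 5, 7, 1, 1, 2)
)

# Hypothetical helper: weighted mean plus the two SD conventions
# discussed above, for a single group.
wstats <- function(x, w) {
  m   <- weighted.mean(x, w)
  wss <- sum(w * (x - m)^2)                 # weighted sum of squares
  c(mean    = m,
    sd_reg  = sqrt(wss / (length(x) - 1)),  # N-1 DF, as in the lm() approach
    sd_freq = sqrt(wss / (sum(w) - 1)))     # sum(w)-1 DF, frequency weights
}

# Split the row indices by region, then apply wstats() within each level.
res <- t(sapply(split(seq_len(nrow(d)), d$region),
                function(i) wstats(d$expend[i], d$w[i])))
print(res)
```

Each row of res is one region. For integer weights, the sd_freq column
should agree with sd(rep(x, w)) computed within that region, and sd_reg
with the sigma from the intercept-only weighted lm() fit.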