On Thu, 2006-10-05 at 15:44 -0700, Kaom Te wrote:
> Hello,
>
> I'm a novice user trying to figure out how to retain NA aggregate
> values. For example, given a data frame with data for 3 of the 4
> possible factor colors ("orange" is omitted from the data frame), I want
> to calculate the average height by color, but I'd like to retain the
> knowledge that "orange" is a possible factor level; it's just missing.
> Here is the example code:
>
> > data <- data.frame(color = factor(c("blue","red","red","green","blue"),
>                                    levels = c("blue","red","green","orange")),
>                     height = c(2,8,4,4,5))
> > aggregate(data$height, list(color = data$color), mean)
>   color   x
> 1  blue 3.5
> 2   red 6.0
> 3 green 4.0
>
> Instead I would like to get
>
>    color   x
> 1   blue 3.5
> 2    red 6.0
> 3  green 4.0
> 4 orange  NA
>
> Is this possible? I've read as much documentation as I can find, but am
> unable to find the solution. It seems like something people would need
> to do, so I would assume it must be built in somewhere. Or do I need to
> write my own version of aggregate?
>
> Thanks in advance,
> Kaom

If you review the Details section of ?aggregate, you will note:

  "Empty subsets are removed, ..."

Thus, one approach is:

tmp <- tapply(data$height, data$color, mean, na.rm = TRUE)

> tmp
  blue    red  green orange
   3.5    6.0    4.0     NA

DF <- data.frame(color = names(tmp), mean.height = tmp,
                 row.names = seq(along = tmp))

> DF
   color mean.height
1   blue         3.5
2    red         6.0
3  green         4.0
4 orange          NA

HTH,

Marc Schwartz

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
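
For anyone wanting to reuse the tapply() approach from the reply, it can be wrapped in a small function. The following is only a sketch: the helper name aggregate_keep_levels and its column names (level, value) are made up for illustration and are not part of base R. It relies on tapply() returning one cell per factor level, with NA for levels that have no observations.

```r
# Rebuild the example data from the original question.
data <- data.frame(
  color  = factor(c("blue", "red", "red", "green", "blue"),
                  levels = c("blue", "red", "green", "orange")),
  height = c(2, 8, 4, 4, 5)
)

# Hypothetical helper (not a base R function): summarise x by the
# grouping factor f, keeping a row for every level of f, even unused ones.
aggregate_keep_levels <- function(x, f, FUN, ...) {
  f <- as.factor(f)            # make sure levels() is defined
  res <- tapply(x, f, FUN, ...)  # empty levels come back as NA
  data.frame(level = factor(names(res), levels = levels(f)),
             value = as.vector(res),
             row.names = NULL)
}

out <- aggregate_keep_levels(data$height, data$color, mean)
print(out)
```

The level column is kept as a factor with the original level order, so "orange" stays last instead of being re-sorted alphabetically. This sketch handles a single grouping factor only; several grouping factors would need extra work.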