On Wed, 2006-08-09 at 18:07 +0200, Christian Oswald wrote:
> Dear List,
> 
> I need a grouped list with two sorts of categorical data. I have a
> data.frame like this.
>       year    cat.    b       c
> 1     2006    a1      125     212
> 2     2006    a2      256     212     
> 3     2005    a1      14      12
> 4     2004    a3      565     123
> 5     2004    a2      156     789     
> 6     2005    a1      1       456
> 7     2003    a2      786     123
> 8     2003    a1      421     569
> 9     2002    a2      425     245
> 
> I need a list with the sum of b and c for every year and every cat (a1,
> a2 or a3) in this year. I had used the tapply function to build the sum
> for every year or every cat. How can I combine the two grouping values?

Christian,

Is this what you want (using DF as your data.frame)?

> aggregate(DF[, c("b", "c")], 
            by = list(Year = DF$year, Cat = DF$cat.),
            sum)
  Year Cat   b   c
1 2003  a1 421 569
2 2005  a1  15 468
3 2006  a1 125 212
4 2002  a2 425 245
5 2003  a2 786 123
6 2004  a2 156 789
7 2006  a2 256 212
8 2004  a3 565 123

You can also reorder the results by Year and Cat:

> DF.result <- aggregate(DF[, c("b", "c")], 
                         by = list(Year = DF$year, Cat = DF$cat.), 
                         sum)

> DF.result[order(DF.result$Year, DF.result$Cat), ]
  Year Cat   b   c
4 2002  a2 425 245
1 2003  a1 421 569
5 2003  a2 786 123
6 2004  a2 156 789
8 2004  a3 565 123
2 2005  a1  15 468
3 2006  a1 125 212
7 2006  a2 256 212
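
As a rough sketch of an alternative, and assuming your version of R
provides the formula method for aggregate(), the grouping and the
reordering can be written against the column names directly. The data
frame below is rebuilt from the values in your post; the name 'res' is
only a placeholder used for illustration.

```r
# Rebuild the example data from the question (column names as posted).
DF <- data.frame(
  year = c(2006, 2006, 2005, 2004, 2004, 2005, 2003, 2003, 2002),
  cat. = c("a1", "a2", "a1", "a3", "a2", "a1", "a2", "a1", "a2"),
  b    = c(125, 256, 14, 565, 156, 1, 786, 421, 425),
  c    = c(212, 212, 12, 123, 789, 456, 123, 569, 245)
)

# Formula interface: sum b and c for each year/cat. combination.
# Assumes aggregate() has a formula method in the installed R version.
res <- aggregate(cbind(b, c) ~ year + cat., data = DF, FUN = sum)

# Sort by year, then by category.
res <- res[order(res$year, res$cat.), ]
print(res)
```

With the formula method the grouping columns keep their original names
(year and cat.) rather than the Year and Cat labels set through 'by'.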



Note that tapply() can only handle one 'X' vector at a time, whereas
aggregate() can handle multiple 'X' columns in one call. For example:

> tapply(DF$b, list(DF$year, DF$cat.), sum)
      a1  a2  a3
2002  NA 425  NA
2003 421 786  NA
2004  NA 156 565
2005  15  NA  NA
2006 125 256  NA

will give you the sum of 'b' for each combination of Year and Cat as a
2d table, but I suspect this is not the output format you want. You
also get NAs in the cells for combinations that are not present in your
data.
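
If you do want to stay with tapply(), one possible workaround, given
here only as a sketch, is to loop over the columns with lapply() and
then turn a 2d table into long format, dropping the NA cells. The data
frame is rebuilt from the values in your post; 'sums' and 'long.b' are
placeholder names chosen for illustration.

```r
# Rebuild the example data from the question (column names as posted).
DF <- data.frame(
  year = c(2006, 2006, 2005, 2004, 2004, 2005, 2003, 2003, 2002),
  cat. = c("a1", "a2", "a1", "a3", "a2", "a1", "a2", "a1", "a2"),
  b    = c(125, 256, 14, 565, 156, 1, 786, 421, 425),
  c    = c(212, 212, 12, 123, 789, 456, 123, 569, 245)
)

# tapply() takes one vector at a time, so apply it to each column in turn.
# The result is a list with one Year x Cat table per column.
sums <- lapply(DF[c("b", "c")], function(x) {
  tapply(x, list(Year = DF$year, Cat = DF$cat.), sum)
})

# Convert the table for 'b' to long format and drop the combinations
# that do not occur in the data (the NA cells).
long.b <- na.omit(as.data.frame(as.table(sums$b), responseName = "b"))
print(long.b)
```

Note that Year and Cat come back as factors in the long-format data
frame, since they are built from the dimnames of the table.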

HTH,

Marc Schwartz

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
