on 11/16/2008 08:12 PM jeffc wrote: > Hi, > > I have a data set similar to the following > > State Gender Quantity > TX Male 1 > NY Female 2 > TX Male 3 > NY Female 4 > > > I need to calculate cumulative sum of the quantity by State and Gender. The > expected output is > State Gender Quantity CumQuantity > TX Male 1 1 > TX Male 3 4 > NY Female 2 2 > NY Female 4 6 > > I highly appreciate if someone can give me some hints on solving that in R. > > Hao
I would verify this, but something along the lines of the following: > DF State Gender Quantity 1 TX Male 1 2 NY Female 2 3 TX Male 3 4 NY Female 4 do.call(rbind, lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE), function(x) cbind(x, cumsum(x$Quantity)))) which yields: State Gender Quantity cumsum(x$Quantity) NY.Female.2 NY Female 2 2 NY.Female.4 NY Female 4 6 TX.Male.1 TX Male 1 1 TX.Male.3 TX Male 3 4 To take this step by step: First, split() DF by the two factors: > split(DF, list(DF$State, DF$Gender), drop = TRUE) $NY.Female State Gender Quantity 2 NY Female 2 4 NY Female 4 $TX.Male State Gender Quantity 1 TX Male 1 3 TX Male 3 Pass that to lapply(), in which we do the cumsum() and cbind(): > lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE), function(x) cbind(x, cumsum(x$Quantity))) $NY.Female State Gender Quantity cumsum(x$Quantity) 2 NY Female 2 2 4 NY Female 4 6 $TX.Male State Gender Quantity cumsum(x$Quantity) 1 TX Male 1 1 3 TX Male 3 4 Pass that to do.call() to rbind() the results together: > do.call(rbind, lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE), function(x) cbind(x, cumsum(x$Quantity)))) State Gender Quantity cumsum(x$Quantity) NY.Female.2 NY Female 2 2 NY.Female.4 NY Female 4 6 TX.Male.1 TX Male 1 1 TX.Male.3 TX Male 3 4 See ?split, ?do.call, ?rbind and ?cumsum. If you want the exact row ordering as you had it in your post, you can alter the factor levels, otherwise they will be sorted by alpha (eg. NY before TX and Female before Male). HTH, Marc Schwartz ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.