I still haven't come up with a solution to the question below, and I have another one. I frequently find myself in a situation where I have the list of columns I want to aggregate over in the form of a vector of strings, and I have to do something like the following:
dat[, list(mean.z = mean(z)), by = eval(parse(text = sprintf("list(%s)", paste(x, collapse=","))))]

I think that's a pretty ugly solution (although it does work), but I haven't come up with anything better. Any suggestions?

Thanks.

- Elliot

On Tue, Sep 11, 2012 at 11:33 AM, Elliot Joel Bernstein <elliot.bernst...@fdopartners.com> wrote:

> I've been using this setup:
>
> > flist <- expression( list(mean.z = mean(z), sd.z = sd(z)) )
> > dat[ , eval(flist), list(x)]
>
> It works great, but there's one small catch. If I do something like
>
> > flist <- expression(list(x.per.y = sum(x) / sum(y)))
> > dat[, eval(flist), list(y)]
>
> it does the wrong thing, because sum(y) in each group is just the common
> value, rather than that value times the length. Is there any way around
> this? Obviously I could rewrite the expression if I know I'm going to be
> grouping by y, but I'd like it to be generic.
>
> Thanks.
>
> - Elliot
>
>
> On Wed, Aug 8, 2012 at 9:17 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>> On Aug 7, 2012, at 9:28 PM, arun wrote:
>>
>>> HI,
>>>
>>> Try this:
>>>
>>> fun1<-function(x,.expr){
>>> .expr<-expression(list(mean.z=mean(z),sd.z=sd(z)))
>>> z1<-eval(.expr)
>>> }
>>>
>>> #or
>>> fun1<-function(x,.expr){
>>> .expr<-expression(list(mean.z=mean(z),sd.z=sd(z)))
>>> z1<-.expr
>>> }
>>>
>>>
>>> dat[,eval(z1),list(x)]
>>> dat[,eval(z1),list(y)]
>>> dat[,eval(z1),list(x,y)]
>>>
>> I'm not seeing the connection between those functions and the data.table
>> call. (Running that code produces an error on my machine.) If the goal is
>> to have an expression result then just create it with expression(). In the
>> example:
>>
>> > flist <- expression( list(mean.z = mean(z), sd.z = sd(z)) )
>> > dat[ , eval(flist), list(x)]
>>    x      mean.z     sd.z
>> 1: 2  0.04436034 1.039615
>> 2: 3 -0.06354504 1.077686
>> 3: 1 -0.08879671 1.066916
>>
>> --
>> David.
>>
>>
>> A.K.
>>>
>>> ----- Original Message -----
>>> From: Elliot Joel Bernstein <elliot.bernst...@fdopartners.com>
>>> To: r-help@r-project.org
>>> Cc:
>>> Sent: Tuesday, August 7, 2012 5:36 PM
>>> Subject: [R] Repeated Aggregation with data.table
>>>
>>> I have been using ddply to do aggregation, and I frequently define a
>>> single aggregation function that I use to aggregate over different
>>> groups. For example,
>>>
>>> require(plyr)
>>>
>>> dat <- data.frame(x = sample(3, 100, replace=TRUE), y = sample(3, 100,
>>> replace = TRUE), z = rnorm(100))
>>>
>>> f <- function(x) { data.frame(mean.z = mean(x$z), sd.z = sd(x$z)) }
>>>
>>> ddply(dat, "x", f)
>>> ddply(dat, "y", f)
>>> ddply(dat, c("x", "y"), f)
>>>
>>> I recently discovered the data.table package, which dramatically
>>> speeds up the aggregation:
>>>
>>> require(data.table)
>>> dat <- data.table(dat)
>>>
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x)]
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(y)]
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x,y)]
>>>
>>> But I can't figure out how to save the aggregation function
>>> "list(mean.z = mean(z), sd.z = sd(z))" as a variable that I can reuse,
>>> similar to the function "f" above. Can someone please explain how to
>>> do that?
>>>
>>> Thanks.
>>>
>>> - Elliot
>>>
>>> --
>>> Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
>>> 134 Mount Auburn Street | Cambridge, MA | 02138
>>> Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> Alameda, CA, USA

--
Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
134 Mount Auburn Street | Cambridge, MA | 02138
Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com
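
On the question at the top of this thread (grouping by columns whose names are held in a vector of strings), here is one possible way to avoid eval(parse()), offered as a sketch rather than a definitive answer. As far as I know, current data.table releases accept a character vector of column names directly in "by". The variable name group.cols is made up for this example (the thread uses x, which clashes with the column x in dat), and the data are regenerated here with an arbitrary seed.

```r
library(data.table)

# Same shape of data as in the original post; the seed is arbitrary.
set.seed(1)
dat <- data.table(x = sample(3, 100, replace = TRUE),
                  y = sample(3, 100, replace = TRUE),
                  z = rnorm(100))

# Hypothetical variable holding the grouping columns as strings.
# It is deliberately not called "x", because "x" is also a column of dat.
group.cols <- c("x", "y")

# Pass the character vector straight to "by"; no eval(parse()) needed.
res <- dat[, list(mean.z = mean(z)), by = group.cols]

# Reference result with the grouping columns written out by hand.
ref <- dat[, list(mean.z = mean(z)), by = list(x, y)]

print(res)
```

The reusable-expression idiom from earlier in the thread should combine with this in the same way, e.g. dat[, eval(flist), by = group.cols], provided the grouping variable's name does not collide with a column name.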