I still haven't come up with a solution to the question below, and I have
another one. I frequently find myself in a situation where I have the list
of columns I want to aggregate over in the form of a vector of strings, and
I have to do something like the following:

dat[, list(mean.z = mean(z)), by = eval(parse(text = sprintf("list(%s)",
paste(x, collapse=","))))]

I think that's a pretty ugly solution (although it does work), but I
haven't come up with anything better. Any suggestions?
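
One possibility, offered as a sketch rather than a definitive answer: recent
versions of data.table appear to accept a character vector of column names
directly in by, which would avoid eval(parse()) altogether. The example below
assumes the same toy data as in the quoted thread. The name group.cols is
hypothetical; it was chosen so that it does not clash with a column of dat (if
the vector were itself named x, by = x would presumably be read as the column
x).

```r
library(data.table)

set.seed(1)
dat <- data.table(x = sample(3, 100, replace = TRUE),
                  y = sample(3, 100, replace = TRUE),
                  z = rnorm(100))

# Hypothetical variable holding the grouping columns as strings.
# Assumption: its name is not also a column name in dat.
group.cols <- c("x", "y")

# Pass the character vector straight to by.
res <- dat[, list(mean.z = mean(z)), by = group.cols]

# Reference result with the grouping columns written out by hand.
res.ref <- dat[, list(mean.z = mean(z)), by = list(x, y)]
```

Writing by = c("x", "y") inline should behave the same way as passing the
variable.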

Thanks.

- Elliot
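
On the earlier question quoted below (sum(y) collapsing to a single value when
grouping by y), one workaround that might keep the expression generic is to
group by a copy of the column, so that y itself stays an ordinary full-length
column inside each group. This is only a sketch under that assumption; the
column name y.grp is hypothetical and the data are the toy data from the
thread.

```r
library(data.table)

set.seed(1)
dat <- data.table(x = sample(3, 100, replace = TRUE),
                  y = sample(3, 100, replace = TRUE),
                  z = rnorm(100))

flist <- expression(list(x.per.y = sum(x) / sum(y)))

# Work on a copy so the original table is left untouched.
dat2 <- copy(dat)

# y.grp is a hypothetical duplicate of y used only for grouping.
dat2[, y.grp := y]

# y is no longer the grouping column, so sum(y) runs over every row
# of the group instead of the single group value.
res <- dat2[, eval(flist), by = y.grp]
```

The cost is an extra column, and the grouping column in the result is named
y.grp rather than y.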

On Tue, Sep 11, 2012 at 11:33 AM, Elliot Joel Bernstein <
elliot.bernst...@fdopartners.com> wrote:

> I've been using this setup:
>
>
> > flist <- expression( list(mean.z = mean(z), sd.z = sd(z)) )
> > dat[ , eval(flist), list(x)]
>
> It works great, but there's one small catch. If I do something like
>
> > flist <- expression(list(x.per.y = sum(x) / sum(y)))
> > dat[, eval(flist), list(y)]
>
> it does the wrong thing, because sum(y) in each group is just the common
> value, rather than that value times the length. Is there any way around
> this? Obviously I could rewrite the expression if I know I'm going to be
> grouping by y, but I'd like it to be generic.
>
> Thanks.
>
> - Elliot
>
>
> On Wed, Aug 8, 2012 at 9:17 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>>
>> On Aug 7, 2012, at 9:28 PM, arun wrote:
>>
>>> Hi,
>>>
>>> Try this:
>>>
>>> fun1<-function(x,.expr){
>>>   .expr<-expression(list(mean.z=mean(z),sd.z=sd(z)))
>>>  z1<-eval(.expr)
>>>  }
>>>
>>> #or
>>> fun1<-function(x,.expr){
>>>   .expr<-expression(list(mean.z=mean(z),sd.z=sd(z)))
>>>  z1<-.expr
>>>  }
>>>
>>>
>>>  dat[,eval(z1),list(x)]
>>> dat[,eval(z1),list(y)]
>>> dat[,eval(z1),list(x,y)]
>>>
>>>
>> I'm not seeing the connection between those functions and the data.table
>> call. (Running that code produces an error on my machine.) If the goal is
>> to have an expression result then just create it with expression(). In the
>> example:
>>
>> > flist <- expression( list(mean.z = mean(z), sd.z = sd(z)) )
>> > dat[ , eval(flist), list(x)]
>>    x      mean.z     sd.z
>> 1: 2  0.04436034 1.039615
>> 2: 3 -0.06354504 1.077686
>> 3: 1 -0.08879671 1.066916
>>
>> --
>> David.
>>
>>
>>> A.K.
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Elliot Joel Bernstein <elliot.bernst...@fdopartners.com>
>>> To: r-help@r-project.org
>>> Cc:
>>> Sent: Tuesday, August 7, 2012 5:36 PM
>>> Subject: [R] Repeated Aggregation with data.table
>>>
>>> I have been using ddply to do aggregation, and I frequently define a
>>> single aggregation function that I use to aggregate over different
>>> groups. For example,
>>>
>>> require(plyr)
>>>
>>> dat <- data.frame(x = sample(3, 100, replace=TRUE), y = sample(3, 100,
>>> replace = TRUE), z = rnorm(100))
>>>
>>> f <- function(x) { data.frame(mean.z = mean(x$z), sd.z = sd(x$z)) }
>>>
>>> ddply(dat, "x", f)
>>> ddply(dat, "y", f)
>>> ddply(dat, c("x", "y"), f)
>>>
>>> I recently discovered the data.table package, which dramatically
>>> speeds up the aggregation:
>>>
>>> require(data.table)
>>> dat <- data.table(dat)
>>>
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x)]
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(y)]
>>> dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x,y)]
>>>
>>> But I can't figure out how to save the aggregation function
>>> "list(mean.z = mean(z), sd.z = sd(z))" as a variable that I can reuse,
>>> similar to the function "f" above. Can someone please explain how to
>>> do that?
>>>
>>> Thanks.
>>>
>>> - Elliot
>>>
>>> --
>>> Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
>>> 134 Mount Auburn Street | Cambridge, MA | 02138
>>> Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>
>> David Winsemius, MD
>> Alameda, CA, USA
>>
>>
>
>
> --
> Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
> 134 Mount Auburn Street | Cambridge, MA | 02138
> Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com
>
>


-- 
Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
134 Mount Auburn Street | Cambridge, MA | 02138
Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com

