Depending on the size of the dataframe and the operations you are
trying to perform, either aggregate or ddply may be faster.  In the
function below, df has the same structure as your dataframe.

The code below times aggregate and ddply on dataframes of
increasing size.
============================
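library(plyr)

# Time aggregate() and ddply() on a dataframe with 45*n rows in three groups.
CompareAggregation <- function(n) {
    df <- data.frame(id = c(rep("A", 15*n), rep("B", 10*n), rep("C", 20*n)))
    df$fltval <- rnorm(nrow(df))
    df$intval <- rbinom(nrow(df), 1000, 0.8)
    # Sum both value columns per id with aggregate()
    t1 <- system.time(zz1 <- aggregate(list(fltsum = df$fltval, intsum = df$intval),
                                       list(id = df$id), sum))
    # Same sums with ddply()
    t2 <- system.time(zz2 <- ddply(df, .(id), function(x)
        c(fltsum = sum(x$fltval), intsum = sum(x$intval))))
    # Element 1 of system.time() is user CPU time (element 3 is elapsed time)
    return(c(agg = t1[[1]], ddply = t2[[1]]))
}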

z <- c(10^seq(1,5))
names(z) <- as.character(z)
res.df <- t(data.frame(lapply(z, CompareAggregation)))
print(res.df)
============================
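
To see what the timed aggregate call actually returns, here is a small
sketch on a five-row dataframe.  The name "small" and its values are
made up for illustration; only the column layout matches df above.  It
uses base R only, so it runs without plyr.

```r
# Hypothetical toy data with the same columns as df in CompareAggregation
small <- data.frame(id     = c("A", "A", "B", "C", "C"),
                    fltval = c(0.5, 1.5, -1.0, 2.0, 0.25),
                    intval = c(800L, 810L, 790L, 805L, 795L))

# Same call shape as in the benchmark: one sum per value column per id
sums <- aggregate(list(fltsum = small$fltval, intsum = small$intval),
                  list(id = small$id), sum)
print(sums)
```

The result has one row per id and the columns id, fltsum and intsum,
which is the shape the ddply call should reproduce.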


On Apr 14, 11:43 am, "arnaud Gaboury" <arnaud.gabo...@gmail.com>
wrote:
> Thank you for your help. The best I have found is to use the ddply function.
>
> > pose
