"Young Cho" <[EMAIL PROTECTED]> writes:

> I have data.frames with IDs and multiple columns. Because some of the
> IDs show up more than once, I need to sum up the column values to
> create a new data frame with unique IDs.
>
> I hope there are some cheaper ways of doing it...  Because the
> dataframe is huge, it takes almost an hour to do the task.  Thanks
> so much in advance!

Does this do what you want in a faster way?

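sum_dup <- function(df) {
    ## Row indices grouped by ID (one list element per unique ID)
    idIdx <- split(seq_len(nrow(df)), as.character(df$ID))
    ## Every column except ID gets summed
    whID <- match("ID", names(df))
    colNms <- names(df)[-whID]
    ans <- lapply(colNms, function(cn) {
        col <- df[[cn]]
        unlist(lapply(idIdx, function(x) sum(col[x])),
               use.names=FALSE)
    })
    ## Build the data.frame by hand; the unique IDs become the row names
    attributes(ans) <- list(names=colNms,
                            row.names=names(idIdx),
                            class="data.frame")
    ans
}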
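
If all the non-ID columns are numeric, base R's rowsum() might also be
worth a try, since it does the per-group summing in one call. A minimal
sketch on made-up data (the columns x and y and their values are
hypothetical, not from your data):

```r
## Hypothetical toy data; only the ID column name comes from the post
df <- data.frame(ID = c("a", "b", "a", "c", "b"),
                 x  = c(1, 2, 3, 4, 5),
                 y  = c(10, 20, 30, 40, 50))

## rowsum() adds up the rows of each column within each level of 'group';
## the unique IDs end up as the row names of the result
res <- rowsum(df[, names(df) != "ID", drop = FALSE],
              group = as.character(df$ID))
res
```

As with sum_dup, the result has one row per unique ID and no ID column;
the IDs are in the row names. I have not timed this on data of your
size, so treat it as something to benchmark rather than a guaranteed win.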


-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
