On Sat, 29 Jul 2006, Kevin B. Hendricks wrote:

> Hi Bill,
>
>>>>    sum : igroupSums
>
> Okay, after thinking about this ...
>
> # assumes i is the small integer factor with n levels
> # v is some long vector
> # no sorting required
>
> igroupSums <- function(v,i) {
>   sums <- rep(0,max(i))
>   for (j in seq_along(v)) {
>       sums[[i[[j]]]] <- sums[[i[[j]]]] + v[[j]]
>   }
>   sums
> }
>
> If written in Fortran or C, this might be faster than using split().  It
> is at least linear in time in the length of the vector v.

For sums you should look at rowsum().  It uses a hash table in C and, last 
time I looked, was faster than using split(). It returns a one-column matrix 
with one row per group rather than a plain vector, but that is easily fixed.
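
A minimal sketch of the rowsum() suggestion, using made-up data (the values of v and i below are illustrative only and not from the thread). In current versions of R, rowsum() on a vector gives a one-column matrix with one row per distinct group value, so dropping the matrix dimension yields a named vector of per-group sums:

```r
# Illustrative data only: v is the long vector, i the small integer factor.
v <- c(1.5, 2.0, 3.5, 4.0, 5.0)
i <- c(1L, 2L, 1L, 3L, 2L)

# rowsum() sums v within each distinct value of i.
# For a vector input the result is a one-column matrix whose
# rownames are the group values (sorted, since reorder = TRUE by default).
m <- rowsum(v, i)

# Drop the matrix dimension to get a named vector of per-group sums.
sums <- m[, 1]
print(sums)
```

Unlike the igroupSums loop quoted above, this only reports groups that actually occur in i; a level in 1..max(i) with no observations gets no row rather than a zero.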

The same approach would work for min, max, range, count, mean, but not for 
arbitrary functions.
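
To make that concrete, here is a rough sketch in plain R of a single pass that keeps a running count, sum, min and max per group. The helper name igroupStats and its n argument are hypothetical and not part of R or of this thread. Like the quoted loop, it assumes i holds integers in 1..n and that v is non-empty; a C version would follow the same pattern.

```r
# Hypothetical helper (not from the thread): one pass over v, keeping a
# running count, sum, min and max per group. Assumes i holds integers in 1..n.
igroupStats <- function(v, i, n = max(i)) {
  count <- integer(n)
  total <- numeric(n)
  lo <- rep(Inf, n)    # running minimum per group
  hi <- rep(-Inf, n)   # running maximum per group
  for (j in seq_along(v)) {
    g <- i[[j]]
    count[g] <- count[g] + 1L
    total[g] <- total[g] + v[[j]]
    if (v[[j]] < lo[g]) lo[g] <- v[[j]]
    if (v[[j]] > hi[g]) hi[g] <- v[[j]]
  }
  # The mean follows from sum and count; the range is the (min, max) pair.
  data.frame(group = seq_len(n), count = count, mean = total / count,
             min = lo, max = hi)
}

stats <- igroupStats(c(1.5, 2.0, 3.5, 4.0, 5.0), c(1L, 2L, 1L, 3L, 2L))
print(stats)
```

Each of these statistics can be updated from a fixed-size running summary as values arrive. Something like a median cannot, since it needs all of a group's values at once, which is why the approach does not extend to arbitrary functions.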

        -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
