On Sat, 29 Jul 2006, Kevin B. Hendricks wrote:
Hi Bill,
sum : igroupSums
Okay, after thinking about this ...
# assumes i is the small integer factor with n levels
# v is some long vector
# no sorting required
igroupSums <- function(v,i) {
sums <- rep(0,max(i))
for (j in
Hi Thomas,
Here is a comparison of performance times from my own igroupSums
versus using split and rowsum:
x <- rnorm(2e6)
i <- rep(1:1e6,2)
unix.time(suma <- unlist(lapply(split(x,i),sum)))
[1] 8.188 0.076 8.263 0.000 0.000
names(suma) <- NULL
unix.time(sumb <- igroupSums(x,i))
[1]
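For anyone wanting to repeat this kind of comparison, here is a minimal sketch of timing the split approach against rowsum. It is not the code from the message above: the vector is ten times shorter so it runs quickly, system.time() is used (I believe it is the current name for the same timer as unix.time()), and the names t_split and t_rowsum are made up for illustration. No timings are shown because they depend on the machine.

```r
# Sketch only: smaller data than the 2e6 values used above.
set.seed(1)
x <- rnorm(2e5)
i <- rep(1:1e5, 2)

# Grouped sums via split() + lapply(), as in the message above.
t_split <- system.time(suma <- unlist(lapply(split(x, i), sum)))

# Grouped sums via base R rowsum(), which returns a one-column matrix.
t_rowsum <- system.time(sumb <- rowsum(x, i))

# Drop the names so the two results can be compared directly.
names(suma) <- NULL
all.equal(suma, as.vector(sumb))

# Elapsed times are machine dependent, so none are quoted here.
print(t_split)
print(t_rowsum)
```

Both approaches return the groups in increasing order of the integer code, which is why the results can be compared element by element.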
Hi Bill,
After playing with this some more and adding an implementation to
handle NAs in the data vector, I have run into the problem of what to
return when the only data values for a particular bin (or level) in
the data vector were NAs and the user selected na.rm=T
1. Should it return 0
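As a point of reference for the question above, this small sketch shows what base R itself does when every value in a bin is NA and na.rm=TRUE is used. It uses tapply rather than the igroupSums code under discussion, and the toy vectors v and i are assumptions made for illustration. It only shows the existing convention; it does not settle what the new functions should return.

```r
# Toy data: bin 2 contains only NAs.
v <- c(1, NA, NA, 4)
i <- c(1L, 2L, 2L, 3L)

# sum() of an empty set is 0, so the all-NA bin sums to 0.
sums <- tapply(v, i, sum, na.rm = TRUE)

# mean() of an empty set is NaN, so the all-NA bin has mean NaN.
means <- tapply(v, i, mean, na.rm = TRUE)

print(sums)
print(means)
```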
Hi Bill,
So you wrote one routine that can calculate any single one of a variety
of stats and allows weights, is that right? Can it return a data
frame of any subset of requested stats as well (that is what I was
thinking of doing anyway).
I think someone can easily calculate all of those
Hi,
I was using my installed R which is 2.3.1 for the first tests. I
moved to the r-devel tree (I svn up and rebuild everyday) for my by
tests to see if it would work any better. I neglected to retest
merge with the devel version.
So it appears merge is already fixed and I just need to
There was a performance comparison of several moving average
approaches here:
http://tolstoy.newcastle.edu.au/R/help/04/10/5161.html
The author of that message ultimately wrote the caTools R package
which contains some optimized versions.
Not sure if these results suggest anything of interest
Kevin == Kevin B Hendricks [EMAIL PROTECTED]
on Fri, 28 Jul 2006 14:53:57 -0400 writes:
[.]
Kevin The idea is to somehow make functions that work well
Kevin over small subsequences of a much longer vector
Kevin without resorting to splitting the vector into many
Hi Bill,
Splus8.0 has something like what you are talking about
that provides a fast way to compute
sapply(split(xVector, integerGroupCode), summaryFunction)
for some common summary functions. The 'integerGroupCode'
is typically the codes from a factor, but you could compute
it in
On Fri, 28 Jul 2006, Kevin B. Hendricks wrote:
Hi Bill,
Splus8.0 has something like what you are talking about
that provides a fast way to compute
sapply(split(xVector, integerGroupCode), summaryFunction)
for some common summary functions. The 'integerGroupCode'
is typically the
Hi Bill,
sum : igroupSums
Okay, after thinking about this ...
# assumes i is the small integer factor with n levels
# v is some long vector
# no sorting required
igroupSums <- function(v,i) {
sums <- rep(0,max(i))
for (j in 1:length(v)) {
sums[[i[[j]]]] <- sums[[i[[j]]]] + v[[j]]
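Since the listing cuts the function off, here is a self-contained sketch of the same loop that can be run as is. It assumes the function simply returns the accumulated sums, that i holds integer codes from 1 to n, and that there are no NAs; the test vectors are made up. The rowsum call at the end is only a cross-check against base R.

```r
# Sketch of the loop above, completed under the assumption that it
# returns the accumulated sums. i must hold integer codes in 1..n.
igroupSums <- function(v, i) {
  sums <- rep(0, max(i))
  for (j in seq_along(v)) {
    # Add each value to the running total of its group.
    sums[[i[[j]]]] <- sums[[i[[j]]]] + v[[j]]
  }
  sums
}

# Made-up example data: three groups, no sorting required.
v <- c(1, 2, 3, 4, 5, 6)
i <- c(1L, 2L, 1L, 3L, 2L, 1L)
res <- igroupSums(v, i)
print(res)   # 10 7 4

# Cross-check: base R rowsum() gives the same sums as a one-column matrix.
chk <- as.vector(rowsum(v, i))
print(chk)
```

seq_along(v) is used instead of 1:length(v) so that an empty v does not make the loop run over c(1, 0).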
Hi Developers,
I am looking for another new project to help me get more up to speed
on R and to learn something outside of R internals. One recent R
issue I have run into is finding a fast implementation of the
equivalent to the following SAS code:
/* MDPC is an integer sort key made
Kevin B. Hendricks [EMAIL PROTECTED] writes:
My first R attempt was a simple
# sort the data.frame gd and the sort key
sorder <- order(MDPC)
gd <- gd[sorder,]
MDPC <- MDPC[sorder]
attach(gd)
# find the length and sum for each unique sort key
XN <- by(MVE, MDPC, length)
XSUM <- by(MVE, MDPC,