Tim,
If you just want the event.seq, you should be able to get really fast
results with:
unsplit(
  lapply(split(df$event.of.interest, df$subject),
         FUN = function(x) {
           cumsum(cumsum(x) > 0)
         }),
  df$subject)
This doesn't produce a matrix, but it's fast, and you could cbind() it onto your data afterwards.
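To illustrate the idea, here is a minimal sketch on made-up data (the column names subject and event.of.interest follow the example in this thread; the data values themselves are hypothetical):

```r
# Hypothetical toy data: three subjects observed over several periods,
# with a 0/1 indicator for the event of interest.
df <- data.frame(
  subject           = c(1, 1, 1, 2, 2, 3, 3, 3),
  event.of.interest = c(0, 1, 0, 1, 1, 0, 0, 1)
)

# Within each subject, cumsum(x) > 0 is TRUE from the first event onwards,
# and the outer cumsum() then counts the rows from that event forward.
# unsplit() puts the per-subject results back in the original row order.
df$event.seq <- unsplit(
  lapply(split(df$event.of.interest, df$subject),
         function(x) cumsum(cumsum(x) > 0)),
  df$subject)

df$event.seq
# subject 1: 0 1 2; subject 2: 1 2; subject 3: 0 0 1
```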
On 2/14/07, Tim Churches [EMAIL PROTECTED] wrote:
Any advice, tips, clues or pointers to resources on how best to speed up
or, better still, avoid the loops in the following example code would be much
appreciated. My actual dataset has several tens of thousands of rows and
lots of columns, and these loops take a rather long time to run.
jim holtman wrote:
Marc Schwartz wrote:
OK, here is one possible solution, though perhaps with a bit more time,
there may be more optimal approaches.
Using your example data above, but first noting that you do not want to
use:
df <- data.frame(cbind(subject, year, event.of.interest))
Using cbind() first creates a matrix, which coerces all of the columns to a single common type before data.frame() ever sees them; pass the vectors to data.frame() directly instead.
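A small sketch of the coercion problem (the vectors here are hypothetical; note that a character column drags every other column along with it):

```r
subject <- c("a", "a", "b")   # hypothetical character IDs
year    <- c(2000, 2001, 2000)
event.of.interest <- c(0, 1, 1)

# cbind() builds a character matrix, so every column loses its numeric type
bad  <- data.frame(cbind(subject, year, event.of.interest))

# passing the vectors directly preserves each column's own type
good <- data.frame(subject, year, event.of.interest)

sapply(bad, class)   # no column is numeric any longer
sapply(good, class)  # year and event.of.interest remain numeric
```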
One concern that I have with the 'tapply' approach is that it does not
produce the correct results if the data are not in sorted order. See the
example below:
# generate an unsorted set of data
set.seed(123)
x <- data.frame(a = sample(1:3, 12, TRUE), b = sample(0:1, 12, TRUE))
x
   a b
1  1 1
2  3 1