Hello,
I'm mailing r-devel because I think the performance problem I'm having
is best solved by re-writing reshape() in C. I've been reading the
"Writing R Extensions" documentation and I have some questions about
the best way to write the C bits, but first let me describe my problem:
I'm trying
If for each combination of X and Y there is at most one Z and Z is numeric
(it could be made so in your example) then you could use xtabs which is
faster:
m <- n <- 10
DF <- data.frame(X = gl(m*n, 1), Y = gl(m, n), Z = 10*(1:(n*m)))
system.time(w1 <- reshape(DF, timevar = "X", idvar = "Y", dir =
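For readers following along, here is a minimal sketch of the xtabs idea suggested above. It is not the code from the original message: the sizes are shrunk so the result is easy to inspect, and the name `wide` is just an illustrative choice. It assumes at most one numeric Z per (X, Y) pair, as stated above.

```r
# Long-format data: exactly one numeric Z for each (X, Y) pair that occurs.
m <- 3
n <- 2
DF <- data.frame(X = gl(m * n, 1), Y = gl(m, n), Z = 10 * (1:(n * m)))

# xtabs sums Z within each (Y, X) cell. With at most one Z per cell the
# "sum" is just that value, so the table is the wide layout:
# one row per Y level, one column per X level.
wide <- xtabs(Z ~ Y + X, data = DF)
print(wide)
```

Two caveats with this approach: combinations that never occur show up as 0 rather than NA, and if a cell has more than one Z the values are silently added together.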
I wrote:
It still needs some debugging, to put it mildly
(doesn't work properly on reals), but the basic idea appears to work.
It works for reals on a 64-bit machine, but not on a 32-bit machine. I
figure the culprit is this bit of C code:
SET_VECTOR_ELT(wideCol, wideRow, VECTOR_ELT(longCol,
I'd like to thank everyone that's replied so far--more inline:
On Thu, 2006-08-24 at 11:16 +0100, Prof Brian Ripley wrote:
Your example does not correspond to your description. You have taken a
random number of loci for each subject and measured each a random number
of times:
You're right.
If your Z in reality is not naturally numeric try representing it as a
factor and using the numeric levels as your numbers and then put the
level labels back on:
m <- n <- 5
DF <- data.frame(X = gl(m*n, 1), Y = gl(m, n), Z = letters[1:25])
Zn <- as.numeric(DF$Z)
system.time(w1 <- reshape(DF, timevar =
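The factor trick described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the original message: it uses xtabs for the reshaping step, converts Z with an explicit factor() call (so it does not depend on whether data.frame() turns strings into factors), and the names `Zf`, `codes`, `idx` and `wide` are made up for the example.

```r
m <- n <- 5
DF <- data.frame(X = gl(m * n, 1), Y = gl(m, n), Z = letters[1:25])

# Represent the non-numeric Z as a factor and keep its integer codes.
Zf <- factor(DF$Z)
DF$Zn <- as.integer(Zf)          # codes run from 1 to nlevels(Zf)

# Reshape the numeric codes; cells with no observation come out as 0.
codes <- xtabs(Zn ~ Y + X, data = DF)

# Put the level labels back on: 0 means "no value", so map it to NA.
idx <- as.vector(codes)
idx[idx == 0] <- NA
wide <- matrix(levels(Zf)[idx], nrow = nrow(codes),
               dimnames = dimnames(codes))
print(wide[, 1:6])
```

Because valid codes start at 1, a 0 in the table can only mean an empty cell, which is what makes the round trip back to labels unambiguous.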
On Thu, 2006-08-24 at 08:57 -0400, Gabor Grothendieck wrote:
If your Z in reality is not naturally numeric try representing it as a
factor and using the numeric levels as your numbers and then put the
level labels back on:
m <- n <- 5
DF <- data.frame(X = gl(m*n, 1), Y = gl(m, n), Z =
On 8/24/06, Mitch Skinner [EMAIL PROTECTED] wrote:
On Thu, 2006-08-24 at 08:57 -0400, Gabor Grothendieck wrote:
If your Z in reality is not naturally numeric try representing it as a
factor and using the numeric levels as your numbers and then put the
level labels back on:
m <- n <- 5