On Mon, 10 May 2004, Liaw, Andy wrote:

> Both of you might have missed my question from Friday: For very long `x'
> (e.g., length=50000), indexing by names can take a long time. See that
> thread for detail. (For small data, you can hardly tell the difference.)

That's solved in R-devel as of this morning. You need a million to see a
significant time in indexing.

However, I think that in this case you should be indexing by the codes of
a factor, as tapply is guaranteed to produce results in the order of the
levels of f (after conversion to a factor). So the natural way to index by
a factor is the default one. It may come as no surprise then that lda has
code like

    group.means <- tapply(x, list(rep(g, p), col(x)), mean)
    X <- x - group.means[g, ]

where g is a factor.

> Also, I'm trying to write the function in a way that one can pass in more
> than one grouping variables in a list, much like tapply. The version I
> shown is a simplified version to demonstrate the `problem' I had. I
> obviously missed the fact that tapply returns 1D array...
>
> Best,
> Andy
>
> > From: [EMAIL PROTECTED]
> >
> > On 10 May 2004 at 10:09, Christophe Pallier wrote:
> >
> > > Liaw, Andy wrote:
> > >
> > > >Suppose I
> > > >define the function:
> > > >
> > > >fun <- function(x, f) {
> > > >    m <- tapply(x, f, mean)
> > > >    ans <- x - m[match(f, unique(f))]
> > > >    names(ans) <- names(x)
> > > >    ans
> > > >}
> > >
> > > May I ask what is the purpose of match(f,unique(f)) ?
> > >
> > > To remove the group means, I have be using:
> > >
> > > x-tapply(x,f,mean)[f]
> > >
> > > for a while, (and I am now changing to
> > > x-tapply(x,f,mean)[as.character(f)] because of the peculiarities of
> >
> > wouldn't
> > sweep(as.array(x), 1, tapply(x,f,mean)[as.character(f)] , "-")
> >
> > be more natural?
> >
> > Kjetil Halvorsen
> >
> > > indexing named vectors with factors )
> > >
> > > The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular
> > > order in the result of tapply, no? It seems a bit dangerous to me.
> > >
> > > Christophe Pallier

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
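
To make the ordering point in the reply concrete, here is a minimal sketch on made-up data (the values of x and f are hypothetical, chosen so that the groups first appear in an order different from the sorted levels). It assumes only base R. It compares indexing the tapply result by the factor itself, by as.character(f), and by match(f, unique(f)) as in the function quoted in the thread.

```r
# Hypothetical toy data: groups first appear as "b", "a", "c",
# while the factor levels are sorted as "a", "b", "c".
x <- c(10, 1, 12, 3, 100)
f <- factor(c("b", "a", "b", "a", "c"))

# tapply returns one mean per level, in the order of levels(f):
# a = 2, b = 11, c = 100
m <- tapply(x, f, mean)

# Indexing by the factor uses its integer codes, which line up with levels(f).
centered <- x - m[f]                        # -1 -1  1  1  0

# Indexing by name picks the same means.
centered_by_name <- x - m[as.character(f)]  # -1 -1  1  1  0

# match(f, unique(f)) numbers the groups by first appearance (b, a, c),
# so here it pairs each observation with the wrong mean.
wrong <- x - m[match(f, unique(f))]         #  8 -10 10 -8  0
```

With data whose groups happen to first appear in level order, all three versions would agree, which is presumably why the mismatch is easy to overlook.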