On Wed, 25 Jan 2006, Ray Brownrigg wrote:

> There's an even faster one, which nobody seems to have mentioned yet:
>
> rep(l <- rle(ids)$lengths, l)
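
For concreteness, here is a small sketch of what the quoted expression computes, using the example vector from the original post (quoted further down). The names `runs`, `counts_rle`, `counts_tab` and the `shuffled` vector are made up for illustration and are not from the thread. It also shows that rle() counts runs rather than distinct values, so the result matches the table() lookup only when equal IDs sit next to each other.

```r
# Example vector from the original post (equal IDs are adjacent).
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# rle() finds runs of equal values; $lengths holds one length per run.
runs <- rle(ids)$lengths           # 1 2 3 1
counts_rle <- rep(runs, runs)      # repeat each run length across its run

# table()-based lookup, as in Barry's one-liner, with the names stripped.
counts_tab <- as.integer(table(ids)[ids])

# Both give 1 2 2 3 3 3 1 here, because each ID forms a single run.
stopifnot(identical(counts_rle, counts_tab))

# Hypothetical input where "ID2" occurs in two separate runs.
shuffled <- c("ID2", "ID1", "ID2")
runs_shuffled <- rle(shuffled)$lengths
counts_rle_shuffled <- rep(runs_shuffled, runs_shuffled)   # 1 1 1
counts_tab_shuffled <- as.integer(table(shuffled)[shuffled])  # 2 1 2
```

If the IDs are not already grouped, sorting first (and mapping back with order() if the original positions matter) would be one way to keep the rle() approach usable.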

I considered this but it wasn't clear to me from the initial post that
each ID occupied a contiguous section of the vector.

Also, lazy evaluation makes code like this
    rep(l <- rle(ids)$lengths, l)
a bit worrying. It relies on rep() using the first argument before it
uses the second one. In this case, clearly, it works, but it is not a
style I would encourage and it's easy to construct functions where it
fails.

	-thomas

> Timing on my 2.8GHz NetBSD system shows:
>
> > length(ids)
> [1] 45150
> > # Gabor:
> > system.time(for (i in 1:100) ave(as.numeric(factor(ids)), ids, FUN = length))
> [1] 3.45 0.06 3.54 0.00 0.00
> > # Barry (and others I think):
> > system.time(for (i in 1:100) table(ids)[ids])
> [1] 2.13 0.05 2.20 0.00 0.00
> > # Me:
> > system.time(for (i in 1:100) rep(l <- rle(ids)$lengths, l))
> [1] 1.60 0.00 1.62 0.00 0.00
>
> Of course the difference between 21 milliseconds and 16 milliseconds is
> not great, unless you are doing this a lot.
>
> Ray Brownrigg
>
>> From: Gabor Grothendieck <[EMAIL PROTECTED]>
>>
>> Nice. I timed it and it's much faster than mine too.
>>
>> On 1/24/06, Barry Rowlingson <[EMAIL PROTECTED]> wrote:
>>> Laetitia Marisa wrote:
>>>> Hello,
>>>>
>>>> Is there a simple and fast function that returns a vector of the number
>>>> of replications for each object of a vector ?
>>>> For example :
>>>> I have a vector of IDs :
>>>> ids <- c( "ID1", "ID2", "ID2", "ID3", "ID3","ID3", "ID5")
>>>>
>>>> I want the function returns the following vector where each term is the
>>>> number of replicates for the given id :
>>>> c( 1, 2, 2, 3,3,3,1 )
>>>
>>> One-liner:
>>>
>>> > table(ids)[ids]
>>> ids
>>> ID1 ID2 ID2 ID3 ID3 ID3 ID5
>>>   1   2   2   3   3   3   1
>>>
>>> 'table(ids)' computes the counts, then the subscripting [ids] looks it
>>> all up.
>>>
>>> Now try it on your 40,000-long vector!
>>>
>>> Barry
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

Thomas Lumley			Assoc. Professor, Biostatistics
[EMAIL PROTECTED]	University of Washington, Seattle
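
P.S. To make the lazy-evaluation concern concrete, here is a minimal sketch of a function where the assign-inside-the-call style goes wrong. `times_first()` and the variable `run_len` are hypothetical names invented for this illustration; the function is not part of base R. It simply happens to evaluate its second argument before its first, which is all it takes.

```r
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# Hypothetical wrapper that touches 'times' before 'x'.
times_first <- function(x, times) {
  force(times)   # the 'times' promise is evaluated here, before 'x'
  rep(x, times)
}

# Start from a state where 'run_len' is not defined.
if (exists("run_len", inherits = FALSE)) rm(run_len)

# Case 1: 'run_len' does not exist yet when 'times' is evaluated,
# so the call fails with an "object not found" error.
res_missing <- tryCatch(
  times_first(run_len <- rle(ids)$lengths, run_len),
  error = function(e) "error"
)

# Case 2: a stale 'run_len' is lying around. No error this time, but
# 'times' silently picks up the old value 7, so the result is the four
# run lengths repeated 7 times (28 values) instead of 7 counts.
run_len <- 7L
res_stale <- times_first(run_len <- rle(ids)$lengths, run_len)

# Doing the assignment as a separate statement avoids relying on
# argument evaluation order at all.
run_len <- rle(ids)$lengths
res_safe <- rep(run_len, run_len)   # 1 2 2 3 3 3 1
```

The second case is the more worrying one, since nothing signals that the answer is wrong.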