In keeping with the theme of reducing unnecessary overhead (and using William's example data):
> system.time( vAve <- ave(a, i, FUN=cummax) )
   user  system elapsed
  0.125   0.003   0.127
> system.time( b <- unlist( lapply( split(a,i) , cummax) ) )
   user  system elapsed
  0.320   0.007   0.327
> system.time( b <- unlist( lapply( split(a,i) , cummax) , use.names=FALSE) )
   user  system elapsed
  0.067   0.001   0.068
> all.equal(vAve, b)
[1] TRUE

Apparently there is quite a bit of overhead associated with keeping the names when unlisting.

-- 
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 5/18/16, 7:26 AM, "R-help on behalf of William Dunlap via R-help" <r-help-boun...@r-project.org on behalf of r-help@r-project.org> wrote:

>ave(A, i, FUN=cummax) loops but is faster than your aggregate-based
>solution. E.g.,
>
>> i <- rep(1:10000, sample(0:210, replace=TRUE, size=10000))
>> length(i)
>[1] 1056119
>> a <- sample(-50:50, replace=TRUE, size=length(i))
>> system.time( vAve <- ave(a, i, FUN=cummax) )
>   user  system elapsed
>   0.13    0.03    0.16
>> system.time( vAggregate <- as.vector(unlist(aggregate(a,list(i),cummax)[[2]])) )
>   user  system elapsed
>   1.81    0.13    1.98
>> all.equal(vAve, vAggregate)
>[1] TRUE
>
>Bill Dunlap
>TIBCO Software
>wdunlap tibco.com
>
>On Wed, May 18, 2016 at 6:32 AM, John Logsdon <j.logs...@quantex-research.com> wrote:
>
>> Folks
>>
>> I have some very long vectors - typically 1 million long - which are
>> indexed by another vector, same length, with values from 1 to a few
>> thousand, so each sub part of the vector may be a few hundred values long.
>>
>> I want to calculate the cumulative maximum of each sub part of the main
>> vector by the index in an efficient manner. This can obviously be done in
>> a loop but the whole calculation is embedded within many other
>> calculations which would make everything very slow indeed. All the other
>> sums are vectorised already.
>>
>> For example,
>>
>> A=c(1,2,1, -3,5,6,7,4, 6,3,7,6,9, ...)
>> i=c(1,1,1, 2,2,2,2,2, 3,3,3,3,3, ...)
>>
>> where A has three levels that are not the same but the levels themselves
>> are all monotonic non-decreasing.
>>
>> The answer should be a vector of the same length:
>>
>> R=c(1,2,2, -3,5,6,7,7, 6,6,7,7,9, ...)
>>
>> If I could reset the cumulative maximum to -1e6 (e.g.) at each change of
>> index, a simple cummax would do, but I can't see how to do this.
>>
>> The best way I have found so far is to use the aggregate command:
>>
>> as.vector(unlist(aggregate(a,list(i),cummax)[[2]]))
>>
>> but on rare occasions this fails, returning a shorter vector than
>> expected, and it seems rather ugly, converting to and from lists, which
>> may well be an unnecessary overhead.
>>
>> I have been trying other approaches using apply() methods but either it
>> can't be done using them or I can't get my head round them!
>>
>> Any ideas?
>>
>> Best wishes
>>
>> John
>>
>> John Logsdon
>> Quantex Research Ltd
>> +44 161 445 4951/+44 7717758675
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
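For anyone wanting to try the approaches in this thread on John's small example, here is a minimal sketch. The names `expected`, `r_ave`, `r_split`, `r_unsplit`, `r_offset`, `K` and `offset` are made up for illustration. Two caveats are worth stating. First, the split/unlist version returns values in group order, so it agrees with ave() only when the index is already sorted (as it is in William's data, which is built with rep(1:10000, ...)); unsplit() puts the pieces back in the original order for an unsorted index. Second, the last variant is not something anyone posted in the thread: it is one possible way to get the "reset at each change of index" effect John asked about, assuming the index is a non-decreasing positive integer vector and the data are finite numbers small enough that adding the offset loses no precision.

```r
# John's example data, with the elided "..." part left out
A <- c(1, 2, 1, -3, 5, 6, 7, 4, 6, 3, 7, 6, 9)
i <- c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)
expected <- c(1, 2, 2, -3, 5, 6, 7, 7, 6, 6, 7, 7, 9)

# William's suggestion: ave() applies cummax within each group
r_ave <- ave(A, i, FUN = cummax)

# Don's variant: split, apply, and unlist without building names.
# The result is in group order, so this relies on i being sorted.
r_split <- unlist(lapply(split(A, i), cummax), use.names = FALSE)

# For an unsorted index, unsplit() restores the original order
r_unsplit <- unsplit(lapply(split(A, i), cummax), i)

# Assumption-laden sketch of the "reset" idea: shift each group by a
# step K larger than the range of A, so every value in a later group
# exceeds everything before it; then a single cummax() restarts at
# each group boundary and the shift is subtracted again.
K <- diff(range(A)) + 1
offset <- i * K   # requires i to be non-decreasing
r_offset <- cummax(A + offset) - offset

stopifnot(
  isTRUE(all.equal(r_ave, expected)),
  isTRUE(all.equal(r_split, expected)),
  isTRUE(all.equal(r_unsplit, expected)),
  isTRUE(all.equal(r_offset, expected))
)

# Unsorted index: ave() and unsplit() still agree with each other
A2 <- rev(A)
i2 <- rev(i)
stopifnot(isTRUE(all.equal(
  ave(A2, i2, FUN = cummax),
  unsplit(lapply(split(A2, i2), cummax), i2)
)))
```

Whether the offset variant is actually faster than ave() on vectors of a million elements has not been timed here, so treat it as something to benchmark with system.time() rather than as a recommendation.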