On Fri, Nov 4, 2011 at 1:02 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> On Fri, Nov 4, 2011 at 12:56 PM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>> On Fri, Nov 4, 2011 at 12:34 PM, James Marca
>> <jma...@translab.its.uci.edu> wrote:
>>> Good morning,
>>>
>>> I have discovered what I believe to be a performance regression
>>> between zoo 1.6x and zoo 1.7-6 in the application of rollapply.
>>> On zoo 1.6x, rollapply of my function over my data takes about 20
>>> minutes. Using 1.7-6, the same code takes about 6 hours.
>>>
>>> R --version
>>> R version 2.13.1 (2011-07-08)
>>> Copyright (C) 2011 The R Foundation for Statistical Computing
>>> ISBN 3-900051-07-0
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> Two versions of zoo 1.6 run *fast*. On one machine I am running
>>>
>>> less /usr/lib64/R/library/zoo/DESCRIPTION
>>> Package: zoo
>>> Version: 1.6-3
>>> Date: 2010-04-23
>>> Title: Z's ordered observations
>>> ...
>>> Packaged: 2010-04-23 07:28:47 UTC; zeileis
>>> Repository: CRAN
>>> Date/Publication: 2010-04-23 07:43:54
>>> Built: R 2.10.1; ; 2010-04-25 06:41:34 UTC; unix
>>>
>>> (Thankfully I forgot to update.packages() on this machine!)
>>>
>>> On the other
>>>
>>> Package: zoo
>>> Version: 1.6-5
>>> Date: 2011-04-08
>>> ...
>>> Packaged: 2011-04-08 17:13:47 UTC; zeileis
>>> Repository: CRAN
>>> Date/Publication: 2011-04-08 17:27:47
>>> Built: R 2.13.1; ; 2011-11-04 15:49:54 UTC; unix
>>>
>>> I have stripped out zoo 1.7-6 from all my machines.
>>>
>>> I tried to ensure all libraries were identical on the two machines
>>> (using lsof), and after finally downgrading zoo I got the second
>>> machine to be as fast as the first, so I am quite certain the
>>> difference in speed is down to the zoo version used.
>>>
>>> My code runs a fairly simple function over a time series using the
>>> following call to process a year of 30s data (9 columns, about a
>>> million rows):
>>>
>>> vals <- rollapply(data=ts.data[,c(n.3.cols, o.3.cols,volocc.cols)]
>>>                   ,width=40
>>>                   ,FUN=rolling.function.fn(n.cols=n.3.cols,o.cols=o.3.cols,vo.cols=volocc.cols)
>>>                   ,by.column=FALSE
>>>                   ,align='right')
>>>
>>>
>>> (The rolling.function.fn call returns a function that is initialized
>>> with the initial call above (a trick I learned from JavaScript))
>>>
>>> If this is a known situation with the new 1.7 generation zoo, my
>>> apologies and I'll go away. If my code could be turned into a useful
>>> test, I'd be happy to help out as much as I'm able. Given the extreme
>>> runtime difference though, I thought I should offer my help in this
>>> case, since zoo is such a useful package in my work.
>>
>> This was a known problem and was fixed, but if it's still there then
>> there must be some other condition under which it can occur as well.
>> If you can provide a small self-contained reproducible example it
>> would help in tracking it down.
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>
> Also, as a workaround you can try this to use an old rollapply in a
> new version of zoo:
>
> library(zoo)
> source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=817&root=zoo")
> rollapply(...whatever...)
>

I have looked at it, and there is now a performance improvement in the
development version of rollapply that gives an order-of-magnitude
speedup in the following case:

> library(zoo)
> n <- 10000
> z <- zoo(cbind(a = 1:n, b = 1:n))
> system.time(rollapplyr(z, 10, sum, by.column = FALSE))
   user  system elapsed
   8.80    0.02    8.97
>
> # download rollapply rev 913 from svn repo and rerun
> source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=913&root=zoo")
> system.time(rollapplyr(z, 10, sum, by.column = FALSE))
   user  system elapsed
   0.52    0.02    0.53

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
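
When comparing rollapply revisions it may also be worth confirming that the
values agree, not only the timings. The following is a minimal sketch under
a few assumptions: zoo is installed, the same toy series as in the timing
example is used, and the names row.tot, cs and expected are illustrative
only (they are not part of zoo). The cumulative-sum cross-check is an
independent base-R computation used here purely for verification; it is not
how rollapply works internally.

```r
library(zoo)

n <- 10000
z <- zoo(cbind(a = 1:n, b = 1:n))

# Time the call and keep its result for checking
timing <- system.time(res <- rollapplyr(z, 10, sum, by.column = FALSE))
print(timing)

# Independent cross-check: with by.column = FALSE, sum() sees the whole
# 10-row, 2-column window, so each value is the total over both columns.
row.tot  <- rowSums(coredata(z))   # per-row total across columns
cs       <- cumsum(row.tot)        # running total over rows
expected <- cs[10:n] - c(0, cs[1:(n - 10)])  # total of each right-aligned window

# Stops with an error if the rolling result differs from the cross-check
stopifnot(isTRUE(all.equal(as.numeric(coredata(res)), as.numeric(expected))))
```

Running the same check before and after sourcing a different rollapply.R
revision would show whether a speedup leaves the output unchanged for this
particular case; it says nothing about other FUN or width settings.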