I have 40,000 rows of buy/sell trade data and am trying to add up the buys for each second. The code works, but it is very slow. Any suggestions on how to improve the sapply call?

    # index of the last row in each second of an xts time series
    # that has multiple entries per second
    secEP = endpoints(xSym$Direction, "secs")
    d = xSym$Direction
    s = xSym$Size
    buySize = sapply(1:(length(secEP)-1), function(y) {
        i = (secEP[y] + 1):secEP[y+1]  # row indices between consecutive endpoints
        return(sum(as.numeric(s[i][d[i] == "buy"])))
    })

Object details:

secEP = numeric vector of one-second endpoints in xSym$Direction.

> head(xSym$Direction)
                    Direction
2011-01-05 09:30:00 "unkn"
2011-01-05 09:30:02 "sell"
2011-01-05 09:30:02 "buy"
2011-01-05 09:30:04 "buy"
2011-01-05 09:30:04 "buy"
2011-01-05 09:30:04 "buy"

> head(xSym$Size)
                    Size
2011-01-05 09:30:00 " 865"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 " 41"

Thanks,
Chris
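
One direction that might help, offered as a sketch rather than a measured fix: the loop above subsets the xts object and converts strings to numbers once per second, so doing the conversion once and replacing the per-second sums with a cumulative sum could avoid most of that work. The block below uses plain vectors as hypothetical stand-ins for the six rows shown in the post (d, s, and secEP are typed in by hand here, and secEP is assumed to start at 0 and end at the row count, as endpoints() normally returns). With the real object the columns would presumably be pulled out once, for example with coredata().

```r
# Hypothetical stand-ins for the xts columns, typed in from the six rows shown.
# With the real data these would be extracted once from xSym (assumption).
d <- c("unkn", "sell", "buy", "buy", "buy", "buy")
s <- c(" 865", " 100", " 100", " 100", " 100", " 41")

# Assumed result of endpoints(xSym$Direction, "secs") for these rows:
# a leading 0, then the index of the last row in each second.
secEP <- c(0, 1, 3, 6)

# Convert sizes once; rows that are not buys contribute 0.
buy <- ifelse(d == "buy", as.numeric(s), 0)

# Running total with a leading 0, so cs[k + 1] is the sum of the first k rows.
cs <- c(0, cumsum(buy))

# Buys per second = difference of the running total at consecutive endpoints.
buySize <- diff(cs[secEP + 1])
```

For these six rows buySize has one entry per second, and a second with no buys comes out as 0, the same as sum() over an empty selection in the sapply version. Whether this is actually faster on the full 40,000 rows would need to be timed.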