I am running R 2.1.1 on Microsoft Windows XP. I have a matrix with three columns (vectors) and ~2 million rows. The three columns are date_, id, and price. The data are sorted by code and date_. (The matrix contains daily prices for several thousand stocks. If a stock did not trade on a particular date, its price is set to NA.)

I wish to add a fourth column, next_price. (next_price is the current price as long as the current price is not NA. If the current price is NA, next_price is the next price at which the security with the same id trades. If the stock does not trade again, next_price is set to NA.)

I wrote the following loop to calculate next_price. It works as intended, but I have one problem. With only 10,000 rows of data, the calculations are very fast. However, when I run the loop on the full 2 million rows, it seems to take ~1 second per row. Why is this happening? What can I do to speed up the calculations when running the loop on the full 2 million rows?
(I am not running low on memory, but I am maxing out my CPU at 100%.)

Here is my code and some sample data:

data<- data[order(data$code,data$date_),]
l<-dim(data)[1]
w<-3
data[l,w+1]<-NA
for (i in (l-1):(1)){
  data[i,w+1]<-ifelse(is.na(data[i,w])==F,data[i,w],ifelse(data[i,2]==data[i+1,2],data[i+1,w+1],NA))
}

date        id    price     next_price
6/24/2005   1635  444.7838  444.7838
6/27/2005   1635  448.4756  448.4756
6/28/2005   1635  455.4161  455.4161
6/29/2005   1635  454.6658  454.6658
6/30/2005   1635  453.9155  453.9155
7/1/2005    1635  453.3153  453.3153
7/4/2005    1635  NA        453.9155
7/5/2005    1635  453.9155  453.9155
7/6/2005    1635  453.0152  453.0152
7/7/2005    1635  452.8651  452.8651
7/8/2005    1635  456.0163  456.0163
12/19/2005  1635  442.6982  442.6982
12/20/2005  1635  446.5159  446.5159
12/21/2005  1635  452.4714  452.4714
12/22/2005  1635  451.074   451.074
12/23/2005  1635  454.6453  454.6453
12/27/2005  1635  NA        NA
12/28/2005  1635  NA        NA
12/1/2003   1881  66.1562   66.1562
12/2/2003   1881  64.9192   64.9192
12/3/2003   1881  66.0078   66.0078
12/4/2003   1881  65.8098   65.8098
12/5/2003   1881  64.1275   64.1275
12/8/2003   1881  64.8697   64.8697
12/9/2003   1881  63.5337   63.5337
12/10/2003  1881  62.9399   62.9399
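The next_price rule described above can also be written without a row-by-row loop. What follows is only a minimal sketch, not a tested replacement for the loop. It assumes that id and price are available as plain vectors, that the rows are already sorted so each id forms one contiguous block, and that id itself contains no NA. The function name next_price_by_id is hypothetical. Unlike the loop, which always sets the last row to NA, this version follows the written rule and gives the last row its own price when that price is not NA.

```r
# Hypothetical helper: for each row, the price at the nearest row at or
# after it that has a non-NA price, provided that row has the same id.
# Assumes rows are sorted so that each id is one contiguous block.
next_price_by_id <- function(id, price) {
  n <- length(price)
  pos <- seq(along.with = price)

  # Rows with a price point at themselves; NA rows get the sentinel n + 1.
  cand <- ifelse(is.na(price), n + 1, pos)

  # Running minimum from the bottom up: the first priced row at or after i.
  nxt <- rev(cummin(rev(cand)))

  # A match is usable only if it exists and belongs to the same id.
  ok <- nxt <= n
  same <- ok
  same[ok] <- id[nxt[ok]] == id[ok]

  out <- rep(as.numeric(NA), n)
  out[same] <- price[nxt[same]]
  out
}

# Small example in the spirit of the sample data (values picked from it).
id    <- c(1635, 1635, 1635, 1635, 1635, 1881, 1881)
price <- c(453.3153, NA, 453.9155, NA, NA, 66.1562, 64.9192)
result <- next_price_by_id(id, price)
print(result)
# 453.3153 453.9155 453.9155 NA NA 66.1562 64.9192
```

With a data frame like the one above, the call would look something like data$next_price <- next_price_by_id(data[, 2], data$price). The design choice is to do all the work on whole vectors with cummin and indexing, so the data frame is assigned to once rather than once per row.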
______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html