Can anyone recommend a more efficient way to "roll" values forward in a time
series dataset?
I merged a bunch of different time series datasets (tens of thousands of them)
whose observation dates and sampling intervals differ. Some time series
observations are reported at the beginning of the month, some at the end, some
on Mondays, some on Wednesdays, some annually, etc.
In the process of merging all of the irregular time series (by date observed),
a significant number of NA's appear in the dataset where I really want the last
reported value 'rolled' forward.
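
To illustrate how the NAs arise, here is a minimal sketch that assumes the
series are combined with base R's merge() and all = TRUE. The data frames
'monthly' and 'weekly' and their values are made up for illustration; the
actual merge step may differ in detail.

```r
# Two made-up series observed on different dates (illustrative values only).
monthly <- data.frame(date = as.Date(c("2024-01-01", "2024-02-01")),
                      m = c(1.5, 1.7))
weekly  <- data.frame(date = as.Date(c("2024-01-01", "2024-01-08", "2024-01-15")),
                      w = c(10, 11, 12))

# An outer join on date keeps every observation date from both series;
# a series with no report on a given date gets NA in that row.
merged <- merge(monthly, weekly, by = "date", all = TRUE)
print(merged)
```

In the merged result, column 'm' is NA on the two weekly-only dates, which is
where the January value should be carried forward.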
To use a concrete example, a time series that has reported values at the
beginning of every month shows NA's for every day except the date it was
reported (in this case, the first of the month). I want the value to roll
forward so that the NA's after the first of the month are replaced with the
last reported value.
I wrote the following for loop to accomplish the task on the object 'dataset',
however it is far too slow to process tens of thousands of different time
series with 15,000 observations each. At the rate it is going, it would take
weeks to complete.
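for (j in seq_along(names(dataset)))
{
  last <- NA  # last reported value seen so far in column j
  for (i in seq_len(nrow(dataset)))
  {
    if (is.na(dataset[i, j])) {
      dataset[i, j] <- last   # fill the gap with the last reported value
    } else {
      last <- dataset[i, j]   # remember the newest reported value
    }
  }
}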
One would think an operation as simple as this could run much faster.
My sense is using the "apply" function is the way to go, however I just can't
get my head around a function that would reference the last reported value.
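
One possible shape for such a function, as a minimal sketch in base R rather
than a benchmarked solution: instead of tracking the last value, track the
index of the last non-NA entry with cummax() and then subset by that index.
The name 'roll_forward' and the small 'dataset' below are hypothetical, chosen
for illustration. The zoo package's na.locf() is an existing alternative that
covers the same idea.

```r
# Carry the last non-NA value forward; leading NAs stay NA.
roll_forward <- function(x) {
  # Position of each non-NA entry, 0 where the entry is NA.
  pos <- seq_along(x) * !is.na(x)
  # Running maximum gives the index of the most recent non-NA entry.
  idx <- cummax(pos)
  # Index 0 means nothing has been reported yet, so keep NA there.
  idx[idx == 0] <- NA
  x[idx]
}

# Hypothetical example data standing in for the merged object.
dataset <- data.frame(a = c(NA, 1, NA, NA, 3, NA),
                      b = c(5, NA, NA, 7, NA, NA))

# Apply column by column; dataset[] keeps the data frame structure.
dataset[] <- lapply(dataset, roll_forward)
print(dataset)
```

Each column is processed with a handful of vectorized calls, so the
element-by-element data frame indexing of the double loop is avoided.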
Any guidance is appreciated.
-Richard
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.