Hi

I have 2 data frames with the same number of rows (approximately 30000 or
more entries each).
They also have the same number of columns, let's say 2.
One column holds the date, the other a double precision number. Let
the column names be V1 (the date) and V2 (the number).

Now I want to calculate the correlation between the 2 sets of data over the
trailing 100 days, for every day available in the data frames (that is, a
rolling correlation with a window of 100 rows).

My code looks like this:
# Let df1, and df2 be the 2 data frames with the required data
## begin code snippet

my_corr <- c()
for (i_end in 100:nrow(df1)) {
    i_start <- i_end - 99
    # correlation over the 100 rows ending at i_end
    my_corr[i_start] <- cor(x = df1[i_start:i_end, "V2"],
                            y = df2[i_start:i_end, "V2"])
}

## end of code snippet
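
To make the snippet above reproducible, data of the same shape can be
imitated roughly as follows. This is only a sketch: the start date, the seed
and the random values are placeholders and are not the real data.

```r
# Simulated stand-in data (assumption: V1 is a Date column and V2 a
# double precision column; the values below are random, not real).
set.seed(42)
n <- 30000
dates <- seq(as.Date("2000-01-01"), by = "day", length.out = n)

df1 <- data.frame(V1 = dates, V2 = rnorm(n))
df2 <- data.frame(V1 = dates, V2 = rnorm(n))

str(df1)  # 30000 obs. of 2 variables: V1 (Date), V2 (num)
```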

This runs very slowly. To calculate the correlation between every pair of
10 data sets I need 45 runs of this snippet, and in total that takes
somewhere between more than 30 minutes and more than an hour.

Is there a more efficient way to write this piece of code so that it runs
faster?
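
One possible direction, given only as a hedged sketch: the rolling
correlation described above can be written with cumulative sums, so that no
explicit loop over windows is needed. The name roll_cor is made up for
illustration and does not come from any package. The sketch assumes numeric
vectors of equal length with no NA values and no window in which either
series is constant.

```r
# Hypothetical helper (not from any package): rolling Pearson correlation
# over a fixed window, built from cumulative sums instead of a loop.
# Assumes x and y are numeric, of equal length, and contain no NA values.
roll_cor <- function(x, y, width = 100) {
  n <- length(x)
  stopifnot(length(y) == n, n >= width)

  # Correlation is shift-invariant; centring limits cancellation error.
  x <- x - mean(x)
  y <- y - mean(y)

  # Sum of v over every run of `width` consecutive elements.
  win_sum <- function(v) {
    cs <- c(0, cumsum(v))
    cs[(width + 1):(n + 1)] - cs[1:(n - width + 1)]
  }

  sx  <- win_sum(x)
  sy  <- win_sum(y)
  sxx <- win_sum(x * x)
  syy <- win_sum(y * y)
  sxy <- win_sum(x * y)

  (width * sxy - sx * sy) /
    sqrt((width * sxx - sx^2) * (width * syy - sy^2))
}

# Usage on simulated vectors (placeholders, not real data).
set.seed(1)
x <- rnorm(500)
y <- rnorm(500)
fast <- roll_cor(x, y, width = 100)

# Reference: the same windows computed one at a time with cor().
slow <- vapply(1:(length(x) - 99),
               function(i) cor(x[i:(i + 99)], y[i:(i + 99)]),
               numeric(1))
all.equal(fast, slow)
```

As in the loop version, element k of the result belongs to the window that
starts at row k. With the data frames, the call would be
roll_cor(df1$V2, df2$V2).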

If I do something similar in Excel, it is much faster. But I have to use R,
since this is a part of a bigger program.

Any help will be appreciated.

Thanks and Regards
Vikas

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help