I am stuck on a data transformation problem. I have a data frame, df1
in my example, with some original "levels" data. The data pertain to
some variable, such as GDP, in various reference periods, REF, as
estimated and released in various release periods, REL. The release
periods follow the reference periods by two months or more, sometimes
by several years. I want to build a second data frame, called df2 in
my example, with the month-to-month growth rates for each reference
period as they stood in each release period, revealing the revisions
to those growth rates in subsequent releases. That is, for a given
REL, the growth rate for a reference month is the percent change in
the level from the previous reference month.

# Levels: one row per reference month (REF1) and release period (REL1)
REF1 <- c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01",
          "2017-02-01", "2017-02-01", "2017-02-01", "2017-02-01", "2017-02-01",
          "2017-03-01", "2017-03-01", "2017-03-01", "2017-03-01", "2017-03-01")
REL1 <- c("2020-09-01", "2020-08-01", "2020-07-01", "2020-06-01", "2019-05-01",
          "2020-09-01", "2020-08-01", "2020-07-01", "2020-06-01", "2019-05-01",
          "2020-09-01", "2020-08-01", "2020-07-01", "2020-06-01", "2019-05-01")
VAL1 <- c(17974, 14567, 13425, NA, 12900,
          17974, 14000, 14000, 12999, 13245,
          17197, 11500, 19900, 18765, 13467)
df1 <- data.frame(REF1, REL1, VAL1)

# Target: month-to-month growth rates in percent, by REF2 and REL2
REF2 <- c("2017-02-01", "2017-02-01", "2017-02-01", "2017-02-01", "2017-02-01",
          "2017-03-01", "2017-03-01", "2017-03-01", "2017-03-01", "2017-03-01")
REL2 <- c("2020-09-01", "2020-08-01", "2020-07-01", "2020-06-01", "2019-05-01",
          "2020-09-01", "2020-08-01", "2020-07-01", "2020-06-01", "2019-05-01")
VAL2 <- c(0.0, -3.9, 4.3, NA, 2.7, -4.3, -17.9, 42.1, 44.4, 1.7)
df2 <- data.frame(REF2, REL2, VAL2)

In my example I have provided some sample data pertaining to three
reference months, 2017-01-01 through 2017-03-01, and five release
periods, "2020-09-01","2020-08-01","2020-07-01","2020-06-01" and
"2019-05-01". In my actual problem I have millions of REF-REL
combinations, so my data frame is quite large. I am using data.table
for faster processing, though I am more familiar with the tidyverse. I
am providing df2 as the target data frame for my example, so you can
see what I am trying to achieve.
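
To make the intended calculation concrete, here is a minimal sketch on
the sample data. It is one possible reading of the problem and not a
tested solution for the full-size data. It assumes the data.table
package, that the dates are converted to Date class, and that the
reference months are consecutive within each release period (a gap
would make the "previous row" the wrong month). The names dt1 and dt2
are illustrative only.

```r
library(data.table)

# Sample levels, same values as df1 above (dates converted to Date class).
dt1 <- data.table(
  REF1 = as.Date(rep(c("2017-01-01", "2017-02-01", "2017-03-01"), each = 5)),
  REL1 = as.Date(rep(c("2020-09-01", "2020-08-01", "2020-07-01",
                       "2020-06-01", "2019-05-01"), times = 3)),
  VAL1 = c(17974, 14567, 13425, NA, 12900,
           17974, 14000, 14000, 12999, 13245,
           17197, 11500, 19900, 18765, 13467)
)

# Put reference months in time order within each release period.
setorder(dt1, REL1, REF1)

# Within each release, divide every level by the previous month's level.
# ASSUMPTION: reference months are consecutive within each release.
# The first reference month of each release has no predecessor and is dropped.
dt2 <- dt1[, .(REF2 = REF1[-1],
               VAL2 = round(100 * (VAL1[-1] / VAL1[-.N] - 1), 1)),
           by = .(REL2 = REL1)]

# Match the layout of df2: REF ascending, REL descending.
setcolorder(dt2, c("REF2", "REL2", "VAL2"))
setorder(dt2, REF2, -REL2)
print(dt2)
```

A missing level in either month leaves the growth rate as NA, as in the
2020-06-01 release for 2017-02-01. Grouping by release period keeps the
work vectorised within each group, but how this behaves with millions
of REF-REL combinations is not something the sketch establishes.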
I have not been able to find an efficient way to do these
calculations. I have tried "for" loops with "if" statements, without
success so far, and I fear this approach would be too slow anyway.
Suggestions as to how I might proceed would be much appreciated.
Philip
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.