Hi R-helpers,
I have a data frame like this:
ID_CASE  YEAR_MTH  ATT_1  A1   A2  A3
CB26A    201302    1      146  42  74
CB26A    201302    0      140  50  77
CB26A    201303    0      128  36  77
CB26A    201304    1      146  36  72
CB26A    201305    1      134  36  80
CB26A    201305    0      148  30  80
CB26A    201306    0      134  20  72
CB26A    201307    1      125  48  79
CB26A    201309    0      122  44  74
CB26A    201310    1      126  37  72
CB26A    201310    1      107  43  75
I want a final dataframe which will look like
ID_CASE  Period         No.ofChange  %Paid
CB26A    201302-201304  2            0.414365
CB26A    201303-201305  2            0.445245
CB26A    201304-201306  1            0.444444
CB26A    201305-201307  2            0.460741
CB26A    201306-201308  1            0.461774
CB26A    201307-201309  1            0.451327
CB26A    201308-201310  1            0.461378
where,
Period = a window of 3 months, shifted forward by 1 month each time
No.ofChange = number of times ATT_1 changes value within this period
%Paid = sum(A3)/(sum(A1)+sum(A2)) for this period
E.g. for Period=201302-201304,
%Paid = (74+77+77+72)/((146+140+128+146)+(42+50+36+36))
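
As a check on the definitions above, here is a minimal sketch in base R for this single period only. It assumes that a "change" means two consecutive rows (in the order shown) having different ATT_1 values; the names d, pct_paid and n_change are made up for illustration.

```r
# Rows from the sample data that fall in the period 201302-201304
d <- data.frame(
  YEAR_MTH = c(201302, 201302, 201303, 201304),
  ATT_1    = c(1, 0, 0, 1),
  A1       = c(146, 140, 128, 146),
  A2       = c(42, 50, 36, 36),
  A3       = c(74, 77, 77, 72)
)

# %Paid = sum(A3) / (sum(A1) + sum(A2)), here 300 / 724
pct_paid <- sum(d$A3) / (sum(d$A1) + sum(d$A2))

# No.ofChange = number of consecutive row pairs whose ATT_1 differs
# (assumed interpretation: 1 -> 0 and 0 -> 1 each count once)
n_change <- sum(diff(d$ATT_1) != 0)

print(round(pct_paid, 6))  # 0.414365
print(n_change)            # 2
```

Both values agree with the first row of the desired output.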
The period calculation should start from the first YEAR_MTH of each ID_CASE,
i.e., if the first YEAR_MTH for an ID_CASE is 201301 or 201304, then the
periods should be defined from that month onwards.
I have a data frame with 400 unique ID_CASE values, and I need to do this for each of them.
How can I do it in R?
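
For reference, a rough sketch of one possible approach in base R, not a definitive solution. It assumes YEAR_MTH is numeric in YYYYMM form, that changes in ATT_1 are counted between consecutive rows within each window, and that the last window starts two months before the last observed month. The helpers to_idx, to_ym and roll_one are hypothetical names, and the column is called PctPaid because %Paid is not a syntactic R name.

```r
# Sample data from the post (one ID_CASE only)
df <- data.frame(
  ID_CASE  = "CB26A",
  YEAR_MTH = c(201302, 201302, 201303, 201304, 201305, 201305,
               201306, 201307, 201309, 201310, 201310),
  ATT_1    = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1),
  A1       = c(146, 140, 128, 146, 134, 148, 134, 125, 122, 126, 107),
  A2       = c(42, 50, 36, 36, 36, 30, 20, 48, 44, 37, 43),
  A3       = c(74, 77, 77, 72, 80, 80, 72, 79, 74, 72, 75),
  stringsAsFactors = FALSE
)

# YYYYMM <-> consecutive month index, so windows roll over year ends
to_idx <- function(ym) (ym %/% 100) * 12 + (ym %% 100) - 1
to_ym  <- function(i)  (i %/% 12) * 100 + (i %% 12) + 1

# Hypothetical helper: rolling 3-month summary for one ID_CASE
roll_one <- function(d) {
  # Sort by month, keeping the original row order within a month
  d   <- d[order(d$YEAR_MTH, seq_len(nrow(d))), ]
  idx <- to_idx(d$YEAR_MTH)
  # Window starts: first month up to two months before the last month
  starts <- seq(min(idx), max(max(idx) - 2, min(idx)))
  rows <- lapply(starts, function(s) {
    w <- d[idx >= s & idx <= s + 2, ]
    data.frame(
      ID_CASE     = d$ID_CASE[1],
      Period      = paste(to_ym(s), to_ym(s + 2), sep = "-"),
      No.ofChange = sum(diff(w$ATT_1) != 0),
      # NaN if a window happens to contain no rows
      PctPaid     = sum(w$A3) / (sum(w$A1) + sum(w$A2)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Apply per ID_CASE and stack the results
result <- do.call(rbind, lapply(split(df, df$ID_CASE), roll_one))
rownames(result) <- NULL
print(result)
```

On the sample data this gives seven periods, from 201302-201304 to 201308-201310, matching the desired table above. With 400 ID_CASE values the same split/lapply call should apply unchanged, though I have not tested it at that size.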
Regards,
Abhinaba
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.