Re: [R] Creating a conditional lag variable in R

peter dalgaard Sat, 27 Jul 2019 01:34:05 -0700

Some pointers (not tested, may contain blunders...)

(a) you likely need some sort of split-operate-unsplit construct, by country. 
E.g.,


myfun <- function(d) {
  # ... operate on a data frame containing only one country ...
  d  # return the modified data frame
}
ll <- split(data, data$country)
ll.new <- lapply(ll, myfun)
data.new <- unsplit(ll.new, data$country)

(There might be a tidyverse idiom for this too)
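
To make the split-operate-unsplit idea in (a) concrete, here is a small sketch on made-up data. The data frame toy, the helper add_unsigned_count and the column unsigned_years are hypothetical names used only for illustration; they are not from the poster's data.

```r
# Toy data (hypothetical): two countries, one 0/1 agreement flag.
toy <- data.frame(
  country = factor(rep(c("A", "B"), each = 4)),
  year    = rep(1970:1973, times = 2),
  X1      = c(0L, 0L, 1L, 1L,  0L, 1L, 1L, 1L)
)

# Per-country operation: a running count of years without the agreement.
add_unsigned_count <- function(d) {
  d$unsigned_years <- cumsum(1L - d$X1)
  d
}

pieces     <- split(toy, toy$country)            # one data frame per country
pieces_new <- lapply(pieces, add_unsigned_count) # operate on each piece
toy_new    <- unsplit(pieces_new, toy$country)   # reassemble in original row order

# For a single derived column, ave() does the same grouping in one call.
via_ave <- ave(toy$X1, toy$country, FUN = function(x) cumsum(1L - x))
```

The unsplit() step needs the same grouping factor that was given to split(), so the rows come back in their original order even when countries are interleaved.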

(b) your X1_pre5_count looks like a running count inside the 5-year window, 
i.e. cumsum(X1_pre5)*X1_pre5 (within country), with X1_pre5 built as in (c)
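
A quick single-country check of the count, as a sketch only. It assumes the count should run 1, 2, 3, ... inside the five-year window and be 0 elsewhere, as in the poster's hand-made column, and that X1 is 0/1, sorted by year, and stays 1 once signed. The helper name count_in_window is hypothetical.

```r
# Hypothetical helper: running count of years inside the k-year window
# before the agreement; 0 outside the window.
count_in_window <- function(x, k) {
  tt <- rev(cumsum(rev(1 - x)))            # years until agreement, 0 once signed
  in_window <- as.integer(tt <= k & tt > 0)
  cumsum(in_window) * in_window            # count restarts are not handled
}

long_wait  <- c(0, 0, 0, 0, 0, 0, 0, 1, 1) # 7 unsigned years, then signed
short_wait <- c(0, 0, 1, 1, 1)             # only 2 unsigned years observed

count_long  <- count_in_window(long_wait, 5)   # 0 0 1 2 3 4 5 0 0
count_short <- count_in_window(short_wait, 5)  # 1 2 0 0 0
```

The second series mirrors the case where fewer than five pre-agreement years are observed: the count simply starts at the first available year.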

(c) if you count in the opposite direction, tt <- rev(cumsum(rev(1-X1))), you 
get the number of years until agreement. Then X1_pre4 should be 
as.integer(tt <= 4 & tt > 0)
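
The reversed-cumsum idea in (c) could be wrapped up and applied by country roughly as below. This is a sketch under assumptions: pre_window and toy are hypothetical names, the data are made up, X1 is taken to be 0/1, sorted by year within country, and to stay 1 once signed. The guard for a country that never signs is my own addition; without it, tt would count down to the end of the series and flag the last k years.

```r
# Hypothetical helper: 1 in the k years before the first 1 in x, else 0.
pre_window <- function(x, k) {
  if (!any(x == 1)) return(integer(length(x)))  # never signed: no window (assumption)
  tt <- rev(cumsum(rev(1 - x)))   # years until agreement; 0 from signing onward
  as.integer(tt <= k & tt > 0)
}

# Toy data (hypothetical): A signs in 1972, B in 1976, C never.
toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 7),
  year    = rep(1970:1976, times = 3),
  X1      = c(0, 0, 1, 1, 1, 1, 1,
              0, 0, 0, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 0, 0)
)

# ave() applies the helper within each country and keeps row order.
toy$X1_pre4 <- ave(toy$X1, toy$country, FUN = function(x) pre_window(x, 4))
toy$X1_pre5 <- ave(toy$X1, toy$country, FUN = function(x) pre_window(x, 5))
```

Country A has only two pre-agreement years, so both indicators are 1 for 1970 and 1971, which matches the behaviour the poster asked for.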

-pd

> On 27 Jul 2019, at 09:13 , Faradj Koliev <farad...@gmail.com> wrote:
> 
> Re-post, now in *plain text*. 
> 
> 
> 
> Dear R-users, 
> 
> I’ve a rather complicated task to do and need all the help I can get. 
> 
> I have data indicating whether a country has signed an agreement or not 
> (1=yes and 0=otherwise). I simply want to create a variable that captures 
> the years before the agreement is signed. The aim is to see whether the pre- or 
> post-agreement period has any impact on my dependent variables. 
> 
> More precisely, I want to create the following variables: 
> (i) a variable that is =1 in the 4 years before the agreement, 0 
> otherwise; 
> (ii) a variable that is =1 in the 5 years before the agreement; and 
> (iii) a variable that counts the years within the 4- and 5-year windows 
> before the agreement (1,2,3,4..). 
> 
> Please see the sample data below. I have manually added the variables I would 
> like to generate in R, labelled “X1_pre4” (4 years before agreement 
> X1), “X2_pre4”, “X1_pre5” (5 years before agreement X1), and 
> “X1_pre5_count” (which basically counts the years, 1,2,3, etc). X1 and X2 
> are the agreements that countries have either signed (1) or not (0). Note 
> though that I want the variables to capture all the years up to 4 and 5. If 
> only 2 years are available, they should still be ==1 (please see the example below). 
> 
> To illustrate the logic: country A signed agreement X1 in 1972 in 
> the sample data, so variables (i) and (ii) above should be =1 for 
> the years 1970 and 1971, and =0 from 1972 until the end of the study period. 
> 
> Country A signed agreement X2 in 1975, so variable (i) 
> should be =1 from 1971 to 1974 (the 4 years before) and (ii) should be =1 for the 
> 1970-1974 period (the 5 years before the agreement is signed). 
> 
> Later, I would also like to create post_4 and post_5 variables, but I think 
> I’ll be able to figure it out once I know how to generate the pre/before 
> variables. 
> 
> All suggestions are much appreciated! 
> 
> 
> 
> data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
> 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
> 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
>    year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 
>    1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 
>    1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 
>    1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 
>    1985L, 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 
>    1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 
>    1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L), 
>    X1 = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
>    1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
>    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>    1L, 1L), X2 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 
>    1L, 1L, 1L, 1L), X1_pre4 = c(1L, 1L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X2_pre4 = c(0L, 1L, 1L, 
>    1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
>    1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X1_pre5 = c(1L, 
>    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
>    X1_pre5_count = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 
>    4L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 4L, 5L, 0L, 0L, 0L, 
>    0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
> -60L))
> 
>> On 26 Jul 2019, at 21:58, Bert Gunter <bgunter.4...@gmail.com> wrote:
>> 
>> Because you posted in HTML, your example got mangled and resulted in an 
>> error. Re-post in *plain text* please (making sure that you cut and paste 
>> correctly)
>> 
>> Bert Gunter
>> 
>> "The trouble with having an open mind is that people keep coming along and 
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> 
>> 
>> On Fri, Jul 26, 2019 at 12:25 PM Faradj Koliev <farad...@gmail.com> wrote:
>> [...]
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com

