Re: [R] Creating a before-and-after variable in R

2019-10-03 Thread Faradj Koliev
Thank you very much for your help!

All the best, 
Faradj 

> On 3 Oct 2019, at 16:37, Eric Berger wrote:
> 
> You can replace the last line in my first suggestion by the following two 
> lines
> 
> d <- 2014  # the default (set by the user)
> a$treatment <- sapply( 1:nrow(a), function(i) { b <- v[a$country_code[i]]; 
> a$year[i] - ifelse(is.na(b),d,b)})
> 
> Best,
> Eric
> 
> 
> 
> 
> 
> On Thu, Oct 3, 2019 at 5:19 PM Faradj Koliev  wrote:
> Hi, 
> 
> I was thinking that it could simply show the negative counts. For ex: if a 
> country hasn’t introduced the policy X, and it's in the dataset from 1982 to 
> 2014, then the treatment variable would take a value -33 in 1982 and -1 in 
> 2014. 
> 
> Best, 
> Faradj 
> 
> 
> > On 3 Oct 2019, at 16:11, Eric Berger wrote:
> > 
> > Hi Faradj,
> > What should the treatment variable be in those cases? If you want to set it 
> > to a constant y (such as y=0), you can add something like
> > 
> > y <- 0
> > a$treatment[ is.na(a$treatment) ] <- y
> > 
> > HTH,
> > Eric
> > 
> > 
> > On Thu, Oct 3, 2019 at 4:54 PM Faradj Koliev  wrote:
> > Dear Eric, 
> > 
> > Thank you very much for this - it worked perfectly! 
> > 
> > A small thing: I wonder whether it’s possible to include those cases where 
> > x = 0 for the whole study period. I have countries with x = 0 for the 
> > whole period, and the treatment variable is NA for these observations. 
> > 
> > Best, 
> > Faradj 
> > 
> > 
> > > On 3 Oct 2019, at 15:18, Eric Berger wrote:
> > > 
> > > Hi Faradj,
> > > Suppose your data frame is labeled 'a'. Then the following seems to do 
> > > what you want.
> > > 
> > > v <- rep(NA_integer_,max(a$country_code))
> > > v[ a$country_code[a$x==1] ] <- a$year[a$x==1]
> > > a$treatment <- sapply( 1:nrow(a), function(i) { a$year[i] - 
> > > v[a$country_code[i]]})
> > > 
> > > HTH,
> > > Eric
> > > 
> > > 
> > > On Thu, Oct 3, 2019 at 3:36 PM Faradj Koliev  wrote:
> > > Dear Michael Dewey, 
> > > 
> > > Thanks for reaching out about this. I am trying again, now in plain text, 
> > > and hope it works. 
> > > 
> > > Best, 
> > > Faradj 
> > > 
> > > 
> > > 
> > > Dear R-users, 
> > > 
> > > I urgently need help with the following: I have country-year data 
> > > covering the period 1982-2013. I want to assess how the variable X (a 
> > > certain policy) affects the Y variable. X is 1 in the year a country 
> > > introduces that policy and 0 otherwise. 
> > > 
> > > What I want to do is create a treatment variable that is a negative 
> > > count in the years before X=1 and a positive count in the years 
> > > after X=1. 
> > > 
> > > For example, let’s say that the U.S. introduced policy X in 2000. 
> > > The treatment variable would look like this: 
> > > 
> > > country  year  x  treatment
> > > USA      1982  0        -18
> > > USA      1983  0        -17
> > > USA      1984  0        -16
> > > USA      1985  0        -15
> > > USA      1986  0        -14
> > > USA      1987  0        -13
> > > USA      1988  0        -12
> > > USA      1989  0        -11
> > > USA      1990  0        -10
> > > USA      1991  0         -9
> > > USA      1992
> > &

Re: [R] Creating a before-and-after variable in R

2019-10-03 Thread Faradj Koliev
Hi, 

I was thinking that it could simply show the negative counts. For ex: if a 
country hasn’t introduced the policy X, and it's in the dataset from 1982 to 
2014, then the treatment variable would take a value -33 in 1982 and -1 in 
2014. 
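
The arithmetic in this example can be checked with a short sketch. It assumes a reference year of 2015, one past the last observed year; that value is not stated in the thread and is only inferred from the -33 and -1 endpoints. With a reference year of 2014 the counts would instead run from -32 to 0.

```r
# Years in which a never-treated country is observed (from the example above).
years <- 1982:2014

# Assumed reference year: one past the last observed year (hypothetical choice).
ref_year <- max(years) + 1L

# Signed count relative to the reference year.
treatment <- years - ref_year

range(treatment)
```

Choosing the reference year is a modelling decision; the sketch only shows how the endpoints follow from it.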

Best, 
Faradj 


> On 3 Oct 2019, at 16:11, Eric Berger wrote:
> 
> Hi Faradj,
> What should the treatment variable be in those cases? If you want to set it 
> to a constant y (such as y=0), you can add something like
> 
> y <- 0
> a$treatment[ is.na(a$treatment) ] <- y
> 
> HTH,
> Eric
> 
> 
> On Thu, Oct 3, 2019 at 4:54 PM Faradj Koliev  wrote:
> Dear Eric, 
> 
> Thank you very much for this - it worked perfectly! 
> 
> A small thing: I wonder whether it’s possible to include those cases where 
> x = 0 for the whole study period. I have countries with x = 0 for the 
> whole period, and the treatment variable is NA for these observations. 
> 
> Best, 
> Faradj 
> 
> 
> > On 3 Oct 2019, at 15:18, Eric Berger wrote:
> > 
> > Hi Faradj,
> > Suppose your data frame is labeled 'a'. Then the following seems to do what 
> > you want.
> > 
> > v <- rep(NA_integer_,max(a$country_code))
> > v[ a$country_code[a$x==1] ] <- a$year[a$x==1]
> > a$treatment <- sapply( 1:nrow(a), function(i) { a$year[i] - 
> > v[a$country_code[i]]})
> > 
> > HTH,
> > Eric
> > 
> > 
> > On Thu, Oct 3, 2019 at 3:36 PM Faradj Koliev  wrote:
> > Dear Michael Dewey, 
> > 
> > Thanks for reaching out about this. I am trying again, now in plain text, 
> > and hope it works. 
> > 
> > Best, 
> > Faradj 
> > 
> > 
> > 
> > Dear R-users, 
> > 
> > I urgently need help with the following: I have country-year data 
> > covering the period 1982-2013. I want to assess how the variable X (a 
> > certain policy) affects the Y variable. X is 1 in the year a country 
> > introduces that policy and 0 otherwise. 
> > 
> > What I want to do is create a treatment variable that is a negative 
> > count in the years before X=1 and a positive count in the years 
> > after X=1. 
> > 
> > For example, let’s say that the U.S. introduced policy X in 2000. 
> > The treatment variable would look like this: 
> > 
> > country  year  x  treatment
> > USA      1982  0        -18
> > USA      1983  0        -17
> > USA      1984  0        -16
> > USA      1985  0        -15
> > USA      1986  0        -14
> > USA      1987  0        -13
> > USA      1988  0        -12
> > USA      1989  0        -11
> > USA      1990  0        -10
> > USA      1991  0         -9
> > USA      1992  0         -8
> > USA      1993  0         -7
> > USA      1994  0         -6
> > USA      1995  0         -5
> > USA      1996  0         -4
> > USA      1997  0         -3
> > USA      1998  0         -2
> > USA      1999  0         -1
> > USA      2000  1          0
> > USA      2001  0          1
> > USA      2002  0          2
> > USA      2003  0          3
> > USA      2004  0          4
> > USA      2005  0          5
> > 
> > USA
> > 
>

Re: [R] Creating a before-and-after variable in R

2019-10-03 Thread Faradj Koliev
Dear Eric, 

Thank you very much for this - it worked perfectly! 

A small thing: I wonder whether it’s possible to include those cases where 
x = 0 for the whole study period. I have countries with x = 0 for the whole 
period, and the treatment variable is NA for these observations. 
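
A small sketch of why such countries come out as NA with the lookup-vector approach quoted below, and of one way to fill them in. The toy data frame, the country codes 3 and 5, and the default year `d` are made up for illustration; the subtraction is written in vectorized form rather than with sapply(), which is a variation on the quoted code and not a copy of it.

```r
# Toy data: country 3 adopts the policy in 2001, country 5 never does.
a <- data.frame(
  country_code = rep(c(3L, 5L), each = 4),
  year         = rep(2000:2003, times = 2),
  x            = c(0L, 1L, 0L, 0L,
                   0L, 0L, 0L, 0L)
)

# Lookup vector indexed by country code: adoption year, NA if never adopted.
v <- rep(NA_integer_, max(a$country_code))
v[a$country_code[a$x == 1]] <- a$year[a$x == 1]

# Countries without an adoption year keep NA here.
onset <- v[a$country_code]
a$treatment <- a$year - onset

# Hypothetical default reference year for never-treated countries.
d <- 2004L
a$treatment_filled <- a$year - ifelse(is.na(onset), d, onset)
```

Country 5 has no row with x == 1, so its slot in `v` is never filled and the subtraction propagates the NA; substituting a default year removes it.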

Best, 
Faradj 


> On 3 Oct 2019, at 15:18, Eric Berger wrote:
> 
> Hi Faradj,
> Suppose your data frame is labeled 'a'. Then the following seems to do what 
> you want.
> 
> v <- rep(NA_integer_,max(a$country_code))
> v[ a$country_code[a$x==1] ] <- a$year[a$x==1]
> a$treatment <- sapply( 1:nrow(a), function(i) { a$year[i] - 
> v[a$country_code[i]]})
> 
> HTH,
> Eric
> 
> 
> On Thu, Oct 3, 2019 at 3:36 PM Faradj Koliev  wrote:
> Dear Michael Dewey, 
> 
> Thanks for reaching out about this. I am trying again, now in plain text, and 
> hope it works. 
> 
> Best, 
> Faradj 
> 
> 
> 
> Dear R-users, 
> 
> I urgently need help with the following: I have country-year data covering 
> the period 1982-2013. I want to assess how the variable X (a certain 
> policy) affects the Y variable. X is 1 in the year a country 
> introduces that policy and 0 otherwise. 
> 
> What I want to do is create a treatment variable that is a negative 
> count in the years before X=1 and a positive count in the years after X=1. 
> 
> For example, let’s say that the U.S. introduced policy X in 2000. 
> The treatment variable would look like this: 
> 
> country  year  x  treatment
> USA      1982  0        -18
> USA      1983  0        -17
> USA      1984  0        -16
> USA      1985  0        -15
> USA      1986  0        -14
> USA      1987  0        -13
> USA      1988  0        -12
> USA      1989  0        -11
> USA      1990  0        -10
> USA      1991  0         -9
> USA      1992  0         -8
> USA      1993  0         -7
> USA      1994  0         -6
> USA      1995  0         -5
> USA      1996  0         -4
> USA      1997  0         -3
> USA      1998  0         -2
> USA      1999  0         -1
> USA      2000  1          0
> USA      2001  0          1
> USA      2002  0          2
> USA      2003  0          3
> USA      2004  0          4
> USA      2005  0          5
> USA      2006  0          6
> USA      2007  0          7
> USA      2008  0          8
> USA      2009  0          9
> USA      2010  0         10
> USA      2011  0         11
> USA      2012  0         12
> USA      2013  0         13
> 
> 
> 
> Do you have any idea how I can generate this? All suggestions are 
> appreciated!
> 
> 
> I’ve tried to create it but failed. I could only generate positive counts 
> using this code: 
> require(data.table)
> setDT(data)[,treatment := seq.int(0,.N-1L), by = cumsum(x)-x]
> 
> My sample below: 
> dput(data)
> structure(list(country_code = c(900L, 900L, 900L, 900L, 900L, 
> 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 
> 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 
> 900L, 900L, 900L, 900L, 900L, 305L, 305L, 305L, 305L, 305L, 305L, 
> 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 
> 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 
> 305L, 305L, 305L, 305L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
> 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
> 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
> 140L, 140L, 140L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
> 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
> 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
> 471L, 471L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
> 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
> 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
> 352L, 490L, 490L, 490L, 490L, 490L

Re: [R] Creating a before-and-after variable in R

2019-10-03 Thread Faradj Koliev
, 12, 12, 12, 12, 12, 12), x = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L)), .Names = c("country_code", "year", "y", "x"), class = "data.frame", 
row.names = c(NA, 
-722L))




> On 3 Oct 2019, at 14:24, Michael Dewey wrote:
> 
> Dear Faradj
> 
> I am afraid your post is unreadable since this is a plain text list and you 
> sent in HTML.
> 
> Michael
> 
> On 03/10/2019 12:17, Faradj Koliev wrote:
>> Dear R-users,
>> I urgently need help with the following: I have country-year data 
>> covering the period 1982-2013. I want to assess how the variable X (a 
>> certain policy) affects the Y variable. X is 1 in the year a country 
>> introduces that policy and 0 otherwise.
>> What I want to do is create a treatment variable that is a negative 
>> count in the years before X=1 and a positive count in the years after 
>> X=1.
>> For example, let’s say that the U.S. introduced policy X in 2000. 
>> The treatment variable would look like this:
>> country  year  x  treatment
>> USA      1982  0        -18
>> USA      1983  0        -17
>> USA      1984  0        -16
>> USA      1985  0        -15
>> USA      1986  0        -14
>> USA      1987  0        -13
>> USA      1988  0        -12
>> USA      1989  0        -11
>> USA      1990  0        -10
>> USA      1991  0         -9
>> USA      1992  0         -8
>> USA      1993  0         -7
>> USA      1994  0         -6
>> USA      1995  0         -5
>> USA      1996  0         -4
>> USA      1997  0         -3
>> USA      1998  0         -2
>> USA      1999  0         -1
>

[R] Creating a before-and-after variable in R

2019-10-03 Thread Faradj Koliev
Dear R-users, 

I urgently need help with the following: I have country-year data covering 
the period 1982-2013. I want to assess how the variable X (a certain policy) 
affects the Y variable. X is 1 in the year a country introduces that 
policy and 0 otherwise. 

What I want to do is create a treatment variable that is a negative 
count in the years before X=1 and a positive count in the years after X=1. 

For example, let’s say that the U.S. introduced policy X in 2000. The 
treatment variable would look like this: 
 
country  year  x  treatment
USA      1982  0        -18
USA      1983  0        -17
USA      1984  0        -16
USA      1985  0        -15
USA      1986  0        -14
USA      1987  0        -13
USA      1988  0        -12
USA      1989  0        -11
USA      1990  0        -10
USA      1991  0         -9
USA      1992  0         -8
USA      1993  0         -7
USA      1994  0         -6
USA      1995  0         -5
USA      1996  0         -4
USA      1997  0         -3
USA      1998  0         -2
USA      1999  0         -1
USA      2000  1          0
USA      2001  0          1
USA      2002  0          2
USA      2003  0          3
USA      2004  0          4
USA      2005  0          5
USA      2006  0          6
USA      2007  0          7
USA      2008  0          8
USA      2009  0          9
USA      2010  0         10
USA      2011  0         11
USA      2012  0         12
USA      2013  0         13



Do you have any idea how I can generate this? All suggestions are 
appreciated!


I’ve tried to create it but failed. I could only generate positive counts using 
this code: 
require(data.table)
setDT(data)[,treatment := seq.int(0,.N-1L), by = cumsum(x)-x]
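
The data.table call above is not grouped by country and builds a non-negative running count within blocks delimited by the x == 1 rows, which is why only positive counts appear. A minimal base-R sketch of the signed count follows. It assumes one adoption year per country (the earliest is used if there are several), and the data frame `toy` and its country codes are invented for illustration.

```r
# Toy data: two hypothetical countries observed 1998-2003.
toy <- data.frame(
  country_code = rep(c(1L, 2L), each = 6),
  year         = rep(1998:2003, times = 2),
  x            = c(0L, 0L, 1L, 0L, 0L, 0L,   # country 1 adopts in 2000
                   0L, 0L, 0L, 0L, 1L, 0L)   # country 2 adopts in 2002
)

# Year in which x == 1, NA in all other rows.
flagged_year <- ifelse(toy$x == 1L, toy$year, NA_integer_)

# Broadcast each country's adoption year to all of its rows
# (NA for a country that never adopts).
adoption_year <- ave(flagged_year, toy$country_code, FUN = function(y) {
  if (all(is.na(y))) NA_integer_ else min(y, na.rm = TRUE)
})

# Negative before adoption, 0 in the adoption year, positive afterwards.
toy$treatment <- toy$year - adoption_year
```

Subtracting the adoption year from the calendar year gives the before-and-after count directly, so no separate handling of the negative and positive sides is needed.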

My sample below: 
dput(data)
structure(list(country_code = c(900L, 900L, 900L, 900L, 900L, 
900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 
900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 900L, 
900L, 900L, 900L, 900L, 900L, 305L, 305L, 305L, 305L, 305L, 305L, 
305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 
305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 305L, 
305L, 305L, 305L, 305L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
140L, 140L, 140L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 471L, 
471L, 471L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 352L, 
352L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 
490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 
490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 490L, 
375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 
375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 
375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 375L, 220L, 
220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 
220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 
220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 220L, 481L, 481L, 
481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 
481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 
481L, 481L, 481L, 481L, 481L, 481L, 481L, 481L, 367L, 367L, 367L, 
367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 
367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 367L, 570L, 570L, 
570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 
570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 
570L, 570L, 570L, 570L, 570L, 570L, 570L, 570L, 212L, 212L, 212L, 
212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 
212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 212L, 
212L, 212L, 212L, 212L, 212L, 212L, 212L, 359L, 359L, 359L, 359L, 
359L, 359L, 359L, 359L, 359L, 359L, 359L, 359L, 359L, 359L, 359L, 
359L, 359L, 359L, 359L, 359L, 359L, 359L, 359L, 600L, 600L, 600L, 
600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 
600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 600L, 
600L, 600L, 600L, 600L, 600L, 600L, 600L, 565L, 565L, 565L, 565L, 
565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 
565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 565L, 235L, 235L, 
235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 
235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 
235L, 235L, 235L, 235L, 235L, 235L, 235L, 235L, 317L, 317L, 317L, 
317L, 317L, 317L, 317L, 317L, 317L, 317L, 317L, 317L, 317L, 317L, 
317L, 317L, 317L, 317L, 317L, 317L, 317L, 230L, 230L, 230L, 230L, 
230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 
230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 230L, 
230L, 230L, 230L, 230L, 230L, 230L, 380L, 380L, 380L, 380L, 380L, 
380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 
380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 380L, 
380L, 380L, 380L, 380L, 380L, 640L, 640L, 

Re: [R] Creating a conditional lag variable in R

2019-07-27 Thread Faradj Koliev
Thank you all. I now have the right solution for this (perhaps of interest to 
some): 

library(dplyr)

# Flag the (up to) k rows before each onset row (idx == 1); onset rows get 0.
check_pre <- function(idx, k) {
  pre_vec <- sapply(seq_along(idx), function(x) {
    +any(idx[x:min(x + k, length(idx))] %in% 1)
  })
  pre_vec[idx == 1] <- 0
  pre_vec
}

df %>%
  group_by(country) %>%
  mutate(
    # idx = 1 in the year X1 switches from 0 to 1 (or is already 1 in row 1)
    idx = +((lag(X1) == 0 & X1 == 1) | row_number() == 1 & X1 == 1),
    X1_pre4 = check_pre(idx, 4),
    X1_pre5 = check_pre(idx, 5),
    idx = NULL
  )
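
For comparison, a base-R sketch along the lines of the split-operate-unsplit and reverse-cumsum pointers quoted further down in this message. It is an illustration on made-up data (`toy` and `add_pre` are hypothetical names), not the solution posted above, and it assumes X1 stays at 1 once an agreement is signed.

```r
# Toy data: country A signs in 1972, country B in 1976.
toy <- data.frame(
  country = rep(c("A", "B"), each = 8),
  year    = rep(1970:1977, times = 2),
  X1      = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
              0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L)
)

# Works on the rows of a single country, in chronological order.
add_pre <- function(d, k) {
  # Number of years left until the agreement (0 once it is signed).
  tt <- rev(cumsum(rev(1L - d$X1)))
  # Inside the k-year window before signing; never true if the country never signs.
  in_window <- tt > 0 & tt <= k & any(d$X1 == 1L)
  d[[paste0("X1_pre", k)]] <- as.integer(in_window)
  # Running count 1, 2, ... inside the window, 0 elsewhere.
  d[[paste0("X1_pre", k, "_count")]] <- cumsum(in_window) * in_window
  d
}

pieces  <- split(toy, toy$country)
toy_new <- unsplit(lapply(pieces, add_pre, k = 4), toy$country)
```

Calling `add_pre` again with `k = 5` on each piece would add the 5-year variants in the same way.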


> On 27 Jul 2019, at 10:45, Faradj Koliev  wrote:
> 
> Peter Dalgaard, 
> 
> Thanks for this. 
> 
> I’ll try to think of ways to apply this logic. At the moment, I’m trying to 
> do this with “mutate” using dplyr package. But it’s not easy..
> 
>> On 27 Jul 2019, at 10:33, peter dalgaard  wrote:
>> 
>> Some pointers (not tested, may contain blunders...)
>> 
>> (a) you likely need some sort of split-operate-unsplit construct, by 
>> country. E.g.,
>> 
>> myfun <- function(d) {
>>   # operate on a data frame holding only one country, then return it
>>   d
>> }
>> ll <- split(data, data$country)
>> ll.new <- lapply(ll, myfun)
>> data.new <- unsplit(ll.new, data$country)
>> 
>> (There might be a tidyverse idiom for this too)
>> 
>> (b) your X1_pre5count looks like it is the same as cumsum(1-X1)*X1 (within 
>> country)
>> 
>> (c) if you count in the opposite direction, tt <- rev(cumsum(rev(1-X1))) you 
>> get number of years until agreement. Then X1_pre4 should be as.integer(tt 
>> <=4  & tt > 0)
>> 
>> -pd
>> 
>>> On 27 Jul 2019, at 09:13 , Faradj Koliev  wrote:
>>> 
>>> Re-post, now in *plain text*. 
>>> 
>>> 
>>> 
>>> Dear R-users, 
>>> 
>>> I have a rather complicated task and need all the help I can get. 
>>> 
>>> I have data indicating whether a country has signed an agreement or not 
>>> (1=yes and 0=otherwise). I simply want to create a variable that 
>>> captures the years before the agreement is signed. The aim is to see whether 
>>> the pre- or post-agreement period has any impact on my dependent variables. 
>>> 
>>> More precisely, I want to create the following variables: 
>>> (i) a variable that is =1 in the 4 years before the agreement, 0 
>>> otherwise; 
>>> (ii) a variable that is =1 in the 5 years before the agreement, 0 otherwise; and 
>>> (iii) a variable that counts the years within those 4- and 5-year windows 
>>> (1, 2, 3, 4, ...). 
>>> 
>>> Please see the sample data below. I have manually added the variables I 
>>> would like to generate in R, labelled “X1_pre4” (4 years before the 
>>> agreement X1), “X2_pre4”, “X1_pre5” (5 years before the agreement X1), 
>>> and “X1_pre5_count” (which simply counts the years: 1, 2, 3, etc.). X1 
>>> and X2 are the agreements that countries have either signed (1) or not (0). 
>>> Note that I want the variables to capture all available years up to 4 or 
>>> 5. If only 2 years are available, they should still be =1 (please see the 
>>> example below). 
>>> 
>>> To illustrate the logic: country A signed agreement X1 in 1972 
>>> in the sample data, so variables (i) and (ii) above should be =1 
>>> for the years 1970 and 1971, and =0 from 1972 until the end of the study 
>>> period. 
>>> 
>>> Country A signed agreement X2 in 1975, so variable (i) 
>>> should be =1 from 1971 to 1974 (the 4 years before) and (ii) should be =1 for the 
>>> 1970-1974 period (the 5 years before the agreement is signed). 
>>> 
>>> Later, I would also like to create post_4 and post_5 variables, but I think 
>>> I’ll be able to figure that out once I know how to generate the pre/before 
>>> variables. 
>>> 
>>> All suggestions are much appreciated! 
>>> 
>>> 
>>> 
>>> data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
>>> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
>>> 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
>>> 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
>>>  year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 
>>>  1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 
>>>  1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 
>>>  1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 
>>>  1985L

Re: [R] Creating a conditional lag variable in R

2019-07-27 Thread Faradj Koliev
Peter Dalgaard, 

Thanks for this. 

I’ll try to think of ways to apply this logic. At the moment, I’m trying to do 
this with “mutate” using dplyr package. But it’s not easy..

> On 27 Jul 2019, at 10:33, peter dalgaard  wrote:
> 
> Some pointers (not tested, may contain blunders...)
> 
> (a) you likely need some sort of split-operate-unsplit construct, by country. 
> E.g.,
> 
> myfun <- function(d) {
>   # operate on a data frame holding only one country, then return it
>   d
> }
> ll <- split(data, data$country)
> ll.new <- lapply(ll, myfun)
> data.new <- unsplit(ll.new, data$country)
> 
> (There might be a tidyverse idiom for this too)
> 
> (b) your X1_pre5count looks like it is the same as cumsum(1-X1)*X1 (within 
> country)
> 
> (c) if you count in the opposite direction, tt <- rev(cumsum(rev(1-X1))) you 
> get number of years until agreement. Then X1_pre4 should be as.integer(tt <=4 
>  & tt > 0)
> 
> -pd
> 
>> On 27 Jul 2019, at 09:13 , Faradj Koliev  wrote:
>> 
>> Re-post, now in *plain text*. 
>> 
>> 
>> 
>> Dear R-users, 
>> 
>> I have a rather complicated task and need all the help I can get. 
>> 
>> I have data indicating whether a country has signed an agreement or not 
>> (1=yes and 0=otherwise). I simply want to create a variable that captures 
>> the years before the agreement is signed. The aim is to see whether the pre- or 
>> post-agreement period has any impact on my dependent variables. 
>> 
>> More precisely, I want to create the following variables: 
>> (i) a variable that is =1 in the 4 years before the agreement, 0 
>> otherwise; 
>> (ii) a variable that is =1 in the 5 years before the agreement, 0 otherwise; and 
>> (iii) a variable that counts the years within those 4- and 5-year windows 
>> (1, 2, 3, 4, ...). 
>> 
>> Please see the sample data below. I have manually added the variables I 
>> would like to generate in R, labelled “X1_pre4” (4 years before the 
>> agreement X1), “X2_pre4”, “X1_pre5” (5 years before the agreement X1), and 
>> “X1_pre5_count” (which simply counts the years: 1, 2, 3, etc.). X1 and X2 
>> are the agreements that countries have either signed (1) or not (0). Note 
>> that I want the variables to capture all available years up to 4 or 5. If 
>> only 2 years are available, they should still be =1 (please see the example below). 
>> 
>> To illustrate the logic: country A signed agreement X1 in 1972 
>> in the sample data, so variables (i) and (ii) above should be =1 
>> for the years 1970 and 1971, and =0 from 1972 until the end of the study 
>> period. 
>> 
>> Country A signed agreement X2 in 1975, so variable (i) 
>> should be =1 from 1971 to 1974 (the 4 years before) and (ii) should be =1 for the 
>> 1970-1974 period (the 5 years before the agreement is signed). 
>> 
>> Later, I would also like to create post_4 and post_5 variables, but I think 
>> I’ll be able to figure that out once I know how to generate the pre/before 
>> variables. 
>> 
>> All suggestions are much appreciated! 
>> 
>> 
>> 
>> data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
>> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
>> 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
>> 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
>>   year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 
>>   1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 
>>   1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 
>>   1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 
>>   1985L, 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 
>>   1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 
>>   1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L), 
>>   X1 = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
>>   1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
>>   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
>>   0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>>   1L, 1L), X2 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>>   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
>>   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 
>>   0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 
>>   1L, 1L, 1L, 1L), X1_pre4 = c(1L, 1L, 0L, 0L, 0L, 0L, 0L, 
>>   0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,

Re: [R] Creating a conditional lag variable in R

2019-07-27 Thread Faradj Koliev
Re-post, now in *plain text*. 



Dear R-users, 

I have a rather complicated task and need all the help I can get. 

I have data indicating whether a country has signed an agreement or not (1=yes 
and 0=otherwise). I simply want to create a variable that captures the years 
before the agreement is signed. The aim is to see whether the pre- or post-agreement 
period has any impact on my dependent variables. 

More precisely, I want to create the following variables: 
(i) a variable that is =1 in the 4 years before the agreement, 0 otherwise; 
(ii) a variable that is =1 in the 5 years before the agreement, 0 otherwise; and 
(iii) a variable that counts the years within those 4- and 5-year windows 
(1, 2, 3, 4, ...). 

Please see the sample data below. I have manually added the variables I would 
like to generate in R, labelled “X1_pre4” (4 years before the agreement 
X1), “X2_pre4”, “X1_pre5” (5 years before the agreement X1), and 
“X1_pre5_count” (which simply counts the years: 1, 2, 3, etc.). X1 and X2 are 
the agreements that countries have either signed (1) or not (0). Note 
that I want the variables to capture all available years up to 4 or 5. If only 2 
years are available, they should still be =1 (please see the example below). 

To illustrate the logic: country A signed agreement X1 in 1972 in 
the sample data, so variables (i) and (ii) above should be =1 for 
the years 1970 and 1971, and =0 from 1972 until the end of the study period. 

Country A signed agreement X2 in 1975, so variable (i) 
should be =1 from 1971 to 1974 (the 4 years before) and (ii) should be =1 for the 
1970-1974 period (the 5 years before the agreement is signed). 

Later, I would also like to create post_4 and post_5 variables, but I think 
I’ll be able to figure that out once I know how to generate the pre/before 
variables. 

All suggestions are much appreciated! 



data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 
1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 
1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 
1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 
1985L, 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 
1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 
1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L), 
X1 = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), X2 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), X1_pre4 = c(1L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X2_pre4 = c(0L, 1L, 1L, 
1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X1_pre5 = c(1L, 
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
X1_pre5_count = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 
4L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 4L, 5L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-60L))

> On 26 Jul 2019, at 21:58, Bert Gunter  wrote:
> 
> Because you posted in HTML, your example got mangled and resulted in an 
> error. Re-post in *plain text* please (making sure that you cut and paste 
> correctly)
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Fri, Jul 26, 2019 at 12:25 PM Faradj Koliev  wrote:
> Dear R-users, 
> 

[R] Creating a conditional lag variable in R

2019-07-26 Thread Faradj Koliev
Dear R-users, 

I have a rather complicated task to do and need all the help I can get. 

I have data indicating whether a country has signed an agreement or not (1=yes 
and 0=otherwise). I simply want to create a variable that captures the years 
before the agreement is signed. The aim is to see whether the pre- or post-agreement 
period has any impact on my dependent variables. 

More precisely, I want to create the following variables: 
(i) a variable that is =1 in the 4 years before the agreement, 0 otherwise; 
(ii) a variable that is =1 in the 5 years before the agreement, 0 otherwise; and 
(iii) a variable that counts the years within those 4- and 5-year windows 
(1,2,3,4..). 

Please see the sample data below. I have manually added the variables I would 
like to generate in R, labelled as “X1_pre4” (4 years before the agreement 
X1), “X2_pre4”, “X1_pre5” (5 years before the agreement X1), and 
“X1_pre5_count” (which counts the years, 1,2,3, etc). X1 and X2 are 
the agreements that countries have either signed (1) or not (0). Note though 
that I want the variable to capture all available years up to 4 or 5. If only 2 
years are available before signing, they should still be ==1 (please see the example below). 

To illustrate the logic: the country A has signed the agreement X1 in 1972 in 
the sample data,  then, the (i) and (ii) variables as above should be =1 for 
the years 1970, 1971, and =0 from 1972 until the end of the study period. 

Country A signed the agreement X2 in 1975, so the (i) variable 
should be =1 from 1971 to 1974 (the 4 years before signing) and (ii) should be =1 for the 
1970-1974 period (the 5 years before the agreement is signed). 
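
The rule described above can be sketched in base R as follows. This is only a 
minimal illustration, not a reply from the thread: the toy data frame d and the 
helper names first_year() and pre_window() are hypothetical, and the sketch 
assumes one row per country-year and that an agreement stays signed once X 
turns to 1. 

```r
# Toy data (hypothetical): country A signs X1 in 1972 and X2 in 1975
d <- data.frame(
  country = "A",
  year    = 1970:1978,
  X1      = as.integer(1970:1978 >= 1972),
  X2      = as.integer(1970:1978 >= 1975)
)

# First year in which the agreement indicator equals 1, per country
# (NA if the country never signs)
first_year <- function(year, signed, group) {
  ave(ifelse(signed == 1, year, NA), group,
      FUN = function(v) if (all(is.na(v))) NA else min(v, na.rm = TRUE))
}

# 1 in the k years immediately before signing, 0 otherwise;
# fewer than k years are flagged if the series starts late
pre_window <- function(year, sign_year, k) {
  gap <- sign_year - year
  as.integer(!is.na(gap) & gap >= 1 & gap <= k)
}

sy1 <- first_year(d$year, d$X1, d$country)
sy2 <- first_year(d$year, d$X2, d$country)

d$X1_pre4 <- pre_window(d$year, sy1, 4)
d$X2_pre4 <- pre_window(d$year, sy2, 4)
d$X2_pre5 <- pre_window(d$year, sy2, 5)

# Running count 1, 2, 3, ... inside the 5-year window, 0 elsewhere
d$X2_pre5_count <- ave(d$X2_pre5, d$country, FUN = cumsum) * d$X2_pre5
```

A post-agreement window could presumably be built the same way by testing 
gap <= 0 & gap > -k instead. 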

Later, I would also like to create post_4 and post_5 variables, but I think 
I’ll be able to figure it out once I know how to generate the pre/before 
variables. 

All suggestions are much appreciated! 



data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 
1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 
1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 
1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 
1985L, 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 
1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 
1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L), 
X1 = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), X2 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), X1_pre4 = c(1L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X2_pre4 = c(0L, 1L, 1L, 
1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X1_pre5 = c(1L, 
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
X1_pre5_count = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 
4L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 4L, 5L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-60L))



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Capturing positive and negative changes using R

2019-07-21 Thread Faradj Koliev
Thank you much, this was very helpful. Jim’s and Daniel’s code was spot on. 

Indeed, as Jeff Newmiller pointed out,  one of the problems was that I assumed 
that R could recognise decimals - I did as Richard O’Keefe suggested and the 
problem was gone. 

Thanks again!

Faradj 

> On 21 Jul 2019, at 03:36, Richard O'Keefe  wrote:
> 
> If "Faradj was expecting R to recognise the comma as the decimal"
> then it might be worth mentioning the 'dec = ","' argument of
> read.table and its friends.
> 
> 
> On Sun, 21 Jul 2019 at 12:48, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
> It is possible that part of the original problem was that Faradj was 
> expecting R to recognise the comma as the decimal and he read in that column 
> as a factor without realizing it. Factors are discrete, not continuous.
> 
> He should use the str() function to identify the column types in his data 
> frame.
> 
> On July 20, 2019 6:17:19 PM CDT, Jim Lemon <drjimle...@gmail.com> wrote:
> >Hi Faradj,
> >Rui's advice is correct, here's a way to do it. Note that I have
> >replaced the comma decimal points with full stops for my convenience:
> >
> >fkdf<-read.csv(text="Year,Country,X1,X2
> >1990,United States,0,0.22
> >1991,United States,0,0.22
> >1992,United States,0,0.22
> >1993,United States,0,0.22
> >1994,United States,0,0.22
> >1995,United States,0,0.22
> >1996,United States,0,0.22
> >1997,United States,0,0.5
> >1998,United States,0,0.5
> >1999,United States,0,0.5
> >2000,United States,0,0.5
> >2001,United States,0,0.5
> >2002,United States,2,NA
> >2003,United States,2,0.5
> >2004,United States,2,1
> >2005,United States,1,1
> >2006,United States,1,1
> >2007,United States,1,1
> >2008,United States,1,1
> >2009,United States,1,1
> >2010,United States,1,0.5
> >2011,United States,0,0.5
> >1990,Canada,1,1.5
> >1991,Canada,1,1.5
> >1992,Canada,1,NA
> >1993,Canada,1,1.5
> >1994,Canada,1,1.5
> >1995,Canada,1,1.5
> >1996,Canada,1,1.5
> >1997,Canada,1,1.5
> >1998,Canada,1,2
> >1999,Canada,2,2
> >2000,Canada,2,2
> >2001,Canada,2,2
> >2002,Canada,2,2
> >2003,Canada,1,2
> >2004,Canada,2,0.5
> >2005,Canada,1,0.5
> >2006,Canada,0,0.5
> >2007,Canada,1,0.5
> >2008,Canada,0,0.5
> >2009,Canada,1,0.5
> >2010,Canada,1,0.5
> >2011,Canada,0,1",
> >header=TRUE,stringsAsFactors=FALSE)
> >diffX1<-aggregate(fkdf$X1,by=list(fkdf[,2]),FUN=diff)
> >diffX2<-aggregate(fkdf$X2,by=list(fkdf[,2]),FUN=diff)
> >diffX1<-data.frame(diffX1$Group.1,diffX1$x)
> >diffyears<-unique(fkdf$Year)[-1]
> >names(diffX1)<-c("Country",diffyears)
> >diffX2<-data.frame(diffX2$Group.1,diffX2$x)
> >names(diffX2)<-c("Country",diffyears)
> >
> >Jim
> >
> >On Sun, Jul 21, 2019 at 5:34 AM Faradj Koliev <farad...@gmail.com> wrote:
> >>
> >> Dear R-users,
> >>
> >> I have a country-year data for 180 countries from 1970 to 2010. I’m
> >> interested in capturing positive and negative changes in some of the
> >> variables. Some of these variables are continuous (0,25, 0,33, 1, 1,5
> >> etc) others are ordered (0,1, 2).
> >>
> >> To do this, I use this code data$X1_change<- +c(FALSE,diff(data$X1))
> >>
> >> My data looks something like this (please see below).
> >>
> >> There’re some problems with this code:  (1) I can’t capture the
> >> smaller changes, say from 0,25 to 0,33 ( I get weird numbers). I would
> >> love to get the exact difference ( for ex: +1, -0,22, +4, -2 etc).  (2)
> >> It can’t make difference between countries. That is, it takes the
> >> difference between countries while it should only do this for each
> >> country ( for ex: when the US ends in 2011, and Canada starts, it
> >> counts this a difference but it shouldn’t, see below). (3) NAs, missing
> >> values, is neither a positive or negative change, although it does
> >> think that what comes after the NA is a difference.
> >>
> >>  So, I wonder if anyone here can help me to adjust this code. I
> >> appreciate all comments.
> >>
> >>
> >> Year  Country        X1  X2
> >> 1990  United States  0   0,22
> >> 1991  United States  0   0,22
> >> 1992  United States  0   0,22
> >> 1993  United States
> 

[R] Capturing positive and negative changes using R

2019-07-20 Thread Faradj Koliev
Dear R-users, 

I have country-year data for 180 countries from 1970 to 2010. I’m interested 
in capturing positive and negative changes in some of the variables. Some of 
these variables are continuous (0,25, 0,33, 1, 1,5 etc), others are ordered 
(0, 1, 2). 

To do this, I use this code: data$X1_change<- +c(FALSE,diff(data$X1))

My data looks something like this (please see below).

There are some problems with this code: (1) I can’t capture the smaller 
changes, say from 0,25 to 0,33 (I get weird numbers). I would love to get the 
exact difference (for ex: +1, -0,22, +4, -2 etc). (2) It can’t distinguish 
between countries. That is, it takes the difference across the boundary between 
countries, while it should only do this within each country (for ex: when the US 
ends in 2011 and Canada starts, it counts this as a difference but it shouldn’t, 
see below). (3) An NA (missing value) is neither a positive nor a negative change, 
but the code treats whatever comes after the NA as a difference. 

 So, I wonder if anyone here can help me to adjust this code. I appreciate all 
comments. 
 

Year  Country        X1  X2
1990  United States  0   0,22
1991  United States  0   0,22
1992  United States  0   0,22
1993  United States  0   0,22
1994  United States  0   0,22
1995  United States  0   0,22
1996  United States  0   0,22
1997  United States  0   0,5
1998  United States  0   0,5
1999  United States  0   0,5
2000  United States  0   0,5
2001  United States  0   0,5
2002  United States  2   NA
2003  United States  2   0,5
2004  United States  2   1
2005  United States  1   1
2006  United States  1   1
2007  United States  1   1
2008  United States  1   1
2009  United States  1   1
2010  United States  1   0,5
2011  United States  0   0,5
1990  Canada         1   1,5
1991  Canada         1   1,5
1992  Canada         1   NA
1993  Canada         1   1,5
1994  Canada         1   1,5
1995  Canada         1   1,5
1996  Canada         1   1,5
1997  Canada         1   1,5
1998  Canada         1   2
1999  Canada         2   2
2000  Canada         2   2
2001  Canada         2   2
2002  Canada         2   2
2003  Canada         1   2
2004  Canada         2   0,5
2005  Canada         1   0,5
2006  Canada         0   0,5
2007  Canada         1   0,5
2008  Canada         0   0,5
2009  Canada         1   0,5
2010  Canada         1   0,5
2011  Canada         0   1
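
For problems (2) and (3), one possible base R sketch is to take the difference 
within each country with ave(). This is an illustration only, separate from the 
aggregate() approach given in the replies: the small data frame df and the 
column name X2_change are hypothetical, and the decimals are assumed to have 
been read as numbers already (for example with dec = "," in read.table, or 
with read.csv2). 

```r
# Toy data (hypothetical), with decimals already converted to numeric
df <- data.frame(
  Country = rep(c("United States", "Canada"), each = 4),
  Year    = rep(1990:1993, times = 2),
  X2      = c(0.22, 0.5, NA, 0.5, 1.5, 1.5, 2, 0.5)
)

# Year-on-year change computed separately for each country.
# The first year of every country has no predecessor, so it gets NA,
# and any difference that involves a missing value is also NA.
df$X2_change <- ave(df$X2, df$Country,
                    FUN = function(v) c(NA, diff(v)))

df
```

Because the grouping is done inside ave(), the last US row and the first 
Canada row are never subtracted from each other. 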


Re: [R] How to generate a conditional dummy in R?

2018-05-29 Thread Faradj Koliev
Dear Jim, 

wow! It worked! Thanks a lot. 

I did as you suggested and it worked well with the real data, although it gave 
me this error: Error in if (!is.na(x$Y[i])) { : argument is of length zero. For 
some reason X1 had fewer observations than there are in the data. But it's 
not a big deal: I identified those cases and simply deleted them from the data (they 
were countries that only appear twice in the data, e.g. USSR, Yugoslavia etc). 
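
A likely cause of that error (an assumption, since the real data are not shown 
here) is the loop range 1:(nrows-2) in the function quoted below: for a country 
with only two rows it counts down from 1 to 0, and indexing with 0 returns a 
zero-length vector. A small sketch of the effect, using a hypothetical 
two-row series y: 

```r
n <- 2                                 # a country with only two rows
loop_idx <- 1:(n - 2)                  # c(1, 0): the loop also runs with i = 0
safe_idx <- seq_len(max(n - 2, 0))     # integer(0): the loop body is skipped

y <- c(1L, NA)
empty_test <- is.na(y[0])              # logical(0), which if() cannot evaluate
```

Replacing the range with seq_len(max(nrows - 2, 0)) would skip such short 
series instead of failing, although whether X should then stay 0 for those 
countries is a separate decision. 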

Best, 
Faradj 


> On 29 May 2018, at 02:15, Jim Lemon wrote:
> 
> Hi Faradj,
> What a problem! I think I have worked it out, but only because the
> result is the one you said you wanted.
> 
> # the sample data frame is named fkdf
> Y2Xby3<-function(x) {
>  nrows<-dim(x)[1]
>  X<-rep(0,nrows)
>  for(i in 1:(nrows-2)) {
>   if(!is.na(x$Y[i])) {
>    if(x$Y[i] == 1 && any(is.na(x$Y[(i+1):(i+2)]))) X[i]<-1
>    if(i > 1) {
>     if(X[i-1] == 1) X[i]<-0
>    }
>   }
>   else {
>    if(!is.na(x$Y[i+1])) {
>     if(x$Y[i+1] == 1 && is.na(x$Y[i+2]) && X[i] == 0)
>      X[i+1]<-1
>    }
>   }
>  }
>  return(X)
> }
> countries<-as.character(unique(fkdf$country))
> X1<-NULL
> for(country in countries)
>  X1<-c(X1,Y2Xby3(fkdf[fkdf$country == country,]))
> X1
>  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
> [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
> [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
>> fkdf$X
>  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
> [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
> [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
> 
> Jim
> 
> On Mon, May 28, 2018 at 8:43 PM, Faradj Koliev  wrote:
>> Hi everyone,
>> 
>> I am trying to generate a conditional dummy variable "X" with the following 
>> rules
>> 
>> set X=1 if Y is =1, two years prior to the NA.  [0,0,NA].
>> 
>> For example, if  the pattern for Y is 0,0,NA then the X variable is =0 for 
>> all  the two years prior to the NA. If the pattern for Y is 0,1,NA or 1,0,NA 
>> then the X =1 . To be clear, if 1,1,NA then the X=1 that  first specific 
>> year, it should only count once (X=1), not twice.
>> 
>> The code that I have now is not complete and I would appreciate some advice 
>> here. This is the code:
>> dat2 <- dat1 %>%
>>  group_by(country) %>%
>>  group_by(grp = cumsum(is.na(lag(Y))), add = TRUE) %>%
>>  mutate(first_year_at_1 = match(1, Y) * any(is.na(Y)) * any(tail(Y, 3) == 
>> 1L),
>> X = {x <- integer(length(Y)) ; x[first_year_at_1] <- 1L ; x}) %>%
>>  ungroup()
>> 
>> It doesn’t really generate what I described above. Any help here would be 
>> much appreciated.
>> 
>> Below you can see my sample data with the desired outcome ”X” dummy in it.
>> 
>> Thank you!
>> 
>>> dput(data)
>> structure(list(year = c(1991L, 1992L, 1993L, 1994L, 1995L, 1996L,
>> 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L,
>> 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L,
>> 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L,
>> 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L,
>> 2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L,
>> 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
>> 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L,
>> 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L,
>> 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L,
>> 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
>> 1999L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
>> 2007L, 2008L, 2009L, 2010L, 2011L), country = structure(c(1L,
>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>> 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L,
>> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
>> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
>> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Canada",
>> "Cuba", "Dominican Republic", "Haiti", "Jamaica"), class = "factor"),
>>Y = c(1L, NA, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L,
>>1L, NA, 1L, NA, 1L, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA,
>>1L, NA, 1L, NA, 1L, 1L, 1L,

[R] How to generate a conditional dummy in R?

2018-05-28 Thread Faradj Koliev
Hi everyone, 

I am trying to generate a conditional dummy variable "X" with the following 
rule:

 set X=1 if Y is =1 in the two years prior to an NA.  [0,0,NA]. 

For example, if the pattern for Y is 0,0,NA then X is =0 for both of 
the two years prior to the NA. If the pattern for Y is 0,1,NA or 1,0,NA then 
X =1. To be clear, if the pattern is 1,1,NA then X=1 only in the first of those 
two years; it should only count once (X=1), not twice. 
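
Read literally, the rule above can be sketched for a single country's series 
as follows. The function name flag_before_na() is hypothetical, and the sketch 
treats every NA independently, so overlapping windows (for example two 
consecutive NAs) may need extra handling to match the desired X in the sample 
data below. 

```r
# For each NA in y, look at the (up to) two preceding positions and
# flag the first one where y equals 1.
flag_before_na <- function(y) {
  x <- integer(length(y))
  for (k in which(is.na(y))) {
    # positions k-2 and k-1, truncated at the start of the series
    window <- seq(max(k - 2, 1), length.out = min(2, k - 1))
    hit <- window[which(y[window] == 1)]   # which() drops NA comparisons
    if (length(hit) > 0) x[hit[1]] <- 1L
  }
  x
}

r00 <- flag_before_na(c(0, 0, NA))
r01 <- flag_before_na(c(0, 1, NA))
r10 <- flag_before_na(c(1, 0, NA))
r11 <- flag_before_na(c(1, 1, NA))
```

To respect country boundaries, the function would have to be applied to each 
country separately, for instance through ave() or a split by country. 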

The code that I have now is not complete and I would appreciate some advice 
here. This is the code: 
dat2 <- dat1 %>% 
  group_by(country) %>% 
  group_by(grp = cumsum(is.na(lag(Y))), add = TRUE) %>% 
  mutate(first_year_at_1 = match(1, Y) * any(is.na(Y)) * any(tail(Y, 3) == 1L), 
 X = {x <- integer(length(Y)) ; x[first_year_at_1] <- 1L ; x}) %>% 
  ungroup()

It doesn’t really generate what I described above. Any help here would be much 
appreciated. 

Below you can see my sample data with the desired outcome "X" dummy in it.

Thank you! 

> dput(data)
structure(list(year = c(1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 
1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 
1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 
2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 
2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 
1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 
2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L, 
1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 
2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 
1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 
1999L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 
2007L, 2008L, 2009L, 2010L, 2011L), country = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Canada", 
"Cuba", "Dominican Republic", "Haiti", "Jamaica"), class = "factor"), 
Y = c(1L, NA, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L, 
1L, NA, 1L, NA, 1L, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA, 
1L, NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 
NA, 1L, NA, 1L, 0L, 0L, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 1L, NA, 0L, 1L, 1L, NA, 0L, 1L, NA, 1L, NA, 1L, NA, 1L, 
NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L, 
0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, NA, 0L, 1L, 1L, 1L, 
NA, 1L, NA, 0L, 1L, 1L, NA), X = c(1L, 0L, 0L, 1L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 
0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 
0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L)), .Names = 
c("year", 
"country", "Y", "X"), class = "data.frame", row.names = c(NA, 
-110L))





Re: [R] Log plus one transformation in R

2016-12-12 Thread Faradj Koliev
Many thanks! 

log1p(x) worked just fine.

Best,
Faradj

Sent from my iPhone

> On 12 Dec 2016, at 22:54, peter dalgaard <pda...@gmail.com> wrote:
> 
> And, for crying out loud... just try it with x = 1.234e-16 or so. One would 
> think that the hint |x| << 1 was obvious enough.
> 
> -pd
> 
>> On 12 Dec 2016, at 18:26 , William Dunlap via R-help <r-help@r-project.org> 
>> wrote:
>> 
>> Print more digits of the quotient or subtract one from it and you will see
>> the difference:
>> 
>>> log1p(0.01)/log(0.01+1) - 1
>> [1] 8.22666379463044e-11
>> 
>> 
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>> 
>> On Mon, Dec 12, 2016 at 8:53 AM, John Sorkin <jsor...@grecc.umaryland.edu>
>> wrote:
>> 
>>> At the risk of being flamed . . .
>>> What is the difference between log1p(x) and log(x+1)?
>>> The two methods appear to give the same results:
>>>> log1p(0.01)/log(0.01+1)
>>> [1] 1
>>> John
>>> 
>>> 
>>> John David Sorkin M.D., Ph.D.
>>> Professor of Medicine
>>> Chief, Biostatistics and Informatics
>>> University of Maryland School of Medicine Division of Gerontology and
>>> Geriatric Medicine
>>> Baltimore VA Medical Center
>>> 10 North Greene Street
>>> GRECC (BT/18/GR)
>>> Baltimore, MD 21201-1524
>>> (Phone) 410-605-7119
>>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>> 
>>>>>> William Dunlap via R-help <r-help@r-project.org> 12/12/16 11:38 AM >>>
>>> log1p(x), in the base package computes log(1+x) accurately for small x (and
>>> large).
>>> 
>>> E.g.,
>>>> options(digits=16)
>>>> base::log1p(1e-14)
>>> [1] 9.95e-15
>>>> base::log1p(1e-14) - base::log(1+1e-14)
>>> [1] 7.992778373591124e-18
>>>> as.numeric(log(Rmpfr::mpfr(1,precBits=1000) + Rmpfr::mpfr(1e-14,
>>> precBits=1000))) - log1p(1e-14)
>>> [1] 0
>>> 
>>> 
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>> 
>>>> On Mon, Dec 12, 2016 at 8:23 AM, Faradj Koliev <farad...@gmail.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> How do I perform log(x+1) in R?
>>>> 
>>>> log1p_trans() from the package "scales" doesn’t seem to work for me.
>>>> 
>>>> Best,
>>>> Faradj
>>> 
>> 
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 


[R] Log plus one transformation in R

2016-12-12 Thread Faradj Koliev
Hi all, 

How do I perform log(x+1) in R? 

log1p_trans() from the package "scales" doesn’t seem to work for me. 
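
For reference, the base function log1p() discussed in the replies computes 
log(1 + x) directly. The sketch below only illustrates why it can differ from 
log(1 + x) for very small x; the variable names are hypothetical. As far as I 
know, log1p_trans() in scales builds a transformation object for plot scales 
rather than a function to apply to a vector, which may be why it did not work 
as expected. 

```r
x <- 1e-17

# 1 + x rounds to exactly 1 in double precision, so the small value is lost
naive <- log(1 + x)

# log1p() works on x itself and keeps the result close to x
accurate <- log1p(x)

c(naive = naive, accurate = accurate)
```

For ordinary magnitudes such as x = 0.01 both forms agree to many digits, so 
the choice mainly matters when |x| is much smaller than 1. 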

Best, 
Faradj

[R] How to plot predicted probabilities with 95% CIs

2016-10-03 Thread Faradj Koliev
Dear all, 

I need a little help with plotting predicted probabilities (values). Consider 
the following example

**
data("mtcars") 

mfit = lm(mpg ~ vs + disp + cyl, data=mtcars)

newcar=data.frame(vs=c(0,1), disp=230, cyl=6.188)

Pmodel<-predict(mfit, newcar) 
**

I want to plot the effect of "vs" (0 and 1) when all other variables are held 
constant (mean).  

To do this I run this code below:
**
plot(1:2, Pmodel$estimates[1:2,1],ylim=c(0,1),pch=19, xlim=c(.5,2.5), xlab="X", 
ylab="Predicted value of Y", xaxt="n", main= "Predicted value of Y with 95% CIs")
arrows(1:2, (Pmodel$estimates[1:2,1]-1.96*Pmodel$estimates[1:2,2]), 1:2, 
(Pmodel$estimates[1:2,1]+1.96*Pmodel$estimates[1:2,2]), length=0.05, angle=90, 
code=3)
axis(1,at=c(1,2), labels=c("Yes","No"))
**
What am I doing wrong here? Thanks! 
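
One thing to check in the code above: for an lm fit, predict() returns a plain 
numeric vector by default, so there is no $estimates component to index. A 
minimal sketch that asks predict() for confidence limits directly is given 
below. The object name pred and the axis labels are my own choices, and since 
the outcome is mpg the y-axis shows predicted values rather than 
probabilities, so ylim = c(0, 1) would hide the points. 

```r
data("mtcars")

mfit   <- lm(mpg ~ vs + disp + cyl, data = mtcars)
newcar <- data.frame(vs = c(0, 1), disp = 230, cyl = 6.188)

# interval = "confidence" returns a matrix with columns fit, lwr and upr
pred <- predict(mfit, newcar, interval = "confidence", level = 0.95)

plot(1:2, pred[, "fit"], pch = 19,
     xlim = c(0.5, 2.5), ylim = range(pred), xaxt = "n",
     xlab = "vs", ylab = "Predicted mpg",
     main = "Predicted value of Y with 95% CIs")
# Error bars from the lower to the upper confidence limit
arrows(1:2, pred[, "lwr"], 1:2, pred[, "upr"],
       length = 0.05, angle = 90, code = 3)
axis(1, at = 1:2, labels = c("vs = 0", "vs = 1"))
```

The limits here come from the t distribution used by predict.lm, which is 
slightly wider than the 1.96 multiplier for a sample of this size. 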

Best, 
Faradj 

Re: [R] Hickman models with two binary dependent variables in R

2016-08-27 Thread Faradj Koliev
Dear Arne, 

Many thanks for this, 

It actually worked with heckit() command as well, do I need to use selection()? 

Also, I would be really grateful if you can suggest a package that would allow 
for estimation of heckman models with two ordered variables (0-1-2). Can 
sampleSelection handle this? 

Warm regards, 
Faradj 

> On 27 Aug 2016, at 10:39, Arne Henningsen <arne.henning...@gmail.com> wrote:
> 
> See also:
> 
> http://r-forge.r-project.org/forum/forum.php?thread_id=31866&forum_id=844&group_id=256
> 
> 
> 
> On 26 August 2016 at 16:11, PIKAL Petr <petr.pi...@precheza.cz> wrote:
>> Hi
>> 
>> See in line
>> 
>>> -----Original Message-----
>>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Faradj
>>> Koliev
>>> Sent: Thursday, August 25, 2016 12:32 PM
>>> To: r-help@r-project.org
>>> Subject: [R] Hickman models with two binary dependent variables in R
>>> 
>>> Hi everyone,
>>> 
>>> How do I run Heckman models in R with two binary dependent variables?
>>> 
>>> sampleSelection package in R works with standard heckman models ( binary
>>> DV for the selection equation and continuous DV for the outcome equation).
>>> In my case dependent variables are both binary (actually ordered but I 
>>> didn’t
>>> find anything on that)
>> 
>> From help page
>> 
>> The endogenous variable of the argument 'selection' must have exactly two 
>> levels (e.g. 'FALSE' and 'TRUE', or '0' and '1'). By default the levels are 
>> sorted in increasing order ('FALSE' is before 'TRUE', and '0' is before 
>> '1'). This also applies for the binary outcome equation. For 
>> continuous-outcome cases, the dependent variable(s) should be numeric.
>> 
>> seems to me that both equations can have binary values.
>> 
>> 
>>> 
>>> So using sampleSelection package one could do this by running:
>>> 
>>> SelectionEquation <- binaryDV1 ~ x1+x2+x3+x4
>>> 
>>> OutcomeEquation <-  binaryDV2~o7+x1+x4+x5
>>> 
>>> HeckmanModel <- heckit(SelectionEquation,OutcomeEquation,
>>> data=mydata, method="2step")
>>> 
>>> 
>>> The problem is that heckit() doesn’t work here. I think in STATA one could
>> 
>> What does it mean. Be more specific.
>> 
>> And provide some data (preferably by dput).
>> Or at least result of str(yourdata) to show that they are appropriate to the 
>> function.
>> And do not post in HTML.
>> 
>> Cheers
>> Petr
>> 
>>> use heckprob command for this. Anyone who knows more than me?
>>> 
>>> Thanks!
>>> 
>>> 
>>> 
>>> 
>>> 

[R] Hickman models with two binary dependent variables in R

2016-08-25 Thread Faradj Koliev
Hi everyone, 

How do I run Heckman models in R with two binary dependent variables? 

sampleSelection package in R works with standard heckman models ( binary DV for 
the selection equation and continuous DV for the outcome equation). In my case 
dependent variables are both binary (actually ordered but I didn’t find 
anything on that)

So using sampleSelection package one could do this by running: 

SelectionEquation <- binaryDV1 ~ x1+x2+x3+x4

OutcomeEquation <-  binaryDV2~o7+x1+x4+x5

HeckmanModel <- heckit(SelectionEquation,OutcomeEquation, data=mydata, 
method="2step")


The problem is that heckit() doesn't work here. I think in Stata one could use 
the heckprob command for this. Does anyone know more than I do? 

Thanks! 





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to create marginal effects tables in R?

2016-08-08 Thread Faradj Koliev
Hi everyone, 

I have three ordered regression models where the ordered dependent variable 
ranges from 0 to 2. What I want to do is create marginal effects tables (not a 
plot) at each level (0, 1, and 2) for all three models. So, three tables with 
each showing the marginal effects at level 0, 1, and 2. 


library(MASS)       # polr()
library(erer)       # ocME()
library(stargazer)  # stargazer()

## create random data similar to my dataset
set.seed(987)
mydata <- data.frame(
  x1= sample(c(0, 1, 2), 100, replace = TRUE),
  x2= sample(c(0, 1, 2, 3, 4), 100, replace = TRUE),
  x3= sample(c(0, 1, 2, 3, 5), 100, replace = TRUE),
  x4= sample(c(1:100), 100, replace = TRUE),
  x5= sample(c(10:1000), 100, replace = TRUE),
  Y1 = sample(c(0, 1, 2), 100, replace = TRUE)
)
head(mydata)

## make it a factor
mydata$Y1 <- as.factor(mydata$Y1)

## My models
M1<- polr(Y1 ~x1+x2+x3+x4, data=mydata, Hess = TRUE,  method="logistic")

M2<- polr(Y1 ~x2+x3+x4+x5, data=mydata, Hess = TRUE,  method="logistic")

M3<- polr(Y1 ~x1+x2+x3+x4+x5, data=mydata, Hess = TRUE,  method="logistic")

## Calculate marginal effects using the erer package

M1ME<- ocME(M1)

M2ME <- ocME(M2)

M3ME <- ocME(M3)


Usually I would use the stargazer package to create proper tables, for example: 
stargazer(M1, M2, M3, type = "text")

However, the output from ocME() does not produce the same type of table, 
nor can I generate tables at each level with: 
stargazer(M1ME$out, M2ME$out, M3ME$out, type = "text")

Do you have any suggestion as to how to generate these types of tables?
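
One possible route to level-by-level tables is to compute the marginal effects 
at the means directly from each polr fit and bind them into one matrix per 
outcome level. The sketch below is hand-rolled and is not what ocME() does 
internally; me_at_means() and level_table() are hypothetical helper names, the 
data are simulated, and no standard errors are produced.

```r
library(MASS)  # polr(); MASS ships with R as a recommended package

set.seed(987)
n   <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# Simulated ordered outcome with levels 0, 1, 2 (illustrative only)
latent <- 0.8 * sim$x1 - 0.5 * sim$x2 + rlogis(n)
sim$Y1 <- cut(latent, c(-Inf, -0.5, 1, Inf), labels = c("0", "1", "2"),
              ordered_result = TRUE)

fits <- list(
  M1 = polr(Y1 ~ x1 + x2,      data = sim, Hess = TRUE, method = "logistic"),
  M2 = polr(Y1 ~ x2 + x3,      data = sim, Hess = TRUE, method = "logistic"),
  M3 = polr(Y1 ~ x1 + x2 + x3, data = sim, Hess = TRUE, method = "logistic")
)

# Marginal effects at the means for a logistic polr fit.
# Rows are predictors, columns are outcome levels.
me_at_means <- function(fit) {
  b    <- coef(fit)
  X    <- model.matrix(terms(fit), fit$model)[, -1, drop = FALSE]  # no intercept
  xb   <- sum(colMeans(X)[names(b)] * b)       # linear predictor at the means
  dens <- dlogis(c(-Inf, fit$zeta, Inf) - xb)  # density at each threshold
  # dP(Y = j)/dx_k = -beta_k * (f(zeta_j - xb) - f(zeta_{j-1} - xb))
  me <- -outer(unname(b), unname(diff(dens)))
  dimnames(me) <- list(names(b), fit$lev)
  me
}

# One table per outcome level: predictors in rows, models in columns.
level_table <- function(fits, level) {
  vars <- unique(unlist(lapply(fits, function(f) names(coef(f)))))
  tab  <- matrix(NA_real_, length(vars), length(fits),
                 dimnames = list(vars, names(fits)))
  for (m in names(fits)) {
    me <- me_at_means(fits[[m]])
    tab[rownames(me), m] <- me[, level]
  }
  tab
}

tables <- lapply(c("0", "1", "2"), function(lv) round(level_table(fits, lv), 4))
names(tables) <- paste0("level_", c("0", "1", "2"))
tables
```

Each element of `tables` is a plain matrix with NA where a model omits a 
predictor, so it can be printed or exported like any other matrix. Within a 
model the effects of one predictor sum to zero across the three levels, which 
is a quick sanity check.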




Re: [R] Likelihood ratio test in polr (MASS)

2016-07-27 Thread Faradj Koliev
Dear Achim Zeileis, dear John Fox,

Thank you for your time! Both worked well. 

lrtest(Restrict, Full)

  #Df  LogLik Df  Chisq Pr(>Chisq)
1  27 -882.00 
2  28 -866.39  1 31.212  2.313e-08 ***


anova(Restrict, Full)

  Resid. df Resid. Dev   Test    Df LR stat.      Pr(Chi)
1      2121   1763.999
2      2120   1732.787 1 vs 2     1 31.21204 2.313266e-08



Both seem to reject the null hypothesis. Thanks again! 
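
For reference, the statistic both commands report is twice the difference in 
log-likelihoods (here 2 * (-866.39 - (-882.00)), about 31.2), compared against 
a chi-squared distribution with as many degrees of freedom as parameters 
dropped. A minimal sketch on simulated data, with made-up variable names, that 
reproduces it by hand:

```r
library(MASS)  # polr()

set.seed(1)
n <- 500
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
# Simulated three-level ordered outcome (illustrative only)
d$Y <- cut(0.7 * d$X1 + 0.4 * d$X2 + rlogis(n), c(-Inf, -1, 1, Inf),
           labels = c("low", "mid", "high"), ordered_result = TRUE)

Full     <- polr(Y ~ X1 + X2, data = d, Hess = TRUE, method = "logistic")
Restrict <- polr(Y ~ X1,      data = d, Hess = TRUE, method = "logistic")

ll_full <- logLik(Full)
ll_res  <- logLik(Restrict)

lr_stat <- as.numeric(2 * (ll_full - ll_res))           # LR statistic
lr_df   <- attr(ll_full, "df") - attr(ll_res, "df")     # parameters dropped
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(LR = lr_stat, df = lr_df, p = p_value)
```

The same number is the difference in residual deviance between the two fits, 
which is what the anova() output above shows in its "LR stat." column.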

Best, 
Faradj








> 27 jul 2016 kl. 13:35 skrev Fox, John <j...@mcmaster.ca>:
> 
> Dear Faradj Koliev,
> 
> There is an anova() method for "polr" objects that computes LR chisquare 
> tests for nested models, so a short answer to your question is anova(Full, 
> Restricted).
> 
> The question, however, seems to reflect some misunderstandings. First aov() 
> fits linear analysis-of-variance models, which assume normally distributed 
> errors. These are different from the ordinal regression models, such as the 
> proportional-odds model, fit by polr(). For the former, F-tests *are* LR 
> tests; for the latter, F-tests aren't appropriate.
> 
> I hope this helps,
> John
> 
> -
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
> 
> 
> 
> 
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Faradj Koliev
>> Sent: July 27, 2016 4:50 AM
>> To: r-help@r-project.org
>> Subject: [R] Likelihood ratio test in polr (MASS)
>> 
>> Dear all,
>> 
>> A quick question: Let’s say I have a full and a restricted model that looks
>> something like this:
>> 
>> # ordered logistic regression
>> Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE, method="logistic")
>> 
>> # ordered logistic regression
>> Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE, method="logistic")
>> 
>> I wanted to conduct the F-test (using the aov command) in order to determine
>> whether the information from the X4 variable statistically improves our
>> understanding of Y.
>> However, I've been told that the likelihood ratio test is a better
>> alternative. So, I would like to conduct the LR test. In the rms package this
>> is easy, lrtest(Full, Restricted); I'm just curious how to perform the same
>> using polr. Thanks!

[R] Likelihood ratio test in polr (MASS)

2016-07-27 Thread Faradj Koliev
Dear all, 

A quick question: Let’s say I have a full and a restricted model that looks 
something like this: 

# ordered logistic regression 
Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE, method="logistic")

# ordered logistic regression 
Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE, method="logistic")

I wanted to conduct the F-test (using the aov command) in order to determine 
whether the information from the X4 variable statistically improves our 
understanding of Y. 
However, I've been told that the likelihood ratio test is a better alternative. 
So, I would like to conduct the LR test. In the rms package this is easy, 
lrtest(Full, Restricted); I'm just curious how to perform the same using polr. 
Thanks!

Re: [R] How to plot marginal effects (MEM) in R?

2016-07-22 Thread Faradj Koliev
Dear David Winsemius, 

Thank you!  

The sample makes no sense, I know; the real data is too big. I only want to 
understand how to plot marginal effects, to visualize them properly. 

Best,


> 22 juli 2016 kl. 08:35 skrev David Winsemius <dwinsem...@comcast.net>:
> 
>> 
>> On Jul 21, 2016, at 2:22 PM, Faradj Koliev <farad...@gmail.com> wrote:
>> 
>> Dear all, 
>> 
>> I have two logistic regression models:
>> 
>> 
>>  • model <- glm(Y ~ X1+X2+X3+X4, data = data, family = "binomial")
>> 
>> 
>> 
>>  • modelInteraction <- glm(Y ~ X1+X2+X3+X4+X1*X4, data = data, family = 
>> "binomial")
>> 
>> To calculate the marginal effects (MEM approach) for these models, I used 
>> the `mfx` package:
>> 
>> 
>>  • a<- logitmfx(model, data=data, atmean=TRUE)
>> 
>> 
>> 
>>   •b<- logitmfx(modelInteraction, data=data, atmean=TRUE)
>> 
>> 
>> What I want to do now is 1) plot all the results for "model" and 2) show the 
>> result just for two variables: X1 and X2. 
>> 3) I also want to plot the interaction term in "modelInteraction".
> 
> There is no longer a single "effect" for X1 in modelInteraction, in contrast 
> to X2, for which there might still be a single "effect". There can only be 
> predictions for combined situations with particular combinations of values 
> for X1 and X4.
> 
>> model
> 
> Call:  glm(formula = Y ~ X1 + X2 + X3 + X4, family = "binomial", data = data)
> 
> Coefficients:
> (Intercept)           X1           X2           X3           X4  
>     -0.3601       1.3353       0.1056       0.2898      -0.3705  
> 
> Degrees of Freedom: 68 Total (i.e. Null);  64 Residual
> Null Deviance:     66.78 
> Residual Deviance: 62.27  AIC: 72.27
> 
> 
>> modelInteraction
> 
> Call:  glm(formula = Y ~ X1 + X2 + X3 + X4 + X1 * X4, family = "binomial", 
>     data = data)
> 
> Coefficients:
> (Intercept)           X1           X2           X3           X4        X1:X4  
>     90.0158     -90.0747       0.1183       0.3064     -15.3688      15.1593  
> 
> Degrees of Freedom: 68 Total (i.e. Null);  63 Residual
> Null Deviance:     66.78 
> Residual Deviance: 61.49  AIC: 73.49
> 
> Notice that a naive attempt to plot an X1  "effect" in modelInteraction might 
> pick the -90.07 value which would then ignore both the much larger Intercept 
> value and also ignore the fact that the interaction term has now split the X4 
> (and X1) "effects" into multiple pieces.
> 
> You need to interpret the effects of X1 in the context of a specification of 
> a particular X4 value and not forget that the Intercept should not be 
> ignored. It appears to me that the estimates of the mfx package are 
> essentially meaningless with the problem you have thrown at it.
> 
>> a
> Call:
> logitmfx(formula = model, data = data, atmean = TRUE)
> 
> Marginal Effects:
>   dF/dx Std. Err.   z   P>|z|  
> X1  0.147532  0.087865  1.6791 0.09314 .
> X2  0.015085  0.193888  0.0778 0.93798  
> X3  0.040309  0.063324  0.6366 0.52441  
> X4 -0.050393  0.092947 -0.5422 0.58770  
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> dF/dx is for discrete change for the following variables:
> 
> [1] "X1" "X2" "X4"
>> b
> Call:
> logitmfx(formula = modelInteraction, data = data, atmean = TRUE)
> 
> Marginal Effects:
>             dF/dx   Std. Err.         z  P>|z|    
> X1    -1.0000e+00  1.2121e-07 -8.25e+06 <2e-16 ***
> X2     6.5595e-03  8.1616e-01  8.00e-03 0.9936    
> X3     1.6312e-02  2.0326e+00  8.00e-03 0.9936    
> X4    -9.6831e-01  1.5806e+01 -6.13e-02 0.9511    
> X1:X4  8.0703e-01  1.4572e+01  5.54e-02 0.9558    
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> dF/dx is for discrete change for the following variables:
> 
> [1] "X1" "X2" "X4"
> 
> I see no sensible interpretation of the phrase "X1 effect" in the comparison 
> tables above. The "p-value" in the second table appears to be nonsense 
> induced by throwing a model formulation that was not anticipated. There is a 
> negligible improvement in the glm fits:
> 
>> anova(model,modelInteraction)
> Analysis of Deviance Table
> 
> Model 1: Y ~ X1 + X2 + X3 + X4
> Model 2: Y ~ X1 + X2 + X3 + X4 + X1 * X4
>   Resid. Df Resid. Dev Df Deviance
> 1        64     62.274            
> 2        63     61.495  1  0.77908
> 
> 
> So the notion that the "X1 

[R] How to plot marginal effects (MEM) in R?

2016-07-21 Thread Faradj Koliev
Dear all, 

I have two logistic regression models:


   • model <- glm(Y ~ X1+X2+X3+X4, data = data, family = "binomial")



   • modelInteraction <- glm(Y ~ X1+X2+X3+X4+X1*X4, data = data, family = 
"binomial")

To calculate the marginal effects (MEM approach) for these models, I used the 
`mfx` package:


   • a<- logitmfx(model, data=data, atmean=TRUE)



   • b<- logitmfx(modelInteraction, data=data, atmean=TRUE)


What I want to do now is 1) plot all the results for "model" and 2) show the 
result just for two variables: X1 and X2. 
3) I also want to plot the interaction term in "modelInteraction".


I have been looking around for solutions but haven't been able to find any. 
I would appreciate any suggestions. 

A reproducible sample: 

> dput(data)
structure(list(Y = c(0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X1 = c(1L, 0L, 1L, 
0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 
1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 
0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 0L), X2 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(0L, 0L, 0L, 0L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 2L, 2L, 3L, 4L, 5L, 0L, 0L, 
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L
), X4 = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L)), .Names = c("Y", "X1", "X2", 
"X3", "X4"), row.names = c(NA, -69L), class = "data.frame")
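
For the interaction model, one way to get something plottable is to predict 
the probability of Y = 1 for each combination of X1 and X4 while holding the 
other covariates at their means, and then plot those predictions. The sketch 
below uses simulated data with the same variable names rather than the dput() 
above, so the numbers are illustrative only; it uses predict() on a glm fit and 
does not rely on the mfx package.

```r
set.seed(42)
n <- 400
d <- data.frame(
  X1 = rbinom(n, 1, 0.7),
  X2 = rbinom(n, 1, 0.2),
  X3 = rpois(n, 0.5),
  X4 = sample(6:7, n, replace = TRUE)
)
# Simulated outcome with a true X1:X4 interaction (made-up coefficients)
eta <- -1 + 0.8 * d$X1 + 0.2 * d$X2 - 0.3 * (d$X4 - 6) + 0.6 * d$X1 * (d$X4 - 6)
d$Y <- rbinom(n, 1, plogis(eta))

fit <- glm(Y ~ X1 * X4 + X2 + X3, data = d, family = binomial)

# All X1 x X4 combinations, other covariates held at their sample means
grid    <- expand.grid(X1 = 0:1, X4 = 6:7)
grid$X2 <- mean(d$X2)
grid$X3 <- mean(d$X3)
grid$p  <- predict(fit, newdata = grid, type = "response")
grid

# One line per X1 value, X4 on the horizontal axis
interaction.plot(x.factor = grid$X4, trace.factor = grid$X1, response = grid$p,
                 type = "b", trace.label = "X1", xlab = "X4",
                 ylab = "Predicted P(Y = 1)")

# Discrete change in probability from X1 = 0 to X1 = 1, separately per X4 value
effect_X1 <- with(grid, p[X1 == 1] - p[X1 == 0])
names(effect_X1) <- paste0("X4=", 6:7)
effect_X1
```

The last vector makes the point from the reply concrete: with an interaction 
there is one X1 "effect" per value of X4, not a single number.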





[R] How to generate “aggregated” time lags?

2016-07-14 Thread Faradj Koliev
Dear all, 

I hope you’re enjoying your summer! 


I've been asked to "aggregate" my time lags from a simple 1-year lag to a 1-3 
year lag. This could be done, I've been told, by simply taking the sum or mean 
of lags 1, 2, and 3. 

I need your help here. How can I generate this "aggregated" time lag variable? 

This is how far I've come:

First, my (example) logistic model: 

print( model<- lrm(Y~X+A2, data=mydata))

Second, I created time lags 1,2,3, for the covariate X in the model. 

# Lag() comes from the Hmisc package
mydata$lag1X <- Lag(mydata$X, +1) 
mydata$lag2X <- Lag(mydata$X, +2) 
mydata$lag3X <- Lag(mydata$X, +3) 

I'm not sure how to go further. How do I take the sum or mean of these lag 
variables? How do I create an "aggregated" time lag 1-3? All suggestions are 
very welcome!
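
A minimal base-R sketch of one way to do this: build the three lags within 
each subject, then take rowSums() or rowMeans() over the lag columns. The 
helper lag_k() and the new column names are made up for the example, and only 
subjects A and B from the data below are used. Note that the lag columns in the 
posted data appear to carry values across subjects (the first row of subject B 
has lag1X = 4, which is subject A's last X), which is why the lags here are 
computed by group.

```r
mydata <- data.frame(
  Subject = rep(c("A", "B"), times = c(6, 4)),
  Year    = c(1990:1995, 1990:1993),
  X       = c(1, 1, 2, 3, 4, 4, 0, 1, 1, 2)
)

# Rows must be in time order within each subject before lagging
mydata <- mydata[order(mydata$Subject, mydata$Year), ]

# Shift a vector down by k positions, padding the start with NA
lag_k <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

# Lags 1 to 3 of X, computed separately for each subject
for (k in 1:3) {
  mydata[[paste0("lag", k, "X")]] <-
    ave(mydata$X, mydata$Subject, FUN = function(x) lag_k(x, k))
}

# "Aggregated" lag 1-3: sum and mean across the three lag columns
lags <- mydata[, c("lag1X", "lag2X", "lag3X")]
mydata$lag13X_sum  <- rowSums(lags)
mydata$lag13X_mean <- rowMeans(lags)

mydata
```

With the defaults, a row gets NA unless all three lags are available; passing 
na.rm = TRUE to rowSums() or rowMeans() would instead use whatever lags exist, 
which is a modelling choice rather than a technical one.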

A reproducible example (with lagged variables included )

dput(mydata)
structure(list(Subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A", 
"B", "C", "D"), class = "factor"), Year = c(1990L, 1991L, 1992L, 
1993L, 1994L, 1995L, 1990L, 1991L, 1992L, 1993L, 1991L, 1992L, 
1993L, 1994L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L
), X = c(1L, 1L, 2L, 3L, 4L, 4L, 0L, 1L, 1L, 2L, 1L, 2L, 3L, 
3L, 1L, 2L, 3L, 4L, 5L, 5L, 6L), A1 = c(1L, 0L, 1L, 1L, 1L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L), 
Y = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L), A2 = c(0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 
0L), lag1X = c(NA, 1L, 1L, 2L, 3L, 4L, 4L, 0L, 1L, 1L, 2L, 
1L, 2L, 3L, 3L, 1L, 2L, 3L, 4L, 5L, 5L), lag2X = c(NA, NA, 
1L, 1L, 2L, 3L, 4L, 4L, 0L, 1L, 1L, 2L, 1L, 2L, 3L, 3L, 1L, 
2L, 3L, 4L, 5L), lag3X = c(NA, NA, NA, 1L, 1L, 2L, 3L, 4L, 
4L, 0L, 1L, 1L, 2L, 1L, 2L, 3L, 3L, 1L, 2L, 3L, 4L)), .Names = c("Subject", 
"Year", "X", "A1", "Y", "A2", "lag1X", "lag2X", "lag3X"), row.names = c(NA, 
-21L), class = "data.frame")



Re: [R] Logistic regression and robust standard errors

2016-07-01 Thread Faradj Koliev
Dear Achim Zeileis, 

Many thanks for your quick and informative answer. 

I'm sure that vcovCL() should work; however, I am running into some problems. 


> coeftest(model, vcov=vcovCL(model, cluster=mydata$ID))

First I got this error: 

Error in vcovCL(model, cluster = mydata$ID) : 
  length of 'cluster' does not match number of observations

After checking the observations I got this error: 

Error in vcovCL(model, cluster = mydata$ID) : object 'tets' not found
Called from: vcovCL(model, cluster = mydata$ID)
Browse[1]> 

What can I do to fix it? What am I doing wrong now? 
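
One common cause of the first error (an assumption here, since the data are 
not shown) is that polr() silently dropped rows with missing values, so 
mydata$ID is longer than the estimation sample. A minimal sketch on simulated 
data showing how the cluster vector can be aligned with the rows actually used 
in the fit:

```r
library(MASS)  # polr()

set.seed(1)
d <- data.frame(ID = rep(1:20, each = 5), x1 = rnorm(100), x2 = rnorm(100))
d$y <- cut(d$x1 + rlogis(100), c(-Inf, -1, 1, Inf),
           labels = c("low", "mid", "high"), ordered_result = TRUE)
d$x2[c(3, 40)] <- NA            # two incomplete rows

fit <- polr(y ~ x1 + x2, data = d, Hess = TRUE)

n_used <- nrow(fit$fitted.values)
c(rows_in_data = nrow(d), rows_in_fit = n_used)   # 100 versus 98

# The model frame keeps the original row names of the rows that were used
used    <- rownames(fit$model)
cluster <- d[used, "ID"]

length(cluster) == n_used       # now matches the number of observations
```

A vector built this way has the length that the check inside vcovCL() expects 
and could be passed as its cluster argument.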





> 1 jul 2016 kl. 14:57 skrev Achim Zeileis <achim.zeil...@uibk.ac.at>:
> 
> On Fri, 1 Jul 2016, Faradj Koliev wrote:
> 
>> Dear all, 
>> 
>> I use the "polr" command (library: MASS) to estimate an ordered logistic 
>> regression.
>> 
>> My model:   summary( model<- polr(y ~ x1+x2+x3+x4+x1*x2 ,data=mydata, Hess = 
>> TRUE))
>> 
>> But how do I get robust clustered standard errors? 
>> I've tried coeftest(resA, vcov=vcovHC(resA, cluster=lipton$ID))
> 
> The vcovHC() function currently does not (yet) have a "cluster" argument. We 
> are working on it but it's not finished yet.
> 
> As an alternative I include the vcovCL() function below that computes the 
> usual simple clustered sandwich estimator. This can be applied to "polr" 
> objects and plugged into coeftest(). So
> 
> coeftest(resA, vcov=vcovCL(resA, cluster=lipton$ID))
> 
> should work.
> 
>> and summary(a <- robcov(model,mydata$ID)).
> 
> The robcov() function does in principle what you want, but I'm not sure 
> whether it works with polr(). It certainly works with lrm() from the "rms" 
> package.
> 
> Hope that helps,
> Z
> 
> vcovCL <- function(object, cluster = NULL, adjust = NULL)
> {
>  stopifnot(require("sandwich"))
> 
>  ## cluster specification
>  if(is.null(cluster)) cluster <- attr(object, "cluster")
>  if(is.null(cluster)) stop("no 'cluster' specification found")
>  cluster <- factor(cluster)
> 
>  ## estimating functions and dimensions
>  ef <- estfun(object)
>  n <- NROW(ef)
>  k <- NCOL(ef)
>  if(n != length(cluster))
>    stop("length of 'cluster' does not match number of observations")
>  m <- length(levels(cluster))
> 
>  ## aggregate estimating functions by cluster and compute meat
>  ef <- sapply(levels(cluster), function(i) colSums(ef[cluster == i, ,
>    drop = FALSE]))
>  ef <- if(NCOL(ef) > 1L) t(ef) else matrix(ef, ncol = 1L)
>  mt <- crossprod(ef)/n
> 
>  ## bread
>  br <- try(bread(object), silent = TRUE)
>  if(inherits(br, "try-error")) br <- vcov(object) * n
> 
>  ## put together sandwich
>  vc <- 1/n * (br %*% mt %*% br)
> 
>  ## adjustment
>  if(is.null(adjust)) adjust <- class(object)[1L] == "lm"
>  adj <- if(adjust) m/(m - 1L) * (n - 1L)/(n - k) else m/(m - 1L)
> 
>  ## return
>  return(adj * vc)
> }
> 
> 
>> Neither works for me. So I wonder what am I doing wrong here?
>> 
>> 
>> All suggestions are welcome - thank you!

[R] Logistic regression and robust standard errors

2016-07-01 Thread Faradj Koliev
Dear all, 


I use the "polr" command (library: MASS) to estimate an ordered logistic 
regression.

My model:   summary( model<- polr(y ~ x1+x2+x3+x4+x1*x2 ,data=mydata, Hess = 
TRUE))

But how do I get robust clustered standard errors? 

I've tried coeftest(resA, vcov=vcovHC(resA, cluster=lipton$ID)) and 
summary(a <- robcov(model, mydata$ID)). Neither works for me. So I wonder what 
I am doing wrong here. 


All suggestions are welcome – thank you! 

[R] How to create these variables in R?

2016-06-28 Thread Faradj Koliev
Dear all, 

Let’s say that I have a dataset that looks something like this: 


Subject Year    A
A       1990    1
A       1991    1
A       1992    1
A       1993    1
A       1994    0
A       1995    0
B       1990    1
B       1991    0
B       1992    1
B       1993    1
C       1991    1
C       1992    1
C       1993    0
C       1994    1
D       1991    0
D       1992    1
D       1993    1
D       1994    0
D       1995    0
D       1996    1
D       1997    1


What I would like to do is to create the following three new variables: A1, A2, 
and A3. 

The variable A1 should count the consecutive 1's in the variable A for each 
subject-year, but it should restart counting if there are two 0's in a row 
(displayed below). 

The variable A2 should capture values of 1-2 (a range) in A1, as displayed in 
the example below (subject-year). 

The variable A3 should capture all values in A1 that are greater than 2, also 
displayed in the example data below (subject-year). 


See the example below for illustration of these variables


Subject Year    A   A1  A2  A3
A       1990    1   1   1   0
A       1991    1   2   1   0
A       1992    1   3   0   1
A       1993    1   4   0   1
A       1994    0   0   0   0
A       1995    0   0   0   0
B       1990    1   1   1   0
B       1991    0   1   1   0
B       1992    1   2   1   0
B       1993    1   3   0   1
C       1991    1   1   1   0
C       1992    1   2   1   0
C       1993    0   2   1   0
C       1994    1   3   0   1
D       1991    0   0   0   0
D       1992    1   1   1   0
D       1993    1   2   1   0
D       1994    0   0   0   0
D       1995    0   0   0   0
D       1996    1   1   1   0
D       1997    1   2   1   0


I have no clue where to start at this stage, so I'd appreciate any suggestions 
and help. 
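
Reading the example table literally, A1 goes up by one on every 1, keeps its 
value on an isolated 0, and drops to 0 on any 0 that belongs to a run of two 
or more 0's. Under that reading (an interpretation of the table, not a stated 
rule), a base-R sketch could look like this; count_ones() is a made-up helper 
name:

```r
d <- data.frame(
  Subject = rep(c("A", "B", "C", "D"), times = c(6, 4, 4, 7)),
  Year    = c(1990:1995, 1990:1993, 1991:1994, 1991:1997),
  A       = c(1, 1, 1, 1, 0, 0,  1, 0, 1, 1,  1, 1, 0, 1,  0, 1, 1, 0, 0, 1, 1)
)

# Running count of 1's that resets when two or more 0's occur in a row
count_ones <- function(a) {
  r <- rle(a)
  # TRUE for every position that sits inside a run of at least two 0's
  long_zero <- rep(r$values == 0 & r$lengths >= 2, r$lengths)
  out <- numeric(length(a))
  cnt <- 0
  for (i in seq_along(a)) {
    if (a[i] == 1) {
      cnt <- cnt + 1          # another 1: keep counting
    } else if (long_zero[i]) {
      cnt <- 0                # part of a 0,0 run: reset
    }                         # isolated 0: carry the count forward
    out[i] <- cnt
  }
  out
}

# Apply per subject (rows assumed sorted by Year within Subject)
d$A1 <- ave(d$A, d$Subject, FUN = count_ones)
d$A2 <- as.integer(d$A1 %in% 1:2)   # 1 when A1 is 1 or 2
d$A3 <- as.integer(d$A1 > 2)        # 1 when A1 exceeds 2

d
```

How an isolated 0 in the last year of a subject should be treated is not 
covered by the example; the sketch carries the count forward in that case.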




[R] Robust clustered errors for probit ordinal regression analysis

2016-04-28 Thread Faradj Koliev
Dear all, 

I need your help with obtaining robust clustered standard errors. I use the 
polr command in the MASS package: 
m <- polr(y ~ x1 + x2, data = mydata, method = "probit"). In the rms package, 
this is as simple as: clusterSE <- robcov(m, mydata$id). Is it possible to do 
something similar for a polr object as well? Thank you very much.

Best, 
Faradj 

[R] Predicting probabilities in ordinal probit analysis in R

2016-04-26 Thread Faradj Koliev
Dear all, 

I have two questions that are almost completely related to how to do things in 
R.

I am running an ordinal probit regression analysis in R. The dependent variable 
has three levels (0=no action; 1=warning; 2=sanction).

I use the lrm command in the rms package:

print( res1<- lrm(Y ~ x1+x2+x3+x4+x5+x6, y=TRUE, x=TRUE, data=mydata))
I simply couldn't make any sense of the information given by ?predict.lrm. 
What I want to do is to calculate the marginal effects of all explanatory 
variables for each level of the dependent variable. In Stata, this is very 
simple: mfx compute, predict (outcome(#0)); mfx compute, predict (outcome(#2)) 
and mfx compute, predict (outcome(#3)).

So my first question is: how do I generate marginal effects for each outcome in 
R? 

The second question is related to interaction effects, which I need to include 
in the same model:

print( res1<- lrm(Y ~ x1+x2+x3+x4+x5+x6+x5*x6, y=TRUE, x=TRUE, data=mydata))
If I knew the answer to the first question, I would have run the marginal 
effects with the interaction term included. Then, I would have plotted the 
predicted values of the interaction term.

So the second question is: how do I plot the effects (predicted values) of 
variables in the interaction term?

Many thanks!

Small sample from my dataset (only one country)

dput(mydatasample)

structure(list(year = 1989:2014, country = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Canada", class = "factor"), 
id = structure(1:26, .Label = c("CAN 1989", "CAN 1990", "CAN 1991", 
"CAN 1992", "CAN 1993", "CAN 1994", "CAN 1995", "CAN 1996", 
"CAN 1997", "CAN 1998", "CAN 1999", "CAN 2000", "CAN 2001", 
"CAN 2002", "CAN 2003", "CAN 2004", "CAN 2005", "CAN 2006", 
"CAN 2007", "CAN 2008", "CAN 2009", "CAN 2010", "CAN 2011", 
"CAN 2012", "CAN 2013", "CAN 2014"), class = "factor"), stage1 = c(1L, 
1L, 0L, 0L, 0L, 0L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 0L, 0L, 0L, 
0L, 2L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L), x1 = c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L), x2 = c(1L, 2L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 
2L, 1L, 2L, 2L, 2L, 2L), x3 = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 8L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L), x4 = c(31L, 31L, 31L, 31L, 
31L, 30L, 30L, 30L, 31L, 30L, 29L, 30L, 28L, 28L, 28L, 27L, 
29L, 29L, 29L, 28L, 25L, 24L, 23L, NA, NA, NA), x5 = structure(1:26, .Label 
= c("17,12528685", 
"17,14022279", "17,15382785", "17,16610202", "17,17704534", 
"17,18665779", "17,19493938", "17,20571103", "17,21628118", 
"17,22493732", "17,23321101", "17,242041", "17,25213621", 
"17,26110753", "17,27106985", "17,2810902", "17,29094924", 
"17,29891768", "17,30861622", "17,31943819", "17,33088659", 
"17,34202619", "17,35190237", "17,36381421", "17,37537139", 
"17,38618117"), class = "factor"), x5.1 = c(0L, 0L, 0L, 0L, 
1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 1L, 1L, 0L)), .Names = c("year", "country", 
"id", "stage1", "x1", "x2", "x3", "x4", "x5", "x5.1"), class = "data.frame", 
row.names = c(NA, 
-26L))






[R] Online courses in Event History / Survival analysis with R

2015-05-24 Thread Faradj Koliev
Dear all, 

I am looking for some online courses – paid or free – in survival analysis with 
R. Perhaps you can recommend some interesting online courses? 

Best, 
Faradj Koliev

[R] Need HELP: how to find and use a csv file?

2012-07-10 Thread Faradj Koliev
Hey, 

I am having some problems with importing a csv file into R and then saving it 
for analysis. 

I have a csv file (skatter.csv) which I could read by typing:
read.csv(file="/Users/kama/Desktop/skatter.csv", header=TRUE, sep=";") 

However, when I enter: skatter.csv <- read.csv("skatter.csv", header=TRUE) I 
get this message: 
Error in file(file, "rt") : cannot open the connection 
In addition: Warning message: 
In file(file, "rt") : 
I have tried skatter.csv <- file.choose() and other code to find the 
file, but it does not work. 
Please help me fix this problem; I have been stuck on this one for 4 hours. 



What I need is to import this file and analyze it using, for example, a 
histogram. 

I have a Mac (updated) and the file is saved as a csv file, and I'm quite a 
new user of R. 
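
The difference between the two calls above is the path: the first gives the 
full location of the file, while the second only gives the file name, which R 
looks for in the current working directory (see getwd()). A minimal sketch of 
the whole round trip, which writes a small semicolon-separated file to a 
temporary folder so that it runs anywhere; the columns age and income are made 
up for the example:

```r
# Create a small semicolon-separated file to stand in for skatter.csv
csv_path <- file.path(tempdir(), "skatter.csv")
write.table(
  data.frame(age    = c(23, 35, 41, 29, 52),
             income = c(21.5, 30.2, 41.0, 27.3, 38.8)),
  file = csv_path, sep = ";", row.names = FALSE
)

file.exists(csv_path)   # check that the path points at a real file

# Read it with the full path and keep the result in an object
skatter <- read.csv(file = csv_path, header = TRUE, sep = ";")
str(skatter)

# The stored data frame can now be analysed, for example with a histogram
hist(skatter$income, main = "Histogram of income", xlab = "income")
```

With the real file, csv_path would be the full path from the first call, or 
the bare file name after setwd() has been pointed at the folder containing it.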



Thank you very much! 