Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-24 Thread Frank Harrell
I think we need a task view on longitudinal data manipulation.  There are so
many approaches to this - people need help navigating them.

I tend to stay away from the lapply-split methods as they don't look quite
as clean and may take longer to run.  The aggregate function uses too much
data frame subscripting.  The plyr package and the mApply function in the
Hmisc package provide some other nice solutions.  Often I like to stick with
tapply using constructs like 

with(mydata, tapply(seq_len(nrow(mydata)), subjectID, function(i) {
  ... operate on variables in mydata subscripted by [i] ...
}))

Frank
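
A minimal sketch of the tapply construct described above, applied to the data from this thread. The column names ID and COMPL come from the thread; the rule inside the function (first row with COMPL != 0, otherwise the subject's first row) is an assumed reading of what the original poster asked for, not something stated in this message.

```r
mydata <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# tapply() hands the function the row indices i that belong to one subject.
first_row <- with(mydata, tapply(seq_len(nrow(mydata)), ID, function(i) {
  hit <- i[COMPL[i] != 0]                 # rows of this subject with a complication
  if (length(hit) > 0) hit[1] else i[1]   # first event, otherwise first visit
}))

# One row index per subject, so a single subscript returns the selected records.
result <- mydata[first_row, ]
result
```

Because the function only returns an index, no per-subject data frames are built or bound together, which is the cost the lapply-split approach pays.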


arun kirshna wrote
> Hi,
> 
> I am not sure why you are getting different results.  I couldn't reproduce
> your problem.
> dat1<- read.table(text=" 
> ID    COMPL  SEX  HEREDITY 
> 1    0  1  2 
> 1    0  1  2 
> 1    3  1  2 
> 2    0  0  1 
> 2    1  0  1 
> 2    2  0  1 
> 2    2  0  1 
> 3    0  0  1 
> 3    0  0  1 
> 3    0  0  1 
> 3    0  0  1 
> 3    2  0  1 
> 4    0  1  2 
> 4    0  1  2 
> ",sep="",header=TRUE)
> do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0))
> head(x[x$COMPL!=0,],1) else head(x,1)))
> #  ID COMPL SEX HEREDITY
> #1  1     3   1        2
> #2  2     1   0        1
> #3  3     2   0        1
> #4  4     0   1        2
> 
> 
> You could also try:
> dat1[with(dat1,ave(COMPL,ID,FUN=function(x) if(any(x!=0)) cumsum(x>0) else
> seq_along(x)))==1,] #modification of David's code
> #   ID COMPL SEX HEREDITY
> #3   1     3   1        2
> #5   2     1   0        1
> #12  3     2   0        1
> #13  4     0   1        2
> A.K.

Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-24 Thread arun
Hi,

I am not sure why you are getting different results.  I couldn't reproduce your 
problem.
dat1<- read.table(text=" 
ID    COMPL  SEX  HEREDITY 
1    0  1  2 
1    0  1  2 
1    3  1  2 
2    0  0  1 
2    1  0  1 
2    2  0  1 
2    2  0  1 
3    0  0  1 
3    0  0  1 
3    0  0  1 
3    0  0  1 
3    2  0  1 
4    0  1  2 
4    0  1  2 
",sep="",header=TRUE)
do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0)) 
head(x[x$COMPL!=0,],1) else head(x,1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
#4  4     0   1        2


You could also try:
dat1[with(dat1,ave(COMPL,ID,FUN=function(x) if(any(x!=0)) cumsum(x>0) else 
seq_along(x)))==1,] #modification of David's code
#   ID COMPL SEX HEREDITY
#3   1     3   1        2
#5   2     1   0        1
#12  3     2   0        1
#13  4     0   1        2
A.K.
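
When two people get different answers from what looks like the same call, one way to narrow it down is to run both approaches on the same object and compare them. The sketch below does that for the two solutions above; the object names by_split and by_ave are introduced here for illustration only.

```r
dat1 <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Approach 1: split by ID, take the first complication row or else the first row.
by_split <- do.call(rbind, lapply(split(dat1, dat1$ID), function(x) {
  if (any(x$COMPL != 0)) head(x[x$COMPL != 0, ], 1) else head(x, 1)
}))

# Approach 2: build a within-ID counter with ave() and keep the rows where it is 1.
by_ave <- dat1[with(dat1, ave(COMPL, ID, FUN = function(x) {
  if (any(x != 0)) cumsum(x > 0) else seq_along(x)
})) == 1, ]

# The row names differ (group labels versus original row numbers),
# so reset them before comparing the contents.
rownames(by_split) <- NULL
rownames(by_ave) <- NULL
all.equal(by_split, by_ave)
```

If the comparison fails on your own data, checking str(dat1) for the type of COMPL is a reasonable next step.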






From: Tasnuva Tabassum 
To: arun  
Sent: Sunday, February 24, 2013 12:08 AM
Subject: Re: [R] Selecting First Incidence from Longitudinal Data


Sorry, I tried this, but it gave me this answer:

#   ID COMPL SEX HEREDITY
#1   1     0   1        2
#4   2     0   0        1
#8   3     0   0        1
#13  4     0   1        2





Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread arun
Hi,
Try this:
#dat1
 do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0)) 
head(x[x$COMPL!=0,],1) else head(x,1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
#4  4     0   1        2
A.K.







Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread Tasnuva Tabassum
Hi,
Thank you very much, but I forgot to mention that I also want to include the
patients for whom no complication occurred. That is, for my data I want to
include patient no. 4, for whom the COMPL value will be 0.

In that case, what R function should I write?



Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread Xiaogang Su
My bad. I didn't try it out with the real data. Here you go. HTH, X

dat <- read.table(text="
ID    COMPL  SEX  HEREDITY
1    0      1      2
1    0      1      2
1    3      1      2
2    0      0      1
2    1      0      1
2    2      0      1
2    2      0      1
3    0      0      1
3    0      0      1
3    0      0      1
3    0      0      1
3    2      0      1
4    0      1      2
4    0      1      2
", header = TRUE)

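# Keep only the records with a complication, then number them within each ID
# (this assumes the rows are sorted by ID).
dat0 <- dat[dat$COMPL!=0, ]
dat0$sequence <- as.vector(unlist(lapply(aggregate(dat0$ID,
by=list(dat0$ID),FUN=length)$x, FUN=function(x){seq(1, x)})))
# The record numbered 1 within each ID is that patient's first complication.
dat0 <- dat0[dat0$sequence==1, ]
dat0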
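
The within-ID numbering above can also be produced with ave(), which is shorter and does not depend on the rows being sorted by ID. This is a sketch of an alternative, not the code posted in this message; the names seq_aggregate, seq_ave and first are introduced here for illustration.

```r
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

dat0 <- dat[dat$COMPL != 0, ]

# Construct from the message: count rows per ID, then expand each count to 1..n.
seq_aggregate <- as.vector(unlist(lapply(
  aggregate(dat0$ID, by = list(dat0$ID), FUN = length)$x,
  FUN = function(n) seq(1, n)
)))

# Alternative: ave() numbers the rows within each ID directly.
seq_ave <- ave(seq_along(dat0$ID), dat0$ID, FUN = seq_along)

# Both give the same numbering on this (ID-sorted) data.
all(seq_aggregate == seq_ave)

first <- dat0[seq_ave == 1, ]
first
```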




Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread arun
HI,
Tried your approach:


 dat1$sequence <- as.vector(unlist(lapply( aggregate(dat1$ID, 
by=list(dat1$ID),FUN=length)$x, FUN=function(x){seq(1, x)}))) 
 dat0 <- dat1[dat1$sequence==1 & dat1$COMPL!= 0, ] #your second solution
 dat0
#[1] ID   COMPL    SEX  HEREDITY sequence
#<0 rows> (or 0-length row.names)
  

dat1[dat1$sequence==1,] #here the OP wanted first incidence where COMPL!=0
#   ID COMPL SEX HEREDITY sequence
#1   1     0   1        2        1
#4   2     0   0        1        1
#8   3     0   0        1        1
#13  4     0   1        2        1
A.K.
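
The empty result above comes from the order of the two steps: numbering the rows before filtering marks each patient's first visit, and on this data every first visit has COMPL equal to 0. A small sketch of the difference, using !duplicated() as a stand-in for the sequence == 1 test (the names first_visit, events and first_event are illustrative):

```r
dat1 <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Flag first, filter second: this asks for first visits that are also events.
first_visit <- !duplicated(dat1$ID)
none <- dat1[first_visit & dat1$COMPL != 0, ]
nrow(none)

# Filter first, flag second: this asks for the first event of each patient.
events <- dat1[dat1$COMPL != 0, ]
first_event <- events[!duplicated(events$ID), ]
first_event
```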




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread Xiaogang Su
To account for COMPL,

dat$sequence <- as.vector(unlist(lapply( aggregate(dat$ID, by=list(dat$ID),
FUN=length)$x, FUN=function(x){seq(1, x)})))
dat0 <- dat[dat$sequence==1 & dat$COMPL!= 0, ]

HTH, X






-- 
==
Xiaogang Su, Ph.D.
Associate Professor & Statistician
School of Nursing, University of Alabama
Birmingham, AL 35294-1210
(205) 934-2355 [Office]
x...@uab.edu
xiaogan...@gmail.com
https://sites.google.com/site/xgsu00/



Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread Xiaogang Su
Try this:
dat$sequence <- as.vector(unlist(lapply( aggregate(dat$ID, by=list(dat$ID),
FUN=length)$x, FUN=function(x){seq(1, x)})))
dat0 <- dat[dat$sequence==1, ]

HTH, X





-- 
==
Xiaogang Su, Ph.D.
Associate Professor & Statistician
School of Nursing, University of Alabama
Birmingham, AL 35294-1210
(205) 934-2355 [Office]
x...@uab.edu
xiaogan...@gmail.com
https://sites.google.com/site/xgsu00/



Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread Rui Barradas

Hello,

You can use ?aggregate and ?head to do what you want. Try the following.



dat <- read.table(text="
ID    COMPL  SEX  HEREDITY
1    0      1      2
1    0      1      2
1    3      1      2
2    0      0      1
2    1      0      1
2    2      0      1
2    2      0      1
3    0      0      1
3    0      0      1
3    0      0      1
3    0      0      1
3    2      0      1
4    0      1      2
4    0      1      2
", header = TRUE)

aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)


Hope this helps,

Rui Barradas
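
The aggregate() call above returns one row per patient who has at least one complication. If patients without any complication should also appear (a requirement raised later in the thread), one possible extension is to append their first record. This is a sketch under that assumption; first_event, no_event and result are names introduced here.

```r
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# First complication per patient, as in the call above.
first_event <- aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)

# Patients that never appear in first_event: keep their first record.
no_event <- dat[!(dat$ID %in% first_event$ID) & !duplicated(dat$ID), ]

# Combine both sets and restore the ID order.
result <- rbind(first_event, no_event)
result <- result[order(result$ID), ]
result
```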



Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread arun
Hi,
You can also use:
 do.call(rbind,lapply(split(dat1,dat1$ID),function(x) head(x[x$COMPL!=0,],1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1








Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread David Winsemius

On Feb 23, 2013, at 6:28 AM, Tasnuva Tabassum wrote:

> I have a longitudinal competing risk data of the form:
> 
> ID    COMPL  SEX   HEREDITY
> 1     0       1      2
> 1     0       1      2
> 1     3       1      2
> 2     0       0      1
> 2     1       0      1
> 2     2       0      1
> 2     2       0      1
> 3     0       0      1
> 3     0       0      1
> 3     0       0      1
> 3     0       0      1
> 3     2       0      1
> 4     0       1      2
> 4     0       1      2.
> 
> Where, COMPL= health complication of diabetic patients which has value
> labels   as  0= no complication,1=coronary heart disease, 2=retinopathy, 3=
> nephropathy.
> 
> 
> I want to select only the first complication that occurred to each patient.
> What R function can I use?
> 

> dat[ with(dat, ave(COMPL, ID, FUN=function(x) cumsum(x>0) ) ) ==1,]
   ID COMPL SEX HEREDITY
3   1     3   1        2
5   2     1   0        1
12  3     2   0        1
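
The call above works because cumsum(x > 0) is a running count of nonzero records within each ID, and that count first reaches 1 at the first complication. A sketch of how it behaves, including one hypothetical case that is not in the posted data: a zero recorded after the first event leaves the count at 1, so that row is selected as well. The object dat_gap and the extra COMPL != 0 condition are assumptions added here for illustration.

```r
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Running count of nonzero records within each ID.
event_count <- with(dat, ave(COMPL, ID, FUN = function(x) cumsum(x > 0)))
dat[event_count == 1, ]

# Hypothetical variant: ID 2 becomes 0, 1, 0, 2 (a zero after the first event).
dat_gap <- dat
dat_gap$COMPL[6] <- 0L
count_gap <- with(dat_gap, ave(COMPL, ID, FUN = function(x) cumsum(x > 0)))
rownames(dat_gap[count_gap == 1, ])   # row 6 is picked up along with row 5

# Requiring a nonzero value on the row itself keeps only the first event.
dat_gap[count_gap == 1 & dat_gap$COMPL != 0, ]
```

On the data as posted the extra condition changes nothing, since no patient returns to 0 after a complication.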




David Winsemius
Alameda, CA, USA



Re: [R] Selecting First Incidence from Longitudinal Data

2013-02-23 Thread arun
Hi,
Try this:
dat1<- read.table(text="
ID    COMPL  SEX  HEREDITY
1    0  1  2
1    0  1  2
1    3  1  2
2    0  0  1
2    1  0  1
2    2  0  1
2    2  0  1
3    0  0  1
3    0  0  1
3    0  0  1
3    0  0  1
3    2  0  1
4    0  1  2
4    0  1  2
",sep="",header=TRUE)
library(plyr)
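# Keep the rows with a complication (this assumes dat1 is sorted by ID, because
# ddply() returns its rows grouped by ID), then take the first such row per ID.
dat2<- dat1[ddply(dat1,.(ID),summarise,COMPL!=0)[,2],]
 aggregate(.~ID,data=dat2,head,1)
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1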
A.K. 




- Original Message -
From: Tasnuva Tabassum 
To: r-help@r-project.org
Cc: 
Sent: Saturday, February 23, 2013 9:28 AM
Subject: [R] Selecting First Incidence from Longitudinal Data

I have a longitudinal competing risk data of the form:

ID    COMPL  SEX   HEREDITY
1     0       1      2
1     0       1      2
1     3       1      2
2     0       0      1
2     1       0      1
2     2       0      1
2     2       0      1
3     0       0      1
3     0       0      1
3     0       0      1
3     0       0      1
3     2       0      1
4     0       1      2
4     0       1      2.

Here COMPL is the health complication of diabetic patients, with value
labels 0 = no complication, 1 = coronary heart disease, 2 = retinopathy,
3 = nephropathy.


I want to select only the first complication that occurred to each patient.
What R function can I use?
