Hi, Try this: dat1$idx<-with(dat1,ifelse(is.na(delais)|delais<45 & delais>20, 1,ifelse(delais<60 & delais>=45,2,ifelse(delais<=90 & delais>=60,3,NA)))) dat1$idx1<-c(dat1$idx[-head(dat1$idx,1)],1)
library(zoo) res1<-do.call(rbind,lapply(split(dat1,dat1$patient_id),function(x) {x$idx[as.logical(cumsum(is.na(x$idx)))]<-NA; x1<-x[!is.na(x$idx),]; x1[,6:8]<-na.locf(x1[,6:8]);x1$idx1[is.na(x1$idx1)]<-1; x2<-x1[rep(seq_len(nrow(x1)),x1$idx1),]; x2$delais[duplicated(x2$delais,fromLast=FALSE)]<-0; x2$t<-seq(0,nrow(x2)-1,1);x2[,-c(8:9)]})) row.names(res1)<- 1:nrow(res1) res2<- res1[,c(1:3,8,4:7)] res2 # patient_id number responsed_at t delais scores1 scores2 scores3 #1 1 1 2010-05-26 0 NA 2.6 0.5 0.7 #2 1 2 2010-07-07 1 42 2.5 0.5 0.7 #3 1 3 2010-08-14 2 38 2.3 0.5 0.7 #4 1 3 2010-08-14 3 0 2.3 0.5 0.7 #5 1 4 2010-10-01 4 48 2.5 0.7 0.6 #6 1 4 2010-10-01 5 0 2.5 0.7 0.6 #7 1 4 2010-10-01 6 0 2.5 0.7 0.6 #8 1 5 2010-12-01 7 61 2.5 0.7 0.6 #9 2 1 2011-07-19 0 NA 2.5 0.8 0.5 #10 2 1 2011-07-19 1 0 2.5 0.8 0.5 #11 2 1 2011-07-19 2 0 2.5 0.8 0.5 #12 2 2 2011-09-22 3 65 2.6 0.8 0.5 #13 2 3 2011-10-26 4 34 2.7 0.8 0.5 #14 3 1 2011-07-17 0 NA 2.8 0.5 0.6 A.K. ________________________________ From: GUANGUAN LUO <guanguan...@gmail.com> To: arun <smartpink...@yahoo.com> Sent: Friday, May 17, 2013 9:33 AM Subject: Re: how to calculate the mean in a period of time? Hello, Thank you for your help the lines added to the tables are the precedent lines but not the followed lines, if i just change x2<-x1[rep(seq_len(nrow(x1)-1), is that ok? and so the delais should be changed too, isn't it? GG 2013/5/17 arun <smartpink...@yahoo.com> Hi, >No problem. > >Arun > > > > > > >________________________________ >From: GUANGUAN LUO <guanguan...@gmail.com> >To: arun <smartpink...@yahoo.com> >Sent: Friday, May 17, 2013 4:19 AM > >Subject: Re: how to calculate the mean in a period of time? > > > >Ah, yes, that is the wrong thing i have written. Thank you so much. the output >which you have got is right. >Thanks a lot. >GG > > >2013/5/16 arun <smartpink...@yahoo.com> > >Hi, >>The output you showed is not clear especially the for the scores3, >> >> 2 2 2011-09-22 3 65 2.6 >> 0.8 0.8 >>2 3 2011-10-26 4 34 2.7 >> 0.8 0.8 >>3 1 2011-07-17 0 NA 2.8 >> 0.5 0.6 >>In the input data, the scores3 column didn't had 0.8. >> >> >>This is what I got: >>dat1<- read.table(text=" >> >>patient_id number responsed_at delais scores1 scores2 >>scores3 >> 1 1 2010-05-26 NA 2.6 >> 0.5 0.7 >> 1 2 2010-07-07 42 2.5 >> NA NA >> 1 3 2010-08-14 38 2.3 >> NA NA >> 1 4 2010-10-01 48 2.5 >> 0.7 0.6 >> 1 5 2010-12-01 61 2.5 >> NA NA >>2 1 2011-07-19 NA 2.5 >> 0.8 0.5 >>2 2 2011-09-22 65 2.6 >> NA NA >>2 3 2011-10-26 34 2.7 >> NA NA >>3 1 2011-07-17 NA 2.8 >> 0.5 0.6 >>3 2 2011-10-30 103 2.6 >> NA NA >>3 3 2011-12-23 54 2.5 >> NA NA >>",sep="",header=TRUE,stringsAsFactors=FALSE) >> >> dat1$idx<-with(dat1,ifelse(is.na(delais)|delais<45 & delais>20, >>1,ifelse(delais<60 & delais>=45,2,ifelse(delais<=90 & delais>=60,3,NA)))) >>library(zoo) >>res<-do.call(rbind,lapply(split(dat1,dat1$patient_id),function(x) >>{x$idx[as.logical(cumsum(is.na(x$idx)))]<-NA; x1<-x[!is.na(x$idx),]; >>x1[,6:8]<-na.locf(x1[,6:8]);x2<-x1[rep(seq_len(nrow(x1)),x1$idx),]; >>x2$delais[duplicated(x2$delais,fromLast=TRUE)]<-0; >>x2$t<-seq(0,nrow(x2)-1,1);x2})) >> >> >>row.names(res)<- 1:nrow(res) >> res1<- res[,c(1:3,9,4:7)] >>res1 >># patient_id number responsed_at t delais scores1 scores2 scores3 >>#1 1 1 2010-05-26 0 NA 2.6 0.5 0.7 >>#2 1 2 2010-07-07 1 42 2.5 0.5 0.7 >>#3 1 3 2010-08-14 2 38 2.3 0.5 0.7 >>#4 1 4 2010-10-01 3 0 2.5 0.7 0.6 >>#5 1 4 2010-10-01 4 48 2.5 0.7 0.6 >>#6 1 5 2010-12-01 5 0 2.5 0.7 0.6 >>#7 1 5 2010-12-01 6 0 2.5 0.7 0.6 >>#8 1 5 2010-12-01 7 61 2.5 0.7 0.6 >>#9 2 1 2011-07-19 0 NA 2.5 0.8 0.5 >>#10 2 2 2011-09-22 1 0 2.6 0.8 0.5 >>#11 2 2 2011-09-22 2 0 2.6 0.8 0.5 >>#12 2 2 2011-09-22 3 65 2.6 0.8 0.5 >>#13 2 3 2011-10-26 4 34 2.7 0.8 0.5 >>#14 3 1 2011-07-17 0 NA 2.8 0.5 0.6 >> >>A.K. >> >>________________________________ >>From: GUANGUAN LUO <guanguan...@gmail.com> >>To: arun <smartpink...@yahoo.com> >>Sent: Thursday, May 16, 2013 12:05 PM >> >>Subject: Re: how to calculate the mean in a period of time? >> >> >> >> >>Hello, AK, >>now i have a problem really complicated for me, >> >>Now my table is like this: >> >>patient_id number responsed_at delais scores1 scores2 >>scores3 >> 1 1 2010-05-26 NA 2.6 >> 0.5 0.7 >> 1 2 2010-07-07 42 2.5 >> NA NA >> 1 3 2010-08-14 38 2.3 >> NA NA >> 1 4 2010-10-01 48 2.5 >> 0.7 0.6 >> 1 5 2010-12-01 61 2.5 >> NA NA >>2 1 2011-07-19 NA 2.5 >> 0.8 0.5 >>2 2 2011-09-22 65 2.6 >> NA NA >>2 3 2011-10-26 34 2.7 >> NA NA >>3 1 2011-07-17 NA 2.8 >> 0.5 0.6 >>3 2 2011-10-30 103 2.6 >> NA NA >>3 3 2011-12-23 54 2.5 >> NA NA >> >>explications: delais = the date of "responsed_at" - the date of precedent >>"responsed_at" >>scores1 is measured every month >>scores2 and 3 are measured every three months >> >>first thing is : if the 20<delais <45, this count one month of delais >>if the 45<=delais <60, this count two month of delais,so add one line with >>the copy of the precedent line,and change the delais to 0 >> >>if the 60 <= delais <=90, this count three month of delais, so add two lines >>with the copy of the precedent line,and change these two delais to 0 >> >>if the delais >90, delete all the following lines >> >>and add a column "t", "t" means the month, "t" is in order >> >>second thing is : >>I want to replace NA of scores2 and scores3 with the precedent scores >> >>and finally get a table like this: >> >>patient_id number responsed_at t delais scores1 scores2 >> scores3 >> 1 1 2010-05-26 0 NA 2.6 >> 0.5 0.7 # scores2 and 3 are mesured every 3 >> 1 2 2010-07-07 1 42 2.5 >> 0.5 0.7 ## months,replace the following with >>precedent numbers >> 1 3 2010-08-14 2 38 2.3 >> 0.5 0.7 # copy this line >>1 3 2010-08-14 3 0 2.3 >> 0.5 0.7 # add one line here and change delais to 0 >> 1 4 2010-10-01 4 48 2.5 >> 0.7 0.6 >>1 4 2010-10-01 5 0 2.5 >> 0.7 0.6 >>1 4 2010-10-01 6 0 2.5 >> 0.7 0.6 >> 1 5 2010-12-01 7 61 2.5 >> 0.7 0.6 >>2 1 2011-07-19 0 NA 2.5 >> 0.8 0.5 >>2 1 2011-07-19 1 0 2.5 >> 0.8 0.5 >>2 1 2011-07-19 2 0 2.5 >> 0.8 0.5 >>2 2 2011-09-22 3 65 2.6 >> 0.8 0.8 >>2 3 2011-10-26 4 34 2.7 >> 0.8 0.8 >>3 1 2011-07-17 0 NA 2.8 >> 0.5 0.6 >>(3 2 2011-10-30 103 2.6 >> 0.5 0.6)# delete these 2 line >> >>(3 3 2011-12-23 54 2.5 >> 0.5 0.6) ## and delete the following lines of this patient >>because the delais is 103 which superior to 90 >> >> >> >>Do you know how can i get this? >>Thank you so much >> >>GG >> >> >>2013/5/9 GUANGUAN LUO <guanguan...@gmail.com> >> >>Hello, >>>Because according to the database, the first time of collecting the data is >>>just for an entry, an inclusion of the patients. What i should focus on is >>>the difference between two month. If i don't include the t0, i think i will >>>lose the information. I don't know if this is pertinent. >>>Thank you very much for your help, AK >>> >>>GG >>> >>> >>> >>>2013/5/9 arun <smartpink...@yahoo.com> >>> >>>HI GG, >>>>I did the code according the calculation shown by you. For example, in the >>>>first case, you had scores of (0+1+2+3)/4, which is not 3 months. It is >>>>3+ initial month. After that it is only 3 months that is repeating. So, >>>>my code is doing this: >>>> 1st group (the scores) >>>>(0+1+2+3)/4 >>>>2nd group( scores) >>>>(5+6+7)/3 >>>>3rd group >>>>(8+9+10)/3 >>>> >>>>Here, the months are used only for illustration. I am taking the mean of >>>>the scores. >>>> >>>>So, if you are only doing every 3 months, why do you need to combine the >>>>initial scores with the first 3 months? Anyway, I did what you asked for. >>>> >>>> >>>>Regards, >>>>Arun >>>> >>>> >>>>________________________________ >>>>From: GUANGUAN LUO <guanguan...@gmail.com> >>>>To: arun <smartpink...@yahoo.com> >>>>Sent: Thursday, May 9, 2013 4:19 AM >>>> >>>>Subject: Re: how to calculate the mean in a period of time? >>>> >>>> >>>> >>>>Hi,Thank you so much for your help. I have seen your code. It's very >>>>complicated. The time 0 is the beginning of collecting all the information. >>>> >>>>The time 1 is the first month when follow the patients, time 2 is the >>>>second month ect. If i want to calculate the means of every 3 month do you >>>>think i should do like this? >>>> >>>>GG >>>> >>>>2013/5/7 arun <smartpink...@yahoo.com> >>>> >>>>Hi, >>>>>Your question is still not clear. >>>>>May be this helps: >>>>> >>>>>dat2<- read.table(text=" >>>>> >>>>>patient_id t scores >>>>>1 0 1.6 >>>>>1 1 2.6 >>>>>1 2 2.2 >>>>>1 3 1.8 >>>>>2 0 2.3 >>>>>2 2 2.5 >>>>>2 4 2.6 >>>>>2 5 1.5 >>>>>",sep="",header=TRUE) >>>>> >>>>>library(plyr) >>>>> dat2New<-ddply(dat2,.(patient_id),summarize,t=seq(min(t),max(t))) >>>>> res<-join(dat2New,dat2,type="full") >>>>>res1<-do.call(rbind,lapply(split(res,res$patient_id),function(x) >>>>>{x1<-x[x$t!=0,];do.call(rbind,lapply(split(x1,((x1$t-1)%/%3)+1),function(y) >>>>> {y1<-if(any(y$t==1)) rbind(x[x$t==0,],y) else y; >>>>>data.frame(patient_id=unique(y1$patient_id),scores=mean(y1$scores,na.rm=TRUE))}) >>>>> ) })) >>>>> row.names(res1)<-1:nrow(res1) >>>>>res1$period<-with(res1,ave(patient_id,patient_id,FUN=seq)) >>>>> res1 >>>>># patient_id scores period >>>>>#1 1 2.05 1 >>>>>#2 2 2.40 1 >>>>>#3 2 2.05 2 >>>>> >>>>> >>>>>A.K. >>>>> >>>>> >>>>> >>>>> >>>>>________________________________ >>>>>From: GUANGUAN LUO <guanguan...@gmail.com> >>>>>To: arun <smartpink...@yahoo.com> >>>>>Sent: Tuesday, May 7, 2013 11:29 AM >>>>>Subject: Re: how to calculate the mean in a period of time? >>>>> >>>>> >>>>> >>>>> >>>>>Yes , as you have said, probably , it's not continuous. >>>>> >>>>> >>>>>2013/5/7 arun <smartpink...@yahoo.com> >>>>> >>>>>Hi, >>>>>>Your question is not clear. You mentioned to calculate the mean of 3 >>>>>>months, but infact you added the scores for t=0,1,2,3 as first 3 months, >>>>>>then possibly 4,5,6 as the next. So, it is not exactly three months. >>>>>>Isn't it? >>>>>> >>>>>> >>>>>>Dear R experts, >>>>>>sorry to trouble you again. >>>>>>My data is like this now : >>>>>>patient_id t scores >>>>>>1 0 1.6 >>>>>>1 1 2.6 >>>>>>1 2 2.2 >>>>>>1 3 1.8 >>>>>>2 0 2.3 >>>>>>2 2 2.5 >>>>>>2 4 2.6 >>>>>>2 5 1.5 >>>>>> >>>>>>I want to calculate the mean of period of 3 months, just get a table like >>>>>>this >>>>>> >>>>>>patient_id period scores >>>>>>1 1 2.05 >>>>>>(1.6+2.6+2.2+1.8)/4 >>>>>>2 1 2.4 >>>>>>(2.3+2.5)/2 >>>>>>2 2 2.05 >>>>>>(2.6+1.5)/2 >>>>>> >>>>>>thank you in avance >>>>>> >>>>> >>>> >>> >> > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.