Thank you all for your quick answers. With respect to my question on a smooth non-cumulative baseline Cox hazard, I followed Prof. Brian Ripley's advice and used the following:

library(survival)
plot(basehaz(coxfinal2)[,2]/365.25+1945, basehaz(coxfinal2)[,1], type="l")
xx <- seq(min(basehaz(coxfinal2)[,2]/365.25+1945), max(basehaz(coxfinal2)[,2]/365.25+1945), length=100)
# my start value was 1 January 1945
library(pspline)
lines(xx, predict(sm.spline(x=basehaz(coxfinal2)[,2]/365.25+1945, y=basehaz(coxfinal2)[,1], norder=2), xarg=xx, nderiv=1))

It might seem that computing the derivative when time is expressed in years gives the annual probability of an event. The previous commands give a plot exactly identical to:

plot(basehaz(coxfinal2)[,2], basehaz(coxfinal2)[,1], type="l")
xx <- seq(min(basehaz(coxfinal2)[,2]), max(basehaz(coxfinal2)[,2]), length=100)
lines(xx, 365.25*predict(sm.spline(x=basehaz(coxfinal2)[,2], y=basehaz(coxfinal2)[,1], norder=2), xarg=xx, nderiv=1))  # [second command]

However, if p is the probability of an event on the 1st day of a given year, it is not obvious to me that the probability of one event during the 1st year equals 365*p. Am I mistaken? If not, what does the second command compute?

So could someone help me say what the time unit is for the risk shown by

lines(xx, predict(sm.spline(x=basehaz(coxfinal2)[,2]/365.25+1945, y=basehaz(coxfinal2)[,1], norder=2), xarg=xx, nderiv=1))

...

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

With respect to censoring, I think we all agree:

Peter Dalgaard wrote:
> Prof Brian Ripley <[EMAIL PROTECTED]> writes:
>
> > > I'm doing the same job as Hegre et al. (studying civil wars) but with the
> > > counting process formulation of the Cox model. (I use intervals, my formula
> > > looks like Surv(start,stop,status)~ etc.).
> >
> > Careful, that is left- and right- censored, not intervals. Surv has a
> > type= argument.
>
> Nitpick: That's left-*truncated* and right-censored (the status refers
> to the condition at the right end, people who die before the start are
> not registered at all).

I use the following dataset:

id  start  stop  status  ...  covariates
1   1      365   0       ...
1   365    400   1       ...
[the war starts at 400 and ends at 550]
1   550    730   0       ...  [there are possibly repeated events, so the country re-enters the study]
2   1      365   0       ...
2   365    730   0       ...
etc...

where there is one id for every country, that is, several lines for each country (each line thus representing an "interval" of time).

With

coxph(Surv(start, stop, status) ~ x1+...+cluster(id))

I did not mean interval censoring (although I think it is present here for country 1 from time 400 to 550). I meant "interval" in the same sense as in the R help for Surv:

"time2: ending time of the interval for interval censored or counting process data only. Intervals are assumed to be open on the left and closed on the right, (start, end]. For counting process data, event indicates whether an event occurred at the end of the interval."

"Surv has a type= argument."
Yes, and the help says: "The default is "right" or "counting" depending on whether the time2 argument is absent or present, respectively."

Here, I omitted the type, which means I used a counting process. Thus, the union of all intervals for country 2 (here, lines 4 and 5) leads to one big interval which is left-truncated and right-censored.

Anyway, I think there is no ambiguity, since if one tries type = "interval" it says:

Error in coxph(Surv(start, stop, status, type = "interval") ~ ....
  Cox model doesn't support "interval" survival data

But thanks to Prof. Ripley for the comment, as I am not fully aware of the exact terminology in English.

Regards,
Mayeul KAUFFMANN
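
P.S. On the rescaling point above, here is a minimal sketch of why the plot in years and the plot with the factor 365.25 coincide. It uses made-up data rather than coxfinal2: the curve H and the names days, years, d_days and d_years are assumptions for illustration only, and stats::smooth.spline stands in for pspline::sm.spline so that it runs in base R. By the chain rule, the derivative with respect to time in years should be 365.25 times the derivative with respect to time in days.

```r
# Made-up cumulative hazard on a daily time scale (NOT the coxfinal2 fit)
days <- seq(1, 3650, by = 10)
H <- (days / 3650)^1.5

# The same curve with time expressed in calendar years from 1 January 1945
years <- days / 365.25 + 1945

# stats::smooth.spline stands in for pspline::sm.spline (an assumption);
# spar is fixed so that both fits are smoothed comparably
fit_days  <- smooth.spline(days,  H, spar = 0.5)
fit_years <- smooth.spline(years, H, spar = 0.5)

# First derivative of each fitted curve at matching time points
d_days  <- predict(fit_days,  days,  deriv = 1)$y   # slope per day
d_years <- predict(fit_years, years, deriv = 1)$y   # slope per year

# Chain rule check: dH/d(years) should equal 365.25 * dH/d(days)
print(all.equal(d_years, 365.25 * d_days, tolerance = 1e-6))
```

If the check holds, the factor 365.25 in the second command only converts a slope per day into a slope per year, which suggests the two curves differ in nothing but the unit of the time axis.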