[R] Survival Analysis questions

2008-07-01 Thread Viet Le
Dear all,I'm a newbie to Survival Analysis and I have some questions about the 
functions in the survival package.First, what's the difference between survreg 
and coxph? Say fit1<- survreg(survObj ~ var1 + var2 + var3, dist= 'whatever') 
and fit2<- coxph(survObj ~ var1 + var2 + var3), what's the difference between 
fit1 and fit2? They all do regression on explanatory variables. What does it 
mean dist in survreg?Second, what's the difference in these two specifications 
of the proportional hazard model?coxph(survObj ~ var1 + 
strata(var2))coxph(survObj ~ var1*strata(var2))Thanx,Pierre
_
[[elided Hotmail spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival analysis and censoring

2008-03-12 Thread Terry Therneau

  In your particular case I don't think that censoring is an issue, at least 
not 
for the reason that you discuss.  The basic censoring assumption in the Cox 
model is that subjects who are censored have the same future risk as those who 
were a. not censored and b. have the same covariates.  
   The real problem with informative censoring are the covaraites that are not 
in the model; ones that I likely don't even know exist.  Assume for instance 
that some unknown exposure X, Perth sunlight say, makes people much more likely 
to get both of the outcomes.  Assume further that it matters, i.e., the study 
includes a reasonable number of people with and without this exposure.  Then 
someone who has an early heart attack actually has a higher risk of colorectal 
cancer than a colleague of the same age/sex/followup who did not have a heart 
attack, the reason being that the HA guy is more likely to be from Perth.
   
   Your simulation went wrong by not actually accounting for time.  You created 
an outcome table for CC & HD and added a random time vector to it.  If someone 
would have had CC at 2 years and now has HD at 1 year, you can't just change 
the 
status to make them censored at 2.  The gambling analogy would be kicking 
someone out of the casino just before they win -- it does odd things to the 
odds.
   Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival analysis and censoring

2008-03-12 Thread Geoff Russell
Dear Prof. Therneau,

Many thanks for this,

On 3/13/08, Terry Therneau <[EMAIL PROTECTED]> wrote:
>
>   In your particular case I don't think that censoring is an issue, at least 
> not
>  for the reason that you discuss.  The basic censoring assumption in the Cox
>  model is that subjects who are censored have the same future risk as those 
> who
>  were a. not censored and b. have the same covariates.
>The real problem with informative censoring are the covaraites that are not
>  in the model; ones that I likely don't even know exist.  Assume for instance
>  that some unknown exposure X, Perth sunlight say, makes people much more 
> likely
>  to get both of the outcomes.  Assume further that it matters, i.e., the study
>  includes a reasonable number of people with and without this exposure.  Then
>  someone who has an early heart attack actually has a higher risk of 
> colorectal
>  cancer than a colleague of the same age/sex/followup who did not have a heart
>  attack, the reason being that the HA guy is more likely to be from Perth.
>
>Your simulation went wrong by not actually accounting for time.  You 
> created
>  an outcome table for CC & HD and added a random time vector to it.  If 
> someone
>  would have had CC at 2 years and now has HD at 1 year, you can't just change 
> the
>  status to make them censored at 2.  The gambling analogy would be kicking
>  someone out of the casino just before they win -- it does odd things to the
>  odds.

I'm still astonished that this is the explanation, but I've spent an
hour playing with
my little R code model and this is exactly the problem. Score 1 for solid
maths and 0 for my intuition.

Many Thanks,
Geoff



>
>Terry Therneau
>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival Analysis with two different events

2008-06-29 Thread sickboyedd

Hello all,

I am hoping to use survival analysis to examine whether parasite attack
increases nest death in a species of social wasp. I therefore have data for

1. Whether the nest "died" in the 6 week census period ("Status", where
1=died, 0=survived)
2. The day number of death/last recorded day it was observed alive.
3. Whether the nest was attacked by the parasite (0/1 as with 1.)
4. The day number of attack/ last recorded day the nest was observed without
a parasite.

i.e. example dataset:

status   death   para   paraday
0   42   0   42
1   32   0   42
1   25   1   13
0   42   1   25 ...

I've looked over r-help, as well as in Crawley etc., but I have yet to find
a solution. Can anyone point me in the right direction or literature?

Many thanks,

Edd Almond
-- 
View this message in context: 
http://www.nabble.com/Survival-Analysis-with-two-different-events-tp18183526p18183526.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival Analysis with two different events

2008-06-29 Thread Ben Bolker
sickboyedd  gmail.com> writes:

> 
> 
> Hello all,
> 
> I am hoping to use survival analysis to examine whether parasite attack
> increases nest death in a species of social wasp. I therefore have data for
> 
> 1. Whether the nest "died" in the 6 week census period ("Status", where
> 1=died, 0=survived)
> 2. The day number of death/last recorded day it was observed alive.
> 3. Whether the nest was attacked by the parasite (0/1 as with 1.)
> 4. The day number of attack/ last recorded day the nest was observed without
> a parasite.
> 
> i.e. example dataset:
> 
> status   death   para   paraday
> 0   42   0   42
> 1   32   0   42
> 1   25   1   13
> 0   42   1   25 ...
> 
> I've looked over r-help, as well as in Crawley etc., but I have yet to find
> a solution. Can anyone point me in the right direction or literature?
> 

   You might want to send this to r-sig-ecology if you need further
discussion.  In the meantime, the very simplest
thing (conditioning on whether the nest was
attacked or not) would be

library(survival)
c1 = coxph(Surv(death,status)~para,data=mydata)

(you should definitely read up a bit on survival analysis,
Cox proportional hazards, etc..  I think there's a chapter
in the book by Scheiner and Gurevitch, geared towards
ecologists).

Dealing with parasite attack in a more fine-grained
way (i.e. assessing mortality before and after parasitism)
would be a little trickier, but I wouldn't worry about
it until after you've understood the first stage of
the analysis.

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival Analysis with two different events

2008-06-30 Thread Terry Therneau
sickboyedd  gmail.com> writes:

> 
> 
> Hello all,
> 
> I am hoping to use survival analysis to examine whether parasite attack
> increases nest death in a species of social wasp. I therefore have data for
> 
> 1. Whether the nest "died" in the 6 week census period ("Status", where
> 1=died, 0=survived)
> 2. The day number of death/last recorded day it was observed alive.
> 3. Whether the nest was attacked by the parasite (0/1 as with 1.)
> 4. The day number of attack/ last recorded day the nest was observed without
> a parasite.
> 
> i.e. example dataset:
> 
> status   death   para   paraday
> 0   42   0   42
> 1   32   0   42
> 1   25   1   13
> 0   42   1   25 ...
> 
> I've looked over r-help, as well as in Crawley etc., but I have yet to find
> a solution. Can anyone point me in the right direction or literature?
> 

 The classic solution in biomedical work is a time-dependent covariate.  Create
 a new data set like this:
  time1   time2  status  parasite
   042   0 0
   032   1 0
   013   0 0
   13   25   1 1
   
   ...
   
The key is lines 3 and 4, which show the colony parasite free from day 0 to 
13, and with parasite from day 13 to 25.  Then one uses a Cox model with
fit <- coxph(Surv(time1, time2, status) ~ parasite)
summary(fit)

It estimates the increase in death rate with parasite versus no parasite.  
These 
models were originally developed for treatment regimens that change over time.
A given colony (subject) can have as many lines of data as you want, subject to 
the fact that the time intervals can't overlap (which would correspond to two 
copies of the same person alive at the same time).  The status variable for a 
multi-line dataset =1 if THIS interval ends with an event.  Look at the 
survival 
analysis chapter of Venables & Ripley, Modern Applied Statistics with S, for 
further insight. (or many other books)

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival analysis with no events in one treatment group

2007-12-30 Thread John Field
I'm trying to fit a Cox proportional hazards model to some hospital 
admission data.  About 25% of the patients have had at least one 
admission, and of these, 40% have had two admissions within the 12 
month period of the study.  Each patients has had one of 4 
treatments, and one of the treatment groups has had no admissions for 
the period.  I used:

surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
model<-coxph(surv.obj~Treatment+cluster(Subject))

and, as explained in the coxph help page, I get a warning message 
about convergence because the MLE of one of the coefficients is 
infinite since there are no admissions in one group.

I'm looking for suggestions about how to proceed with an analysis of 
these data.  I'd prefer not to ignore the fact that there are 
multiple admissions, but any alternative ideas I have at the moment do this.

Many thanks,
John Field
=
Faculty of Health Sciences Statistical Support Service
The University of Adelaide, Australia 5005

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis with no events in one treatment group

2007-12-30 Thread Daniel Malter
Hi John,

I am on the slow side - can you provide sample code. How can one treatment
group have no admissions? 

Let's say there are treatments W, X, Y, Z. Do you mean that NONE of the
patients who got admitted the first time and, say, received treatment X
during the first admission, have ever had a second admission (in your data).
And for the other treatments W, Y, and Z some of those who got admitted the
first time came in a second time?

Cheers and a happy new year's eve,
Daniel

-
cuncta stricte discussurus
-

-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im
Auftrag von John Field
Gesendet: Sunday, December 30, 2007 9:16 PM
An: R-help@r-project.org
Betreff: [R] Survival analysis with no events in one treatment group

I'm trying to fit a Cox proportional hazards model to some hospital
admission data.  About 25% of the patients have had at least one admission,
and of these, 40% have had two admissions within the 12 month period of the
study.  Each patients has had one of 4 treatments, and one of the treatment
groups has had no admissions for the period.  I used:

surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
model<-coxph(surv.obj~Treatment+cluster(Subject))

and, as explained in the coxph help page, I get a warning message about
convergence because the MLE of one of the coefficients is infinite since
there are no admissions in one group.

I'm looking for suggestions about how to proceed with an analysis of these
data.  I'd prefer not to ignore the fact that there are multiple admissions,
but any alternative ideas I have at the moment do this.

Many thanks,
John Field
=
Faculty of Health Sciences Statistical Support Service The University of
Adelaide, Australia 5005

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis with no events in one treatment group

2007-12-30 Thread Daniel Malter
Hi again,

I meant some rows of sample data, rather than sample code. Sorry about that.

Daniel 


-
cuncta stricte discussurus
-

-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im
Auftrag von John Field
Gesendet: Sunday, December 30, 2007 9:16 PM
An: R-help@r-project.org
Betreff: [R] Survival analysis with no events in one treatment group

I'm trying to fit a Cox proportional hazards model to some hospital
admission data.  About 25% of the patients have had at least one admission,
and of these, 40% have had two admissions within the 12 month period of the
study.  Each patients has had one of 4 treatments, and one of the treatment
groups has had no admissions for the period.  I used:

surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
model<-coxph(surv.obj~Treatment+cluster(Subject))

and, as explained in the coxph help page, I get a warning message about
convergence because the MLE of one of the coefficients is infinite since
there are no admissions in one group.

I'm looking for suggestions about how to proceed with an analysis of these
data.  I'd prefer not to ignore the fact that there are multiple admissions,
but any alternative ideas I have at the moment do this.

Many thanks,
John Field
=
Faculty of Health Sciences Statistical Support Service The University of
Adelaide, Australia 5005

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis with no events in one treatment group

2007-12-30 Thread John Field
Hi Daniel,

Sorry, it may have been clearer if I had used 
"subjects" instead of "patients".  The treatments 
were administered to all subjects, and then in 
the succeeding 12 months, some were hospitalised 
and some were not.  Hence only about 25% of the subjects were hospitalised.

The start of the data:
subjtime1   time2   event Treat
1   0   6.2 1   A
1   6.2 12  0   A
2   0   12  0   A

so subject 1 was hospitalised at 6.2 months, subject 2 not at all.

Hope this makes it clearer.
Regards,
John





.At 12:59 PM 31/12/2007, Daniel Malter wrote:
>Hi John,
>
>I am on the slow side - can you provide sample code. How can one treatment
>group have no admissions?
>
>Let's say there are treatments W, X, Y, Z. Do you mean that NONE of the
>patients who got admitted the first time and, say, received treatment X
>during the first admission, have ever had a second admission (in your data).
>And for the other treatments W, Y, and Z some of those who got admitted the
>first time came in a second time?
>
>Cheers and a happy new year's eve,
>Daniel
>
>-
>cuncta stricte discussurus
>-
>
>-Ursprüngliche Nachricht-
>Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im
>Auftrag von John Field
>Gesendet: Sunday, December 30, 2007 9:16 PM
>An: R-help@r-project.org
>Betreff: [R] Survival analysis with no events in one treatment group
>
>I'm trying to fit a Cox proportional hazards model to some hospital
>admission data.  About 25% of the patients have had at least one admission,
>and of these, 40% have had two admissions within the 12 month period of the
>study.  Each patients has had one of 4 treatments, and one of the treatment
>groups has had no admissions for the period.  I used:
>
>surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
>model<-coxph(surv.obj~Treatment+cluster(Subject))
>
>and, as explained in the coxph help page, I get a warning message about
>convergence because the MLE of one of the coefficients is infinite since
>there are no admissions in one group.
>
>I'm looking for suggestions about how to proceed with an analysis of these
>data.  I'd prefer not to ignore the fact that there are multiple admissions,
>but any alternative ideas I have at the moment do this.
>
>Many thanks,
>John Field
>=
>Faculty of Health Sciences Statistical Support Service The University of
>Adelaide, Australia 5005
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis with no events in one treatment group

2007-12-30 Thread Daniel Malter
Hi John,

to me it still seems that you have two "problems", given your first email.
The first one, if I am correct, is that you have NO admissions in one of the
groups. That is, you have Treat A, B, C, D and for one of these treatments
there is not a single admission. Then you cannot estimate a survival
function for this group because nobody died (got hospitalized). I think you
have to exclude this group because your survival rate is 100% and your
hazard rate 0%. But more proficient people may have better advice with that.

That subject 2 did not get hospitalized is not a problem though. The
observation is censored at 12 months. Since there are more patients in his
treatment group (subject 1) who had an event, the survival function can be
estimated. 

To the multiple admissions problem: To account for the fact that some have
more than one event you may create a "number of events" variable, say
numEvents, and include that in a strata() argument to your regression call:

surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
model<-coxph(surv.obj~Treatment+strata(numEvents))

Use strata if you think that your baseline hazard is different in the
different strata (in your case: if you think that the baseline hazard of
having an event differs with having had prior event(s)). At the same time
you assume that your treatment effect  - the beta on your Treat variable -
is the same across all strata. If you have reason to assume that the effect
of your treatment varies (interacts) with the number of prior events, then
it is not the correct approach. In addition you may include the
cluster(Subject) or frailty(Subject) commands.

Instead of strata you might also consider using a dummy variable, coding the
number of prior events (if the maximum number admissions per patient is
reasonably small and the number of cell frequencies reasonably large).

Finally, you may want to consult Terry Thernau and Patricia Grambsch's book
"Modeling survival data". It shows how to apply the techniques in R/S-Plus.
I find it invaluable.

Does that help you?

Daniel


-
cuncta stricte discussurus
-

-Ursprüngliche Nachricht-
Von: John Field [mailto:[EMAIL PROTECTED] 
Gesendet: Sunday, December 30, 2007 10:14 PM
An: Daniel Malter; R-help@r-project.org
Betreff: Re: AW: [R] Survival analysis with no events in one treatment group

Hi Daniel,

Sorry, it may have been clearer if I had used "subjects" instead of
"patients".  The treatments were administered to all subjects, and then in
the succeeding 12 months, some were hospitalised and some were not.  Hence
only about 25% of the subjects were hospitalised.

The start of the data:
subjtime1   time2   event Treat
1   0   6.2 1   A
1   6.2 12  0   A
2   0   12  0   A

so subject 1 was hospitalised at 6.2 months, subject 2 not at all.

Hope this makes it clearer.
Regards,
John





.At 12:59 PM 31/12/2007, Daniel Malter wrote:
>Hi John,
>
>I am on the slow side - can you provide sample code. How can one 
>treatment group have no admissions?
>
>Let's say there are treatments W, X, Y, Z. Do you mean that NONE of the 
>patients who got admitted the first time and, say, received treatment X 
>during the first admission, have ever had a second admission (in your
data).
>And for the other treatments W, Y, and Z some of those who got admitted 
>the first time came in a second time?
>
>Cheers and a happy new year's eve,
>Daniel
>
>-
>cuncta stricte discussurus
>-
>
>-Ursprüngliche Nachricht-
>Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
>Im Auftrag von John Field
>Gesendet: Sunday, December 30, 2007 9:16 PM
>An: R-help@r-project.org
>Betreff: [R] Survival analysis with no events in one treatment group
>
>I'm trying to fit a Cox proportional hazards model to some hospital 
>admission data.  About 25% of the patients have had at least one 
>admission, and of these, 40% have had two admissions within the 12 
>month period of the study.  Each patients has had one of 4 treatments, 
>and one of the treatment groups has had no admissions for the period.  I
used:
>
>surv.obj<-Surv(time=time1,time2=time2,event=event,type="counting")
>model<-coxph(surv.obj~Treatment+cluster(Subject))
>
>and, as explained in the coxph help page, I get a warning message about 
>convergence because the MLE of one of the coefficients is infinite 
>since there are no admissions in one group.
>
>I'm looking for suggestions about how to proceed with an analysis of 
>these data.  I'd prefer not to ignore the fact that there are multiple 
>admissions, but any alternative ideas I have at the moment do this.
>
>Many thanks

Re: [R] Survival analysis with no events in one treatment group

2007-12-31 Thread Terry Therneau
John,
   The key issue with a data set that has no events in one group is that the 
usual Wald tests, i.e., beta/se(beta), do not work.  They is based on a Taylor 
series argument of the usual type: f(x) = f(x0) + a polynomial in (x-x0), and 
for infinite beta "x" and "x0" are so far apart that the approximation isn't 
even close.  The score and likelihood ratio statistics are just fine, however. 
So if you completely ignore any printout that involves the "infinite beta" 
columns of the estimated variance matrix all should be well.
   
   In your case, a second issue is that the likelihood ratio test is not valid 
for data sets with multiple events per subject. You have to account for the 
within subject correlation in some way, either by moving to a random effects 
model or a gee type variance.  You have done the latter by adding "cluster" to 
your coxph call.  Thus, only the "robust score test" line of your final output 
is reliable.  The LR test is tainted by multiple events, and the robust Wald by 
the infinite beta.  Use summary() to print all the test statistics.
   
  Assume you have variables x1 and x2 in a model, and want to test the 
importance of adding x3.  Stepwise regression programs often use score tests 
for 
this, but most users have never done it and don't know how. 
> fit1 <- coxph(Surv(time1, time2, event) ~ x1 + x2 + cluster(subject),
   subset= (!is.na(x3)))
> fit2 <- coxph(Surv(time1, time2, event) ~ x1 + x2 + x3 + 
  cluster(subject), init=c(coef(fit1), 0))
  
Now the "robust score test" in fit2 is a test of (coef1, coef2, 0) [no need for 
x3] vs the model with x3.  If you have no missing values you can skip the 
subset 
argument, but make sure that the two fits are actually on the same data set: 
always check that the "n=" part of the printout matches.

Terry Therneau
Mayo Clinic

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] comparing SAS and R survival analysis with time-dependent covariates

2008-12-04 Thread Svetlana Eden

Dear R-help,

I was comparing SAS (I do not know what version it is) and R (version 
2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent 
covariates. The results differed significantly so I tried to understand 
on a short example where I went wrong. The following example shows that 
even when argument 'method' in R function coxph and argument 'ties' in 
SAS procedure phreg are the same, the results of Cox regr.  are 
different. This seems to happen when there are ties in the 
events/covariates times.

My question is what software, R or SAS, is more reliable for the 
survival analysis with time-dependent covariates or if you could point 
out a problem in the following example.

Example.   SAS gives HR=3.236:

data trythis;
input id days timedeli stat;
datalines;
  13.5  1
  2   1.51  1
  36   1000 0
  48   1000 1
  58 1  0
  621  1000 1
  7113  1
run;
proc phreg data=trythis;
  model days*stat(0)=deli/risklimits ties=exact;
  if  timedeli>days then deli=0; else deli=1;
run;

Example (continued).  R gives HR=3.91:

tmp = data.frame(id=c(1, 1, 2, 2, 3, 4, 5, 5, 6, 7, 7), start=c(0.0, 
0.5, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 3.0), end=c(0.5,  3.0,  
1.0,  1.5,  6.0,  8.0,  1.0,  8.0, 21.0,  3.0, 11.0), delir=c(0, 1, 0, 
1, 0, 0, 0, 1, 0, 0, 1), outcome=c(0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1))
tmp
surv = Surv(time=tmp$start, time2=tmp$end, tmp$outcome)
cphres = coxph(surv ~ tmp$delir, method="exact")
summary(cphres)[["coef"]]

After breaking a tie b/w an event and a time-dependent observation, R 
gives the same result as SAS.

tmp$end[2]=tmp$end[2] + .1
tmp
surv = Surv(time=tmp$start, time2=tmp$end, tmp$outcome)
cphres = coxph(surv ~ tmp$delir, method="exact")
summary(cphres)[["coef"]]

Thank you so much for time,

Svetlana






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survival analysis: time-variant covariates, right censored, re-current events

2009-08-21 Thread clue_less

I have a data set for survival analysis in R. The data have time-variant
covariates, and right-censored outcomes, which could also be re-current
events.

What R package should I use?

Thanks!
-- 
View this message in context: 
http://www.nabble.com/survival-analysis%3A-time-variant-covariates%2C-right-censored%2C-re-current-events-tp25083950p25083950.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] comparing SAS and R survival analysis with time-dependent covariates

2008-12-05 Thread Terry Therneau
  This query of "why do SAS and S give different answers for Cox models" comes 
up every so often.  The two most common reasons are that
a. they are using different options for the ties
b. the SAS and S data sets are slightly different.
You have both errors.

First, make sure I have the same data set by reading a common file, and then
compare the results.

tmt54% more sdata.txt
 1   0.0  0.5 0   0
 1   0.5  3.0 1   1
 2   0.0  1.0 0   0
 2   1.0  1.5 1   1
 3   0.0  6.0 0   0
 4   0.0  8.0 0   1
 5   0.0  1.0 0   0
 5   1.0  8.0 1   0
 6   0.0 21.0 0   1
 7   0.0  3.0 0   0
 7   3.0 11.0 1   1

tmt55% more test.sas
options linesize=80;

data trythis;
infile 'sdata.txt';
input id start end delir outcome;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=efron;


tmt56% more test.r
trythis <- read.table('sdata.txt',
  col.names=c("id", "start", "end", "delir", "outcome"))

coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

-
 I now get comparable answers.  Note that Cox's "exact partial likelihood" is 
the correct form to use for discrete time data.  I labeled this as the 'exact' 
method and SAS as the 'discrete' method.  The "exact marginal likelihood" of 
Prentice et al, which SAS calls the 'exact' method is not implemented in S.
 
  As to which package is more reliable, I can only point to a set of formal 
test 
cases that are found in Appendix E of the book by Therneau and Grambsch.  These 
are small data sets where the coefficients, log-likelihood, residuals, etc have 
all been worked out exactly in closed form.  R gets all of these test cases 
right, SAS gets almost all.
 
Terry Therneau

-
Svetlan Eden wrote
Dear R-help,

I was comparing SAS (I do not know what version it is) and R (version 
2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent 
covariates. The results differed significantly so I tried to understand 
on a short example where I went wrong. The following example shows that 
even when argument 'method' in R function coxph and argument 'ties' in 
SAS procedure phreg are the same, the results of Cox regr.  are 
different. This seems to happen when there are ties in the 
events/covariates times.

My question is what software, R or SAS, is more reliable for the 
survival analysis with time-dependent covariates or if you could point 
out a problem in the following example.

 ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] comparing SAS and R survival analysis with time-dependent covariates

2008-12-05 Thread Svetlana Eden


Thank you so much, this was very helpful.

Svetlana

Terry Therneau wrote:
  This query of "why do SAS and S give different answers for Cox models" comes 
up every so often.  The two most common reasons are that

a. they are using different options for the ties
b. the SAS and S data sets are slightly different.
You have both errors.

First, make sure I have the same data set by reading a common file, and then
compare the results.

tmt54% more sdata.txt
 1   0.0  0.5 0   0
 1   0.5  3.0 1   1
 2   0.0  1.0 0   0
 2   1.0  1.5 1   1
 3   0.0  6.0 0   0
 4   0.0  8.0 0   1
 5   0.0  1.0 0   0
 5   1.0  8.0 1   0
 6   0.0 21.0 0   1
 7   0.0  3.0 0   0
 7   3.0 11.0 1   1

tmt55% more test.sas
options linesize=80;

data trythis;
infile 'sdata.txt';
input id start end delir outcome;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;

proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=efron;


tmt56% more test.r
trythis <- read.table('sdata.txt',
  col.names=c("id", "start", "end", "delir", "outcome"))

coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

-
 I now get comparable answers.  Note that Cox's "exact partial likelihood" is 
the correct form to use for discrete time data.  I labeled this as the 'exact' 
method and SAS as the 'discrete' method.  The "exact marginal likelihood" of 
Prentice et al, which SAS calls the 'exact' method is not implemented in S.
 
  As to which package is more reliable, I can only point to a set of formal test 
cases that are found in Appendix E of the book by Therneau and Grambsch.  These 
are small data sets where the coefficients, log-likelihood, residuals, etc have 
all been worked out exactly in closed form.  R gets all of these test cases 
right, SAS gets almost all.
 
 	Terry Therneau


-
Svetlan Eden wrote
Dear R-help,

I was comparing SAS (I do not know what version it is) and R (version 
2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent 
covariates. The results differed significantly so I tried to understand 
on a short example where I went wrong. The following example shows that 
even when argument 'method' in R function coxph and argument 'ties' in 
SAS procedure phreg are the same, the results of Cox regr.  are 
different. This seems to happen when there are ties in the 
events/covariates times.


My question is what software, R or SAS, is more reliable for the 
survival analysis with time-dependent covariates or if you could point 
out a problem in the following example.


 ...
 






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival analysis: time-variant covariates, right censored, re-current events

2009-08-21 Thread David Winsemius


On Aug 21, 2009, at 1:16 PM, clue_less wrote:



I have a data set for survival analysis in R. The data have time- 
variant
covariates, and right-censored outcomes, which could also be re- 
current

events.

What R package should I use?


Have you considered recasting the data into interval-censored form and  
using either Harrell's package:Design or Therneau's package:survival?


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Bernhard Reinhardt

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Arthur Allignol

Hi,

See ?survfit.object

if fit is the object you get using survfit,
fit$surv will give you the survival probability.

Best,
arthur

Bernhard Reinhardt wrote:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread David Winsemius
I seriously doubt that a survfit object could only contain that  
information. I suspect that you are erroneously thinking that what  
print.survfit offers is the entire story.


What does str(survfit(, data=) ) show you?

> data(aml)
> aml.mdl <- survfit(Surv(time, status) ~ x, data=aml)
# this is with survfit.Design since I load Hmisc/Design by default now.
> str(aml.mdl)
List of 18
 $ n: int 23
 $ time : num [1:20] 9 13 18 23 28 31 34 45 48 161 ...
 $ n.risk   : num [1:20] 11 10 8 7 6 5 4 3 2 1 ...
 $ n.event  : num [1:20] 1 1 1 1 0 1 1 0 1 0 ...
 $ surv : num [1:20] 0.909 0.818 0.716 0.614 0.614 ...
 $ type : chr "right"
 $ ntimes.strata: Named int [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "1" "2"
 $ strata   : Named num [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ strata.all   : Named int [1:2] 11 12
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ std.err  : num [1:20] 0.0953 0.1421 0.1951 0.2487 0.2487 ...
 $ upper: num [1:20] 0.987 0.951 0.899 0.835 0.835 ...
 $ lower: num [1:20] 0.508 0.447 0.35 0.266 0.266 ...
 $ conf.type: chr "log-log"
 $ conf.int : num 0.95
 $ maxtime  : num 161
 $ units: chr "Day"
 $ time.label   : chr "time"
 $ call : language survfit(formula = Surv(time, status) ~ x,  
data = aml)

 - attr(*, "class")= chr "survfit"
>

I also don't think survfit returns a Cox model.


On Feb 17, 2009, at 10:37 AM, Bernhard Reinhardt wrote:


Hi!

I came across R just a few days ago since I was looking for a  
toolbox for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from  
John Fox.


As described therein plotting survival-functions works well  
(plot(survfit(model))). But I´d like to do some manipulation with  
the survival-functions before plotting them e.g. dividing one  
survival-function by another.


survfit() only returns an object of the following structure

n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function  
out of these values and how I extract the values from the object? I 
´m sure the calculation is done by the corresponding plot-routine,  
but I couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Eik Vettorazzi

Hi Bernhard,
I'm wondering what you will expect to get in "dividing" two proportional 
survival curves from a fitted cox model.
Anyway, you can provide a newdata object to the survfit function 
containing any combination of cofactors you are interested in and then 
use summary, eg:


fit <- coxph( Surv(futime,fustat)~resid.ds+rx+ecog.ps,data=ovarian)
summary(survfit( fit,newdata=data.frame(rx=1, ecog.ps=2, resid.ds=0)))

hth.

Bernhard Reinhardt schrieb:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John 
Fox.


As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function 
out of these values and how I extract the values from the object? I´m 
sure the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.