Re: [R] Schoenfeld residuals for a null model coxph

2012-02-24 Thread Terry Therneau
 begin included message ---
I have a coxph model like

coxph(Surv(start, stop, censor) ~ x + y, mydata)

I would like to calculate the Schoenfeld residuals for the null, i.e the
same model where the beta hat vector (in practical terms, the coeff
vector spat out by summary()) is constrained to be all 0s --all lese
stays the same. 

- end inclusion ---

You can get the fit for any coefficient you want by setting the initial
values and asking for 0 iterations.

  fit0 <- coxph(Surv(start, stop, censor) ~ x + y, mydata, 
init=c(0,0), iter=0)
  resid(fit0, type='schoenfeld')

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Schoenfeld residuals for a null model coxph

2012-02-23 Thread Federico Calboli
Hi,

I have a coxph model like

coxph(Surv(start, stop, censor) ~ x + y, mydata)

I would like to calculate the Schoenfeld residuals for the null, i.e the same 
model where the beta hat vector (in practical terms, the coeff vector spat out 
by summary()) is constrained to be all 0s --all lese stays the same. 

I could calculate it by hand, but I was wondering if there is a way of doing it 
with resid() -- resid(my.coxph.mod, type = 'schoenfeld') works wery well for 
the true Schoenfeld residuals, but I haven't figured out how to use it 
constraining the beta hat vector to be all 0s.

BW

F

--
Federico C. F. Calboli
Neuroepidemiology and Ageing Research
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld Residuals with tied data

2009-06-16 Thread Terry Therneau
-- begin included message 
Thank you for the reply. I really appreciate it.

I calculated the Scoenfeld residual per event and my results are the
following:

finage  race
17  -0.33942334 -2.0722187270.29024804
20  0.394600944 5.303968774 0.517689472
25  0.488184603 -1.438140647-0.359535162
25  -0.511815397-1.438140647-0.359535162

However, the results from R are shown as below:

fin  agerace
17  -0.3394233  -2.072219   0.290248
20  0.3946009   5.303969 0.5176895
25  0.4724112   -1.615875   -0.4039688
25  -0.5275888  -1.615875   -0.4039688

  end inclusion ---

Bessy,
  By default the coxph function uses the Efron approximation for ties (more 
accurate than the Breslow approx).  You are computing the Schoenfeld residuals 
using the betas from coxph (Efron), and then applying a Breslow formula.  I can 
reproduce your results by forcing the same computation in R:
  
  > fit1 <- coxph(Surv(week, arrest) ~ fin + age + race, bessdata)
  > fit2 <- coxph(Surv(week, arrest) ~ fin + age + race, bessdata,
 init=coef(fit1), iter=0, method='breslow')
  > resid(fit2, 'schoen')
 fin   age   race
 17 -0.3394233 -2.072219  0.2902480
 20  0.3946009  5.303969  0.5176895
 25  0.4881846 -1.438141 -0.3595352
 25 -0.5118154 -1.438141 -0.3595352

 The Efron approximation is simple for computation of beta, but a bit more 
subtle when doing the residuals from the fit (but still not hard).  You can 
find 
a detailed worked example in E.1.2 of the book by Therneau and Grambsch.
 
  If you had run the phreg procedure in SAS you would not have had this 
problem: 
it contains exactly the same incorrect computation.  (Appendix E of the book 
contains a set of very small data sets where the correct answers can be worked 
out in closed form.  SAS gets all of the Breslow approx computations correct 
and 
about 1/2 of the Efron ones.  R passes all the tests.)
  
Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld Residuals with tied data

2009-06-16 Thread Bessy

Thank you for the reply. I really appreciate it.

I calculated the Scoenfeld residual per event and my results are the
following:

finage  race
17  -0.33942334 -2.0722187270.29024804
20  0.394600944 5.303968774 0.517689472
25  0.488184603 -1.438140647-0.359535162
25  -0.511815397-1.438140647-0.359535162

However, the results from R are shown as below:

fin  agerace
17  -0.3394233  -2.072219   0.290248
20  0.3946009   5.303969 0.5176895
25  0.4724112   -1.615875   -0.4039688
25  -0.5275888  -1.615875   -0.4039688

These two results are both calculated based on the same dataset, which is
shown as below (the formula was attached with the original message):

weekarrest  fin age race
17  1   0   18  1
20  1   1   27  1
25  1   1   19  0
25  1   0   19  0
52  0   1   23  1
52  0   0   19  0


At time 17 and 20, there is no ties, my results are almost the same as R's.
However, the dataset contains two failures at time 25, I had got different
results from R's. 

If I summed the Schoenfeld residuls at a given event time, for example, 25
in my dataset, I cannot get the common results.

Does R modified the formula when meeting ties?

Thank you so much.

Bessy



Terry Therneau wrote:
> 
> 
> Formally, the Schoenfeld residuals are defined as one residual per event
> time.
> 
> I found that when there are tied events, however, that plots of the
> residuals 
> could be hard to visually interpret: sometimes a residual was large
> because of a 
> lack of fit at that point, sometimes because several death happened
> concurrently 
> at that point.
> 
> Since the Shoenfeld residual is a sum over the events at each time point,
> coxph 
> returns one residual per EVENT, rather than one per event time.  The plots
> work 
> much better (my subjective opinion).  If you sum the returned residuals at
> a 
> given event time, you will get the common result.
> 
>   Terry Therneau
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Schoenfeld-Residuals-with-tied-data-tp2403p24050423.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld Residuals with tied data

2009-06-15 Thread Terry Therneau

Formally, the Schoenfeld residuals are defined as one residual per event time.

I found that when there are tied events, however, that plots of the residuals 
could be hard to visually interpret: sometimes a residual was large because of 
a 
lack of fit at that point, sometimes because several death happened 
concurrently 
at that point.

Since the Shoenfeld residual is a sum over the events at each time point, coxph 
returns one residual per EVENT, rather than one per event time.  The plots work 
much better (my subjective opinion).  If you sum the returned residuals at a 
given event time, you will get the common result.

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Schoenfeld Residuals with tied data

2009-06-15 Thread Bessy

Dear all,

I am struggling with calculation of Schoenfeld residuals of my Cox Ph
models.

Based on the formula as attached, I calculated the Schoenfeld residuals for
both non tied and tied data, respectively.

And then I validated my results with R using the same data sets. However, I
found that my results for non-tied data was ok but the results for tied data
were different from R's. 

How does R calculate the Schoenfeld Residuals for tied data? Could someone
give me some hints, please?

Thank you so much.

Bessy http://www.nabble.com/file/p2403/Schoenfeld%2BResiduals.doc
Schoenfeld+Residuals.doc 
-- 
View this message in context: 
http://www.nabble.com/Schoenfeld-Residuals-with-tied-data-tp2403p2403.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld residuals

2009-04-14 Thread Laura Bonnett
Hi,

Thank you for your comments and apologies for the delay in replying.
rem.Rcens =1 for the censored variables.  The problem arises because I
am not strictly looking at time to death.  Instead I am looking at
time to 12-month remission in epilepsy.  Therefore a lot of people
have the same event i.e. they successfully achieve 12-month remission
from day 1 of the treatment.

I think I shall avoid the problem by 'excluding' patients with
immediate 12-month remission i.e. I will look at patients with
immediate success in a separate analysis to patients with delayed
success.

Thanks for your help,

Laura


2009/4/6 Terry Therneau :
> Laura Bonnett was kind enough to send me a copy of the data that caused the
> plotting error, since it was an error I had not seen before.
>
> 1. The latest version of survival gives a nicer error message:
>
>> fit <- coxph(Surv(rem.Remtime, rem.Rcens) ~ all.sex, nearma)
>> cfit <- cox.zph(fit)
>> plot(cfit)
> Error in plot.cox.zph(cfit) :
>   Spline fit is singular, try a smaller degrees of freedom
>
> 2. What's the problem?
>  There are 1085 events in the data set (rem.Rcens==1), and of these 502 are
> tied events on exactly day 365.  The plot.cox.zph function tries to fit a
> smoothing spline to the data to help the eye; the fit gives weight 1 to each
> death and having this high a proportion of ties creates problems for the
> underlying regression.
>
> 3.
>> plot(cfit, df=2)
>  Warning messages:
> 1: In approx(xx, xtime, seq(min(xx), max(xx), length = 17)[2 * (1:8)]) :
>  collapsing to unique 'x' values
> 2: In approx(xtime, xx, temp) : collapsing to unique 'x' values
>
>  These warning messages are ignorable.  I'll work on making them go away.
>
>
> 4. A shot in the dark -- is perchance the variable rem.Rcens=1 a marker of a
> censored observation, and the events are 0?  (A whole lot of events at 1 year 
> is
> suspicious, but half censored at one year is believable.) Then the proper 
> coxph
> code is
>
>> fit2 <-  coxph(Surv(rem.Remtime, rem.Rcens==0) ~ all.sex, nearma)
>
>        Terry Therneau
>
>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld residuals

2009-04-06 Thread Terry Therneau
Laura Bonnett was kind enough to send me a copy of the data that caused the 
plotting error, since it was an error I had not seen before.

1. The latest version of survival gives a nicer error message:
 
> fit <- coxph(Surv(rem.Remtime, rem.Rcens) ~ all.sex, nearma)
> cfit <- cox.zph(fit)
> plot(cfit)
Error in plot.cox.zph(cfit) : 
   Spline fit is singular, try a smaller degrees of freedom

2. What's the problem?
  There are 1085 events in the data set (rem.Rcens==1), and of these 502 are 
tied events on exactly day 365.  The plot.cox.zph function tries to fit a 
smoothing spline to the data to help the eye; the fit gives weight 1 to each 
death and having this high a proportion of ties creates problems for the 
underlying regression.
  
3. 
> plot(cfit, df=2)
  Warning messages:
1: In approx(xx, xtime, seq(min(xx), max(xx), length = 17)[2 * (1:8)]) :
  collapsing to unique 'x' values
2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

  These warning messages are ignorable.  I'll work on making them go away.


4. A shot in the dark -- is perchance the variable rem.Rcens=1 a marker of a 
censored observation, and the events are 0?  (A whole lot of events at 1 year 
is 
suspicious, but half censored at one year is believable.) Then the proper coxph 
code is

> fit2 <-  coxph(Surv(rem.Remtime, rem.Rcens==0) ~ all.sex, nearma)

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld Residuals

2009-04-03 Thread Laura Bonnett
Thank you for your comments.  I have about 200 out of 2000 tied data
points which makes the situation more complicated!  I'll have at look
at the book section you referred to.  With regards to making the ylim
finite, I'm not sure how I can go about that given that I don't
understand why it isn't already!

Thank you for your help,

Laura

2009/4/3 David Winsemius :
> I am not sure that ties are the only reason. If I create a few ties in the
> ovarian dataset that Therneau and Lumley provide, all I get are some
> warnings:
>> ovarian[4:5, 1] <- mean(ovarian[4:5, 1])
>> ovarian[6:8, 1] <- mean(ovarian[6:8, 1])
>> fit <- coxph( Surv(futime, fustat) ~ age + rx, ovarian)
>> temp<- cox.zph(fit)
>
>> plot(temp)
> Warning messages:
> 1: In approx(xx, xtime, seq(min(xx), max(xx), length.out = 17)[2 *  :
>  collapsing to unique 'x' values
> 2: In approx(xtime, xx, temp) : collapsing to unique 'x' values
>
> The error message you get is requesting a finite ylim. Have you considered
> acceding with that request?
>
> Alternative: Assuming the number of tied survival times is modest, have you
> tried jitter-ing the rem.Remtime variable a few times to see it the results
> are stable?
>
> If the number of ties is large, then you need to review Thernaeu & Gramsch
> section 3.3
>
> --
> David Winsemius
>
> On Apr 3, 2009, at 7:57 AM, Laura Bonnett wrote:
>
>> Dear All,
>>
>> Sorry to bother you again.
>>
>> I have a model:
>> coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)
>> and I'm trying to do a plot of Schoenfeld residuals using the code:
>> plot(cox.zph(coxfita))
>> abline(h=0,lty=3)
>>
>> The error message I get is:
>> Error in plot.window(...) : need finite 'ylim' values
>> In addition: Warning messages:
>> 1: In sqrt(x$var[i, i] * seval) : NaNs produced
>> 2: In min(x) : no non-missing arguments to min; returning Inf
>> 3: In max(x) : no non-missing arguments to max; returning -Inf
>>
>> My data (nearma) has a lot of rem.Remtime entries which are equal i.e
>> large amounts of tied data.  If I remove the entries where this is the
>> case from the dataset I get the results I want!
>>
>> Please can someone explain why removing paients with tied remission
>> time has such an effect on the code and also how to remedy the problem
>> without removing patients?
>>
>> Thank you very much,
>>
>> Laura.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schoenfeld Residuals

2009-04-03 Thread David Winsemius
I am not sure that ties are the only reason. If I create a few ties in  
the ovarian dataset that Therneau and Lumley provide, all I get are  
some warnings:

> ovarian[4:5, 1] <- mean(ovarian[4:5, 1])
> ovarian[6:8, 1] <- mean(ovarian[6:8, 1])
> fit <- coxph( Surv(futime, fustat) ~ age + rx, ovarian)
> temp<- cox.zph(fit)

> plot(temp)
Warning messages:
1: In approx(xx, xtime, seq(min(xx), max(xx), length.out = 17)[2 *  :
  collapsing to unique 'x' values
2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

The error message you get is requesting a finite ylim. Have you  
considered acceding with that request?


Alternative: Assuming the number of tied survival times is modest,  
have you tried jitter-ing the rem.Remtime variable a few times to see  
it the results are stable?


If the number of ties is large, then you need to review Thernaeu &  
Gramsch section 3.3


--
David Winsemius

On Apr 3, 2009, at 7:57 AM, Laura Bonnett wrote:


Dear All,

Sorry to bother you again.

I have a model:
coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)
and I'm trying to do a plot of Schoenfeld residuals using the code:
plot(cox.zph(coxfita))
abline(h=0,lty=3)

The error message I get is:
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In sqrt(x$var[i, i] * seval) : NaNs produced
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

My data (nearma) has a lot of rem.Remtime entries which are equal i.e
large amounts of tied data.  If I remove the entries where this is the
case from the dataset I get the results I want!

Please can someone explain why removing paients with tied remission
time has such an effect on the code and also how to remedy the problem
without removing patients?

Thank you very much,

Laura.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Schoenfeld Residuals

2009-04-03 Thread Laura Bonnett
Dear All,

Sorry to bother you again.

I have a model:
coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)
and I'm trying to do a plot of Schoenfeld residuals using the code:
plot(cox.zph(coxfita))
abline(h=0,lty=3)

The error message I get is:
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In sqrt(x$var[i, i] * seval) : NaNs produced
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

My data (nearma) has a lot of rem.Remtime entries which are equal i.e
large amounts of tied data.  If I remove the entries where this is the
case from the dataset I get the results I want!

Please can someone explain why removing paients with tied remission
time has such an effect on the code and also how to remedy the problem
without removing patients?

Thank you very much,

Laura.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.