Re: [R] Coxph not converging with continuous variable

2012-09-04 Thread Ravi Varadhan
Dear Terry,

I agree that this example is highly atypical.  Having said that, my experience 
with optimization algorithms is that scaling (a.k.a. standardizing) the 
continuous covariates is greatly helpful in terms of convergence.  Have you 
considered automatically standardizing the continuous covariates, and then 
converting the scaled coefficients back to the original scale?  Of course, the 
end user could do this just as easily!

Best,
Ravi

Ravi Varadhan, Ph.D.
Assistant Professor
The Center on Aging and Health
Division of Geriatric Medicine & Gerontology
Johns Hopkins University
rvarad...@jhmi.edu
410-502-2619


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coxph not converging with continuous variable

2012-09-03 Thread Terry Therneau



On 09/03/2012 05:00 AM, r-help-requ...@r-project.org wrote:

The coxph function in R is not working for me when I use a continuous predictor 
in the model. Specifically, it
> fails to converge, even when bumping up the number of max iterations 
or setting reasonable initial values.
> The estimated Hazard ratio from the model is incorrect (verified by 
an AFT model). I've isolated it to the "x1"
> variable in the example below, which is log-normally distributed. The 
x1 here has extreme values, but I've
> been able to reproduce the problem intermittently with less extreme 
values. It seemed odd that I couldn't find
> this question asked anywhere, so I'm wondering if I'm just not seeing 
a mistake I've made.

 
Alex Keil
UNC Chapel Hill


Congratulations-- it's been a long time since someone managaed to break
the iteration code in coxph.

I used your example, but simplifed to using n=1000 and a 1 variable 
model.  The quantiles of your x1 variable are

> round(quantile(xx, c(0, 5:10/10)),2)
   0%   50%   60%   70%   80%   90%  100%
 0.06  2.67  3.75  5.74  8.93 15.04 98.38

For a coefficient of 1 (close to the real solution) you have one subject 
with a risk of death that 999 times the average risk (he should die 
before his next heartbeat) and another with relative risk of 1.99e-40 
(an immortal).  The key components of a Cox model iteration are, it 
turns out, weighted means and variances.  For this data 99.99929 % of 
the weight falls on a single observation, i.e., at beta=1 you have an 
effective sample size of 1. The optimal coefficient is the one that best 
predicts that single subject's death time.


  Due to the computational round off error that results, the routine 
takes a step of size 1.7 from a starting estimate of 1.0 when it should 
take a stop of size of about .05, then falls into step halving to 
overcome the mistake.  Rinse and repeat.


I could possibly make coxph resistant to this data set, but at the cost 
of a major rewrite and significantly slower performance.  Can you 
convince me that this data set has any relevance?


Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coxph not converging with continuous variable

2012-09-02 Thread Keil, Alex
The coxph function in R is not working for me when I use a continuous predictor 
in the model. Specifically, it fails to converge, even when bumping up the 
number of max iterations or setting reasonable initial values. The estimated 
Hazard ratio from the model is incorrect (verified by an AFT model). I've 
isolated it to the "x1" variable in the example below, which is log-normally 
distributed. The x1 here has extreme values, but I've been able to reproduce 
the problem intermittently with less extreme values. It seemed odd that I 
couldn't find this question asked anywhere, so I'm wondering if I'm just not 
seeing a mistake I've made.

Any help is appreciated to get this model to converge. Code, output, 
sessionInfo follows signature.

Alex Keil
UNC Chapel Hill


Here's an example:

library(survival)
set.seed(8182828)
#data
N = 10
shape = 0.75
hr = 3.5
betax <- -log(hr)/(shape)
ctime = 5

z1 <- rbinom(N, 1, 0.5)
z2 <- rbinom(N, 1, 0.5)
x1 <- exp(rnorm(N, 0 + 2*z1, 1))
wscale1 <- exp(0 + betax*x1 + z1 + z2 )
time1 <- rweibull(N, shape, wscale1)
t1 <- pmin(time1, rep(ctime, N))
d1 <- 1*(t1 == time1)
dat <- as.data.frame(cbind(t1, time1, d1, z1, z2, x1))
 summary(dat)

#output
> summary(dat)
   t1   time1   d1   z1z2   
x1  
 Min.   :0.00   Min.   : 0.   Min.   :0.   Min.   :0.0   Min.   
:0.   Min.   :  0.0147  
 1st Qu.:0.04   1st Qu.: 0.   1st Qu.:1.   1st Qu.:0.0   1st 
Qu.:0.   1st Qu.:  0.9552  
 Median :0.008400   Median : 0.0084   Median :1.   Median :0.5   Median 
:0.   Median :  2.7052  
 Mean   :0.295397   Mean   : 0.3260   Mean   :0.9906   Mean   :0.5   Mean   
:0.4997   Mean   :  6.8948  
 3rd Qu.:0.178325   3rd Qu.: 0.1783   3rd Qu.:1.   3rd Qu.:1.0   3rd 
Qu.:1.   3rd Qu.:  7.7409  
 Max.   :5.00   Max.   :52.8990   Max.   :1.   Max.   :1.0   Max.   
:1.   Max.   :431.2779 

> exp((cox1 <- coxph(Surv(t1, d1)~  x1 + z1+ z2, ties="breslow"))[[1]]) #hrs
   x1z1z2 
3.3782387 0.4925040 0.4850214 
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Ran out of iterations and did not converge


#accelerated failure time model confirms estimate is off from coxph

> aft1 <- survreg(Surv(t1, d1)~ x1 + z1 + z2, dist="weibull"); coefs <- 
> aft1$coefficients; scale <- aft1$scale
>   data.frame(psi = -coefs[2], scale, hr=exp(-coefs[2]*(1/scale))) #hr, 
> scale is "weibull shape" in sas
psiscale   hr
x1 1.670406 1.91 3.499958
#end output

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods   base  
   

other attached packages:
[1] survival_2.36-14

loaded via a namespace (and not attached):
[1] timereg_1.6-5
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.