Hello, I have three questions about penalized Cox regression with the survival package (functions coxph() and ridge()). I am using R 2.8.0 on Ubuntu Linux with survival package version 2.35-4.
Question 1. Consider the following example from help(ridge):

    > fit1 <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta=1), ovarian)

As I understand it, this builds a model in which `rx' is the predictor, whereas the ridge penalty term contains the variables `age' and `ecog.ps'. Could someone explain what it means to regularize on parameters that are not part of the model? Based on the definition of Cox ridge regression (see for example [1]), or of any other regularized regression, the penalty term is a function of the coefficients of the predictor variables and nothing else.

Question 2. Consider a similar example:

    > library(survival)
    > lfit2 <- coxph(Surv(time, status) ~ age + ph.ecog + ridge(age, ph.ecog, theta=1), cancer)
    > print(lfit2)
    Call:
    coxph(formula = Surv(time, status) ~ age + ph.ecog + ridge(age,
        ph.ecog, theta = 1), data = cancer)

                   coef     se(coef) se2      Chisq DF p
    age            1.13e-02 0.111    9.32e-03 0.01  1  0.92
    ph.ecog        4.43e-01 1.398    1.16e-01 0.10  1  0.75
    ridge(age)     2.60e-21 0.110    4.85e-17 0.00  1  1.00
    ridge(ph.ecog) 5.14e-22 1.393             0.00  1  1.00

    Iterations: 1 outer, 3 Newton-Raphson
    Degrees of freedom for terms= 0 0 0
    Likelihood ratio test=19.1 on 0.01 df, p=3.54e-08
      n=227 (1 observation deleted due to missingness)
    Warning message:
    In sqrt((diag(x$var2))[kk]) : NaNs produced

What is the meaning of the ridge(age) and ridge(ph.ecog) coefficients? Again, based on the definition of Cox ridge regression, the method simply adds a penalty term to the standard Cox regression objective (the partial log-likelihood) and does not introduce any new predictors. How should I interpret the ridge(age) and ridge(ph.ecog) rows in the output?

Question 3. What is the origin and significance of the warning in the previous example?

    Warning message:
    In sqrt((diag(x$var2))[kk]) : NaNs produced

Thank you very much for your help,
Ljubomir

[1] Bovelstad et al., Predicting survival from microarray data - a comparative study (Bioinformatics, Vol. 23, no. 16, 2007, pp. 2080-2087).
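
P.S. To make concrete the definition I refer to in Questions 1 and 2, here is a sketch in my own notation (it is not taken from the survival documentation). I am assuming that the `theta' argument multiplies a quadratic penalty with a factor of 1/2; I have not verified the exact scaling that ridge() applies, and ties are ignored for simplicity.

```latex
% Standard Cox partial log-likelihood for the coefficient vector \beta.
% \delta_i is the event indicator for subject i, x_i its covariate vector,
% and R(t_i) the set of subjects still at risk at time t_i.
\ell(\beta) = \sum_{i=1}^{n} \delta_i \left[ x_i^\top \beta
  - \log \sum_{j \in R(t_i)} \exp\left( x_j^\top \beta \right) \right]

% Ridge-penalized version: the same p coefficients, no additional predictors.
% Assumed scaling: \theta / 2 times the squared L2 norm of \beta.
\ell_{\mathrm{pen}}(\beta) = \ell(\beta) - \frac{\theta}{2} \sum_{k=1}^{p} \beta_k^2
```

Under this definition, I would expect the fit in Question 2 to have exactly two coefficients (one for age and one for ph.ecog), which is why the four-row coefficient table puzzles me.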