Re: [R] Cox ridge regression
> Question 1. Consider the following example from help(ridge):
>
>   fit1 <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta=1), ovarian)
>
> As I understand, this builds a model in which `rx' is the predictor, whereas
> ridge penalty term contains variables `age' and `ecog.ps'. Could someone
> explain what it me...

The ridge term introduces age as a predictor AND penalizes it. The model above has 3 predictors, 2 of them penalized.

Later in the post you have a model with both age and ridge(age). This puts age in the model twice, once as a free parameter and once as a penalized one. Not surprisingly, the second ends up with a coefficient of 0 (within machine precision of zero). The warning message you got about NaN is likely related to this: there are redundant terms in the model.

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
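A minimal sketch of the point above, assuming the survival package is installed and its bundled `ovarian` data set is available. The object names `fit_ridge` and `fit_plain` are made up for illustration. The idea is simply to count the coefficients: if ridge() only attached a penalty to variables outside the model, the first fit would have a single coefficient, not three.

```r
library(survival)

# The help(ridge) example: rx enters unpenalized, while age and ecog.ps
# enter the model through ridge() and are penalized.
fit_ridge <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta = 1),
                   data = ovarian)

# The same three predictors with no penalty, for comparison.
fit_plain <- coxph(Surv(futime, fustat) ~ rx + age + ecog.ps, data = ovarian)

# Both fits carry three coefficients.
print(coef(fit_ridge))
print(coef(fit_plain))
```

Writing `age + ridge(age)` in one formula would, by the same logic, enter age twice, which is the redundancy described above.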
Re: [R] Cox ridge regression
Thank you Terry, that answered all questions. As a suggestion, the help page for ridge() might indicate that the ridge term simultaneously introduces predictors and penalizes them.

Ljubomir

From: Terry Therneau thern...@mayo.edu
To: Ljubomir Buturovic ljubo...@sfsu.edu
Cc: r-help@r-project.org
Subject: Re: Cox ridge regression
Date: Mon, 3 Aug 2009 09:20:42 -0500 (CDT)
[R] Cox ridge regression
Hello,

I have questions regarding penalized Cox regression using the survival package (functions coxph() and ridge()). I am using R 2.8.0 on Ubuntu Linux and survival package version 2.35-4.

Question 1. Consider the following example from help(ridge):

  fit1 <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta=1), ovarian)

As I understand, this builds a model in which `rx' is the predictor, whereas the ridge penalty term contains variables `age' and `ecog.ps'. Could someone explain what it means to regularize on parameters which are not part of the model? Based on the definition of Cox ridge regression (see for example [1]), or any other regularized regression, the penalty term is a function of the coefficients corresponding to the predictor variables, and nothing else.

Question 2. Consider a similar example:

  library(survival)
  lfit2 <- coxph(Surv(time, status) ~ age+ph.ecog + ridge(age, ph.ecog, theta=1), cancer)
  print(lfit2)

  Call:
  coxph(formula = Surv(time, status) ~ age + ph.ecog + ridge(age,
      ph.ecog, theta = 1), data = cancer)

                 coef     se(coef) se2      Chisq DF p
  age            1.13e-02 0.111    9.32e-03 0.01  1  0.92
  ph.ecog        4.43e-01 1.398    1.16e-01 0.10  1  0.75
  ridge(age)     2.60e-21 0.110    4.85e-17 0.00  1  1.00
  ridge(ph.ecog) 5.14e-22 1.393             0.00  1  1.00

  Iterations: 1 outer, 3 Newton-Raphson
  Degrees of freedom for terms= 0 0 0
  Likelihood ratio test=19.1 on 0.01 df, p=3.54e-08
  n=227 (1 observation deleted due to missingness)
  Warning message:
  In sqrt((diag(x$var2))[kk]) : NaNs produced

What is the meaning of the ridge(age) and ridge(ph.ecog) coefficients? Again, based on the definition of Cox ridge regression, it simply adds a penalty term to the standard Cox regression function, and doesn't introduce any new predictors. What to make of the ridge(age) and ridge(ph.ecog) rows in the output?

Question 3. What is the origin and significance of the warning in the previous example:

  Warning message:
  In sqrt((diag(x$var2))[kk]) : NaNs produced

Thank you very much for your help,

Ljubomir

[1] Bovelstad et al., Predicting survival from microarray data - a comparative study (Bioinformatics, Vol. 23, no. 16, 2007, pp. 2080-2087).
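For reference, the definition that Questions 1 and 2 appeal to can be written out as below. This is a sketch under stated assumptions, not the package's exact objective: it ignores tied event times, and the symbols are introduced here for illustration ($\delta_i$ is the event indicator, $R(t_i)$ the risk set at time $t_i$, and $P$ the set of coefficients for variables placed inside ridge()). The $\theta/2$ factor follows the description of `theta` in help(ridge). By default ridge() also scales the variables before penalizing, so the $\beta_j$ in the penalty would refer to the scaled covariates.

```latex
% Cox log partial likelihood (no ties)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[ x_i^\top \beta - \log \sum_{k \in R(t_i)} \exp\bigl(x_k^\top \beta\bigr) \Bigr]

% Ridge-penalized version: only coefficients in P are penalized
\ell_\theta(\beta) = \ell(\beta) - \frac{\theta}{2} \sum_{j \in P} \beta_j^2
```

Read this way, the penalty is indeed a function of model coefficients only. The formula syntax just uses ridge() both to place a variable in $x$ and to mark its coefficient as a member of $P$.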