and why we do not need to divide by the number of days in the year
anymore?
Here I'm guessing (since you don't offer enough evidence to
confirm) that the
difference is in the time scales used in your Aidsp$survtime
versus some other
example to which you are comparing .
Both models are run from the same data, so I am not expecting any
differences in
time scales.
To get similar results, I need to actually run the following
equations:
expected_lifetime_in_years = exp(fit)/365.25 ---
> Linear
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)
---> Linear
prediction of the Cox model
where fit come from the linear prediction of each models,
respectively.
Actually, in the code below, I re-run the models and predictions
based on a
yearly sampling time (instead of daily).
Again, to get similar results, I now need to actually run the
following
equations:
expected_lifetime_in_years = exp(fit)
---> Linear
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)
---> Linear
prediction of the Cox model
I think I understand the logic behind the results of the
proportional hazard
model, but not for the prediction of the Cox model.
Thank you for your help. I hope this is not a too stupid hole in my
logic.
Here is the self contained R code to produce the charts:
library(MASS);
library(survival);
#Same data but parametric fit
make.aidsp <- function(){
cutoff <- 10043 # July 1987 in julian days
btime <- pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
atime <- pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
survtime <- btime + 0.5*atime
status <- as.numeric(Aids2$status)
data.frame(survtime, status = status - 1, state = Aids2$state,
T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
}
Aidsp <- make.aidsp()
# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps <- survreg(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83),
levels =
levels(Aidsp$state)),
T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),
age =
0:82), se = T, type = "linear")
plot(0:82, exp(zz$fit)/365.25, type = "l", ylim = c(0, 2), xlab =
"age", ylab =
"expected lifetime (years)")
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize =
0.015)
# Same example but with a Cox model instead
(aids.pscp <- coxph(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW",
83), levels =
levels(Aidsp$state)),
T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),
age =
0:82), se = T, type = "lp")
plot(0:82, 1/exp(zzcp$fit), type = "l", ylim = c(0, 2), xlab =
"age", ylab =
"expected lifetime (years)")
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize =
0.015)
# Change the sampling time from daily to yearly
par(mfrow = c(1, 1));
(aids.ps <- survreg(Surv((survtime + 0.9)/365.25, status) ~ state +
T.categ +
pspline(age, df=6), data = Aidsp))
zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83),
levels =
levels(Aidsp$state)),
T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),
age =
0:82), se = T, type = "linear")
plot(0:82, exp(zz$fit), type = "l", ylim = c(0, 2), xlab = "age",
ylab =
"expected lifetime (years)")
(aids.pscp <- coxph(Surv((survtime + 0.9)/365.25, status) ~ state +
T.categ +
pspline(age, df=6), data = Aidsp))
zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW",
83), levels =
levels(Aidsp$state)),
T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),
age =
0:82), se = T, type = "lp")
lines(0:82, 1/exp(zzcp$fit),col=4)