On Mon, 16 Jun 2008, Peter Dalgaard wrote:
> Jin Wang wrote:
>> I tried to check whether cch() and coxph() give the same result for
>> the same case-cohort data, using the standard data set from the cch()
>> help page: nwtco.
>> cch() is given the cohort size (4028), while ccoh.data has 1154 rows
>> after selection, but coxph() has no information about the cohort size.
>> The point estimates from coxph() and cch() are roughly the same, but
>> the lower and upper CI limits and the P-values are a little different.
>> Can coxph() reproduce cch() exactly with an appropriate configuration?
>> Is SAS (PHREG, CASECOH.SAS) a better way to implement a
>> time-dependent case-cohort analysis?
> I think you need to read the literature, in particular the paper by
> Therneau (!) and Li, which among other things details the implementation
> of the Self-Prentice estimator. With that in mind, it should not be
> surprising that it is non-trivial how to get correct SE's out of coxph.
> What _is_ surprising (at least somewhat) is how close the robust SE are
> to those of the Self-Prentice method -- if I understand correctly, the
> connection is that Self-Prentice uses jackknifing for the contribution
> from subcohort sampling plus the standard Cox asymptotic variance and
> the robust method effectively uses jackknifing for both.
Yes. The cch() methods all do a model-based analysis of the full cohort and a
finite-sampling analysis of the second-phase sampling.
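
To make the comparison concrete, here is a rough sketch, not a definitive recipe. It builds ccoh.data as in the cch() help page, fits the Self-Prentice estimator with cch(), and then fits coxph() with a robust variance. The late-entry trick (non-subcohort cases enter the risk set just before they fail) is my reading of the Prentice-type construction in Therneau and Li; the names eps, start, fit.cch, fit.cox and se.cmp are made up for this illustration, and the code assumes every edrel is positive and recorded no more finely than eps.

```r
library(survival)

# Case-cohort sample: every relapse plus everyone in the random subcohort.
ccoh.data <- subset(nwtco, rel == 1 | in.subcohort)
ccoh.data$subcohort <- as.logical(ccoh.data$in.subcohort)
ccoh.data$histol <- factor(ccoh.data$histol, labels = c("FH", "UH"))
ccoh.data$stage <- factor(ccoh.data$stage, labels = c("I", "II", "III", "IV"))
ccoh.data$age <- ccoh.data$age / 12   # age in years

# Self-Prentice estimator; cch() needs the full cohort size.
fit.cch <- cch(Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
               subcoh = ~subcohort, id = ~seqno, cohort.size = 4028,
               method = "SelfPrentice")

# coxph() approximation (assumption: Prentice-type risk sets).
# Subcohort members are at risk from time 0; cases outside the
# subcohort enter just before their own event time.
eps <- 0.001   # hypothetical offset, smaller than the time resolution
ccoh.data$start <- ifelse(ccoh.data$subcohort, 0, ccoh.data$edrel - eps)
fit.cox <- coxph(Surv(start, edrel, rel) ~ stage + histol + age +
                   cluster(seqno), data = ccoh.data)

# Standard errors side by side: cch() versus the robust (sandwich) SE.
se.cmp <- cbind(cch = sqrt(diag(fit.cch$var)),
                coxph.robust = sqrt(diag(fit.cox$var)))
print(se.cmp)
```

The two columns should be close but not identical, which is the point made above: the robust variance treats both sources of variation by a jackknife-type calculation, whereas cch() is model-based for the full cohort.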
For Cox models the model-based and jackknife variances are usually very close.
The nwtco data is actually an unusually bad fit to the Cox model and the
differences are larger than usual.
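
One way to see how far apart the two variances are for a given data set is to fit the full cohort with robust = TRUE and compare the two variance matrices that coxph() returns. This is only a sketch; the object names fit.full and se.full are invented for the example.

```r
library(survival)

# Full-cohort Cox model on nwtco; robust = TRUE requests the
# sandwich variance and keeps the model-based one as naive.var.
fit.full <- coxph(Surv(edrel, rel) ~ factor(stage) + factor(histol) +
                    I(age / 12), data = nwtco, robust = TRUE)

se.full <- cbind(model.based = sqrt(diag(fit.full$naive.var)),
                 robust = sqrt(diag(fit.full$var)))
se.full <- cbind(se.full, ratio = se.full[, "robust"] / se.full[, "model.based"])
print(round(se.full, 4))
```

Ratios well away from 1 suggest that the Cox model does not fit well, in which case some disagreement between cch() and a robust coxph() fit is to be expected.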
> (I'm a bit puzzled about why cch() insists on having unique id's,
> though. Doesn't _look_ like it would be too hard to get rid of that
> restriction, at least for S-P, which admittedly is the only method I
> spent enough time studying. And that was some years ago.)
If you have only one event per person the only problem is that the code isn't
written that way. On the other hand, if you do have additional time-varying
covariates there will be a (possibly useful) efficiency gain from using more
efficient methods than cch() provides, with calibration of weights based on
covariates inside and outside the subcohort.
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
[EMAIL PROTECTED] University of Washington, Seattle