In the Cox regression case, the probable explanation is that you have
ties in your data; Stata and coxph may have different defaults for
handling ties. Read the manuals!
The difference in sign in the other cases is simply due to different
definitions of the models. I am sure it is well documented in relevant
manuals.
Göran
On 2014-05-30 13:37, Hiyoshi, Ayako wrote:
Dear R users,
Hi, thank you so much for your help in advance.
I have been using Stata but new to R. For my paper revision using
Aalen's survival analysis, I need to use R, as the command including
Aalen's survival seems to be available in R (32-bit, version 3.1.0
(2014-04-10)) but less ready to be used in Stata (version 13/SE).
To make sure that I can do basics, I have fitted logistic regression
and Cox proportional hazard regression using R and Stata.
The data I used were from UCLA R's textbook example page:
<http://www.ats.ucla.edu/stat/r/examples/asa/asa_ch1_r.htm.>
http://www.ats.ucla.edu/stat/r/examples/asa/asa_ch1_r.htm. I used
this in Stata too.
When I fitted logistic regression as below, the estimates were
exactly same between R and Stata.
<Example using logistic regression>
R:
logistic1 <- glm(censor ~ age + drug, data=xxxx, family =
"binomial")
summary(logistic1)
exp(cbind(OR=coef(logistic1), confint(logistic1)))
OR 2.5 % 97.5 % (Intercept) 1.0373731 0.06358296 16.797896
age 1.0436805 0.96801933 1.131233 drug 0.7192149
0.26042635 1.937502
Stata:
logistic censor age i.drug OR CI_lower CI_upper age |
1.043681 .9662388 1.127329 drug | .719215 .2665194
1.940835 _cons | 1.037373 .065847 16.3431
However, when I fitted Cox proportional hazard regression, there were
some discrepancies in coefficient (and exponentiated hazard ratios).
<Example using Cox proportioanl hazard regression>
R:
cox1 <- coxph(Surv(time, censor) ~ drug, age, data=xxxx)
summary(cox1)
Call: coxph(formula = Surv(time, censor) ~ drug + age, data = xxxx)
n= 100, number of events= 80 coef exp(coef) se(coef) z Pr(>|z|)
drug 1.01670 2.76405 0.25622 3.968 7.24e-05 *** age 0.09714
1.10202 0.01864 5.211 1.87e-07 *** --- Signif. codes: 0 '***' 0.001
'**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 exp(coef) exp(-coef) lower .95 upper
.95 drug 2.764 0.3618 1.673 4.567 age 1.102
0.9074 1.062 1.143 Concordance= 0.711 (se = 0.042 ) Rsquare=
0.324 (max possible= 0.997 ) Likelihood ratio test= 39.13 on 2 df,
p=3.182e-09 Wald test = 36.13 on 2 df, p=1.431e-08
Score (logrank) test = 38.39 on 2 df, p=4.602e-09
Stata:
stset time, f(censor) stcox drug age
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
drug | 2.563531 .6550089 3.68 0.000 1.55363 4.229893
age | 1.095852 .02026 4.95 0.000 1.056854
1.136289
------------------------------------------------------------------------------
The HR estimates for drug was 2.76 from R, but 2.56 from Stata.
I searched in internet for explanation, but could not find any.
In parametric survival regression with exponential distribution, R
and Stata's coefficients were completely opposite while the values
were exactly same (i.e. say 0.08 for Stata and -0.08 for R). I
suspected something like this
(http://www.theanalysisfactor.com/ordinal-logistic-regression-mystery/)
going on, but for Cox proportional hazard regression, i coudl not
find any resource helping me.
I highly appreciate if anyone could explain this for me, or suggest
me resource that I can read.
Thank you so much for your help.
Best,
Ayako
[[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org
mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.