I'm doing my usual "how does that work in R" thing with some Stata projects. I find a gross gap between the Stata and R in Cox PH models, and I hope you can give me some pointers about what goes wrong. I'm getting signals from R/Survival that the model just can't be estimated, but Stata spits out numbers just fine.
I wonder if I should specify initial values for coxph?
I got a dataset from a student who uses Stata and try to replicate in R. I will share data to you in case you want to see for yourself. Let me know if you want text or Stata data file.
In R, I try this:
> cox2 <- coxph(Surv(yrs2,ratify)~ accession+ haz.wst+ haz.in +haz.out+ wefgov+ rle+ rqe + pol.free +tai.2001 + ny.gdp.pcap.pp.cd + eio, data=dat3, control=coxph.control(iter.max=1000),singular.ok=T)
Warning message:
Ran out of iterations and did not converge in: fitter(X, Y, strats, offset, init, control, weights = weights,
So I wrote out the file exatly as it was in R into Stata dataset
> write.dta(dat3,"cleanBasel.dta") Warning message: Abbreviating variable names in: write.dta(dat3, "cleanBasel.dta")
Here's the Stata output:
. use "/home/pauljohn/ps/ps909/AdvancedRegression/duration_2/cleanBasel.dta" (Written by R. )
. stset yrs2, failure (ratify)
failure event: ratify != 0 & ratify < . obs. time interval: (0, yrs2] exit on or before: failure
---------------------------------------------------------------------------- > -- 21 total obs. 0 exclusions ---------------------------------------------------------------------------- > -- 21 obs. remaining, representing 21 failures in single record/single failure data 78 total analysis time at risk, at risk from t = 0 earliest observed entry t = 0
. stcox accessin haz_wst haz_in haz_out wefgov rle rqe pol_free tai_2001 ny_gd eio, robust
> nohr
failure _d: ratify analysis time _t: yrs2
Iteration 0: log pseudo-likelihood = -49.054959 Iteration 1: log pseudo-likelihood = -45.021682 Iteration 2: log pseudo-likelihood = -44.525187 Iteration 3: log pseudo-likelihood = -44.521588 Iteration 4: log pseudo-likelihood = -44.521586 Refining estimates: Iteration 0: log pseudo-likelihood = -44.521586
Cox regression -- Breslow method for ties
No. of subjects = 21 Number of obs = 21
No. of failures = 21
Time at risk = 78
Wald chi2(11) = 81.64
Log pseudo-likelihood = -44.521586 Prob > chi2 = 0.0000
------------------------------------------------------------------------------ | Robust _t | Coef. Std. Err. z P>|z| -------------+---------------------------------------------------------------- accessin | -1.114101 .6343663 -1.76 0.079 haz_wst | 2.32e-08 1.08e-07 0.22 0.829 haz_in | 3.78e-06 2.46e-06 1.54 0.124 haz_out | -3.80e-07 3.76e-07 -1.01 0.312 wefgov | 2.139127 .9136992 2.34 0.019 rle | 1.827482 1.500878 1.22 0.223 rqe | -3.126696 1.332069 -2.35 0.019 pol_free | -.4498276 .291764 -1.54 0.123 tai_2001 | -2.895922 2.577401 -1.12 0.261 ny_gd___ | -.0003223 .0002194 -1.47 0.142 eio | -.0577773 .0726064 -0.80 0.426 ------------------------------------------------------------------------------
. last observed exit t = 7
----------------------------------
Paul Johnson Dept. of Political Science University of Kansas
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html