Re: [R] Question regarding significance of a covariate in a coxme survival model

Teresa Iglesias Sat, 28 Aug 2010 20:29:41 -0700

On Sat, Aug 28, 2010 at 8:38 PM, David Winsemius <dwinsem...@comcast.net>wrote:


>
> On Aug 27, 2010, at 4:32 PM, Teresa Iglesias wrote:
>
>  Christopher David Desjardins <desja004 <at> umn.edu> writes:
>>
>>
>>> Hi,
>>> I am running a Cox Mixed Effects Hazard model using the library coxme. I
>>> am trying to model time to onset (age_sym1) of thought problems (e.g.
>>> hearing voices) (sym1).  As I have siblings in my dataset,  I have
>>> decided to account for this by including a random effect for family
>>> (famid). My covariate of interest is Mother's diagnosis where a 0 is
>>> bipolar, 1 is control, and 2 is major depression.  I am trying to fit
>>> the following model.
>>>
>>> thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data =
>>> bip.surv)
>>>
>>> Which provides the following output:
>>>
>>> -------------------------------------------------
>>>
>>>> thorp1
>>>>
>>> Cox mixed-effects model fit by maximum likelihood
>>>  Data: bip.surv
>>>  events, n = 99, 189 (3 observations deleted due to missingness)
>>>  Iterations= 10 54
>>>                    NULL Integrated Penalized
>>> Log-likelihood -479.0372 -467.3549 -435.2096
>>>                  Chisq    df          p   AIC     BIC
>>> Integrated loglik 23.36 3.00 3.3897e-05 17.36     9.58
>>>  Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54
>>> Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid)
>>> Fixed coefficients
>>>                     coef exp(coef) se(coef)      z      p
>>> lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063
>>> lifedxmMAJOR   -0.6329957 0.5309987 0.3460847 -1.83 0.0670
>>> Random effects
>>>  Group Variable Std Dev    Variance
>>>  famid Intercept 0.9877770 0.9757033
>>>
>>> --------------------------------------------------
>>>
>>> So it appears that there is a difference in hazard rates associated with
>>> Mother's diagnosis between BIPOLAR and CONTROL. Just to be safe, I fit a
>>> model without this covariate present and decided to compare the models
>>> with AIC and see if fit decreased with this covariate not in the model.
>>>
>>> ----------------------------------------------------------
>>>  thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data = bip.surv)
>>>
>>>> thorp1.n
>>>>
>>> Cox mixed-effects model fit by maximum likelihood
>>>  Data: bip.surv
>>>  events, n = 99, 189 (3 observations deleted due to missingness)
>>>  Iterations= 10 54
>>>                    NULL Integrated Penalized
>>> Log-likelihood -479.0372 -471.3333 -436.0478
>>>                  Chisq    df          p    AIC     BIC
>>> Integrated loglik 15.41 1.00 0.00008663 13.41     10.81
>>>  Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37
>>> Model:  Surv(age_sym1, sym1) ~ (1 | famid)
>>> Random effects
>>>  Group Variable Std Dev Variance
>>>  famid Intercept 1.050949 1.104495
>>> ------------------------------------------------------------
>>>
>>> The problem is that the AIC for the model w/o lifedxm is 13.41 and the
>>> model w/ lifedxm is 17.36. So fit actually improved without lifedxm in
>>> the model but really the fit is indistinguishable if I use Burnham &
>>> Anderson, 2002's criteria.
>>>
>>> So my conundrum is this - The AIC and the p-values appear to disagree.
>>> The p-value would suggest that lifedxm should be included in the model
>>> and the AIC says it doesn't improve fit. In general, I tend to favor the
>>> AIC (I usually work with large sample sizes and multilevel models) but I
>>> am just beginning with survival models and I am curious if a survival
>>> model guru might suggest whether lifedxm needs to be in the model or not
>>> based on my results here? Would people generally use the AIC in this
>>> situation?  Also, I can't run anova() on this models because of the
>>> random effect.
>>>
>>> I am happy to provide the data if necessary.
>>>
>>> Please cc me on a reply.
>>>
>>> Thanks,
>>> Chris
>>>
>>
>> Hi Chris,
>> Did you get an answer to why the p-value and AIC score disagreed?
>> I am getting the same results with my own data. They're not small
>> disagreements either. The AIC score is telling me the opposite of
>> what the p-value and the parameter coef are saying.
>> The p-value and the coef for the predictor variable are in agreement.
>>
>> I've also noticed in some published papers with tables containing
>> p-values and AIC scores that the chisq p-value and AIC score
>> are in opposition too. This makes me think I'm missing something
>> and that this all actually makes sense.
>>
>> However given that AIC = â 2 (log L) + 2K (where K is the number of
>> parameters)
>> and seeing as how the log-likelihood scores below are in the hundreds,
>> shouldn't
>> the AIC score be in the hundreds as well??
>>
>
> That is different than my understanding of AIC. I thought that the AIC and
> BIC both took as input the difference in -2LL and then adjusted those
> differences for the differences in number of degrees of freedom. That
> perspective would inform one that the absolute value of the deviance ( =
> -2LL)  is immaterial and only differences are subject to interpretation. The
> p-values are calculated as Wald tests, which are basically first order
> approximations and have lower credence than model level comparisons. The OP
> also seemed to have ignored the fact that the penalized estimates of AIC and
> BIC went in the opposite direction from what he might have hoped.
>
>  --------------------------------------
>>
>> model0 (intercept only model)
>>                              NULL Integrated Penalized
>> Log-likelihood -119.8470  -117.9749 -113.1215
>>
>>                        Chisq   df        p           AIC    BIC
>> Integrated loglik  3.74 1.00 0.052989  1.74   0.08
>> Penalized loglik 13.45 7.06 0.063794 -0.68 -12.43
>>
>>
>> Random effects
>> Group   Variable  Std Dev   Variance
>> Site    Intercept    0.6399200 0.4094977
>>
>>
>> _____
>> model1 (before vs after)
>>                            NULL Integrated Penalized
>> Log-likelihood -119.8470  -112.1598 -108.1663
>>
>>                              Chisq   df          p        AIC   BIC
>> Integrated loglik 15.37 2.00 0.00045863 11.37  8.05
>> Penalized loglik 23.36 7.06 0.00153710  9.25 -2.49
>>
>> Fixed coefficients
>>               Coef       exp(coef)  se(coef)         z       p
>> Time  -1.329997 0.2644779 0.4009229 -3.32 0.00091
>>
>> Random effects
>> Group   Variable  Std Dev   Variance
>> Site    Intercept 0.5625206 0.3164294
>>
>> ______
>> model2 (stim1 vs stim2)
>>                              NULL Integrated Penalized
>> Log-likelihood -119.8470  -117.8677  -113.167
>>
>>                             Chisq   df        p      AIC    BIC
>> Integrated loglik  3.96 2.00 0.138160 -0.04  -3.37
>> Penalized loglik 13.36 7.83 0.093314 -2.31 -15.34
>>
>> Fixed coefficients
>>            coef          exp(coef)  se(coef)       z       p
>> stim 0.1621367  1.176021 0.3534788 0.46 0.65
>>
>> Random effects
>> Group  Variable  Std Dev   Variance
>> Site   Intercept   0.6262600 0.3922016
>> ----------------------------------------------
>>
>> If you didn't get an answer, hopefully, someone that understands all this
>> will
>> reply for both of us.
>>
>
> Yes. it would be good to get more authoritative opinion. Perhaps my
> standing as a non-statistician will prompt a real statistician to weigh in
> and correct any misapprehensions I have been laboring under.
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
> Hi David,
Maybe I misunderstand your reply about AIC (and BIC) but my understanding is
that the AIC "score" as an absolute value is meaningless but the "weights"
calculated from the models you want to compare is what matters. So the AIC
is used to calculate the relative Akaike weight among all the models you
wish to compare where the weights add to 1.
As far as I can tell, coxme gives the model's AIC "score" but is not
comparing any models yet. Perhaps that is up to the user to calculate
weights given the scores and figure out which model performs best.
The output from coxme above (if I understand correctly) is the AIC score for
each model (evaluated independent of other models) and that the AIC score
(is? should be?) calculated using the LL given as output. If this is true
then I don't see how one could get those AIC scores from those LLs.

                           Integrated    Penalized
Log-likelihood   -117.8677  -113.167

                               AIC    BIC
Integrated loglik   -0.04  -3.37
Penalized loglik  -2.31 -15.34

-2(-117.86)+2(2) â  -0.04

-- 
Teresa Iglesias
Davis, CA

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question regarding significance of a covariate in a coxme survival model

Reply via email to