Re: [R] Survival analysis MLE gives NA or enormous standard errors
On Tue, 27 Jul 2010, Christopher David Desjardins wrote: Hi Charles, On Fri, 2010-07-23 at 14:40 -0700, Charles C. Berry wrote: On Fri, 23 Jul 2010, Christopher David Desjardins wrote: Sorry. I should have included some data. I've attached a subset of my data (50/192) cases in a Rdata file and have pasted it below. Running anova I get the following: anova(sr.reg.s4.nore) Df Deviance Resid. Df-2*LL P(>|Chi|) NULL NA NA45 33.89752NA as.factor(lifedxm) 2 2.43821143 31.45931 0.2954943 That would just be an omnibus test right and should that first NULL NA line be worrisome? What if I want to test specifically that CONTROL and BIPOLAR were different and that MAJOR DEPRESSION and BIPOLAR were different? You wrote: Construct a likelikehood ratio test for each hypothesis by fitting three models - two containing each term and one containing both - and comparing each simpler model to the fuller model. Would that be recoding lifedxm (presently a dummy variable where 0 - BIPOLAR, 1 - CONTROL, and 2 - MAJOR DEPRESSED) as three seperate variables: CONTROL (0 - No, 1 - Yes), BIPOLAR (0 - N0, 1 - Yes), and MAJOR DEPRESSED (0 - No, 1 - Yes) and then comparing the following models with anova()? CONTROL + BIPOLAR to MAJOR CONTROL + MAJOR to BIPOLAR I am sorry I am just a little confused. Basically I want to know if BIPOLAR is a higher risk than MAJOR and CONTROL and if MAJOR is a higher risk than CONTROL. It would help communication if you would use a standard notation like R. The meaning of pseudocodes tends to be a bit fuzzy. And from far below, it seems you have a "factor" called lifedxm not a 'dummy variable' in the sense that that term is often used. Here are three models defined using the formula language: frm1 <- Surv(age_sym4, sym4) ~ I(lifedxm=="MAJOR") frm2 <- Surv(age_sym4, sym4) ~ I(lifedxm=="BIPOLAR") frm3 <- Surv(age_sym4, sym4) ~ I(lifedxm=="MAJOR") + I(lifedxm=="BIPOLAR") The model of frm1 is nested under that of frm3. The model of frm2 is nested under that of frm3. anova( frm1, frm3 ) will report on the effect of adding I(lifedxm=="BIPOLAR") to frm1. i.e. it will give the increase in likelihood associated with that term. anova( frm2, frm3 ) will report on the effect of adding I(lifedxm=="MAJOR") to frm2. Chuck Thank you very much for all your help, Chris I'll look at Hauck-Donner effect. Thanks, Chris bip.surv.s age_sym4 sym4 lifedxm 1 16.128680 MAJOR 2 19.326490 MAJOR 3 16.550310 CONTROL 4 19.367560 CONTROL 5 16.090350 MAJOR 6 21.505820 MAJOR 7 16.361400 MAJOR 8 20.572210 MAJOR 9 16.457220 CONTROL 10 19.945240 CONTROL 11 15.791920 MAJOR 12 20.766600 MAJOR 13 16.150580 BIPOLAR 14 19.258040 BIPOLAR 15 17.363450 MAJOR 16 21.180010 MAJOR 17 NA0 BIPOLAR 18 NA0 BIPOLAR 19 16.317591 MAJOR 20 18.297060 MAJOR 21 16.407940 MAJOR 22 19.137580 MAJOR 23 16.194390 CONTROL 24 21.368930 CONTROL 25 15.890490 CONTROL 26 18.997950 CONTROL 27 NA0 BIPOLAR 28 18.904860 BIPOLAR 29 16.364130 MAJOR 30 20.427100 MAJOR 31 16.659820 MAJOR 32 19.457910 MAJOR 33 16.643390 CONTROL 34 19.400410 CONTROL 35 15.373031 BIPOLAR 36 19.838470 BIPOLAR 37 15.422311 MAJOR 38 19.370290 MAJOR 39 15.069130 MAJOR 40 17.815200 MAJOR 41 15.504450 BIPOLAR 42 17.921970 BIPOLAR 43 15.345650 CONTROL 44 18.075290 CONTROL 45 15.594800 CONTROL 46 19.674200 CONTROL 47 14.789870 MAJOR 48 20.054760 MAJOR 49 14.787130 MAJOR 50 19.868580 MAJOR On Fri, 2010-07-23 at 11:52 -0700, Charles C. Berry wrote: On Fri, 23 Jul 2010, Christopher David Desjardins wrote: Hi, I am trying to fit the following model: sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), data=bip.surv) Next time include a reproducible example. i.e. something we can run. Now, Google "Hauck Donner Effect" to understand why anova(sr.reg.s4.nore) is preferred. Chuck Where age_sym4 is the age that a subject develops clinical thought problems; sym4 is whether they develop clinical thoughts problems (0 or 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or CONTROL. I am interested in whether or not survival differs by this covariate. When I run my model, I am getting the following output: summary(sr.reg.s4.nore) Call: survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), data = bip.surv) Value Std. Error z p (Intercept)4.037 0.455 8.86643 0.00755 as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 0.997484052845082791450 as.factor(lifedxm)MAJOR0.706 0.447 1.58037 0.114022774867277756905 Log(scale)-0.290 0.267 -1.08493 0.2779524374
Re: [R] Survival analysis MLE gives NA or enormous standard errors
Hi Charles, On Fri, 2010-07-23 at 14:40 -0700, Charles C. Berry wrote: > On Fri, 23 Jul 2010, Christopher David Desjardins wrote: > > > Sorry. I should have included some data. I've attached a subset of my > > data (50/192) cases in a Rdata file and have pasted it below. > > > > Running anova I get the following: > > > >> anova(sr.reg.s4.nore) > > Df Deviance Resid. Df-2*LL P(>|Chi|) > > NULL NA NA45 33.89752NA > > as.factor(lifedxm) 2 2.43821143 31.45931 0.2954943 > > > > That would just be an omnibus test right and should that first NULL NA > > line be worrisome? What if I want to test specifically that CONTROL and > > BIPOLAR were different and that MAJOR DEPRESSION and BIPOLAR were > > different? > You wrote: > Construct a likelikehood ratio test for each hypothesis by fitting three > models - two containing each term and one containing both - and comparing > each simpler model to the fuller model. > Would that be recoding lifedxm (presently a dummy variable where 0 - BIPOLAR, 1 - CONTROL, and 2 - MAJOR DEPRESSED) as three seperate variables: CONTROL (0 - No, 1 - Yes), BIPOLAR (0 - N0, 1 - Yes), and MAJOR DEPRESSED (0 - No, 1 - Yes) and then comparing the following models with anova()? CONTROL + BIPOLAR to MAJOR CONTROL + MAJOR to BIPOLAR I am sorry I am just a little confused. Basically I want to know if BIPOLAR is a higher risk than MAJOR and CONTROL and if MAJOR is a higher risk than CONTROL. Thank you very much for all your help, Chris > > > > I'll look at Hauck-Donner effect. > > > > Thanks, > > Chris > > > >> bip.surv.s > > age_sym4 sym4 lifedxm > > 1 16.128680 MAJOR > > 2 19.326490 MAJOR > > 3 16.550310 CONTROL > > 4 19.367560 CONTROL > > 5 16.090350 MAJOR > > 6 21.505820 MAJOR > > 7 16.361400 MAJOR > > 8 20.572210 MAJOR > > 9 16.457220 CONTROL > > 10 19.945240 CONTROL > > 11 15.791920 MAJOR > > 12 20.766600 MAJOR > > 13 16.150580 BIPOLAR > > 14 19.258040 BIPOLAR > > 15 17.363450 MAJOR > > 16 21.180010 MAJOR > > 17 NA0 BIPOLAR > > 18 NA0 BIPOLAR > > 19 16.317591 MAJOR > > 20 18.297060 MAJOR > > 21 16.407940 MAJOR > > 22 19.137580 MAJOR > > 23 16.194390 CONTROL > > 24 21.368930 CONTROL > > 25 15.890490 CONTROL > > 26 18.997950 CONTROL > > 27 NA0 BIPOLAR > > 28 18.904860 BIPOLAR > > 29 16.364130 MAJOR > > 30 20.427100 MAJOR > > 31 16.659820 MAJOR > > 32 19.457910 MAJOR > > 33 16.643390 CONTROL > > 34 19.400410 CONTROL > > 35 15.373031 BIPOLAR > > 36 19.838470 BIPOLAR > > 37 15.422311 MAJOR > > 38 19.370290 MAJOR > > 39 15.069130 MAJOR > > 40 17.815200 MAJOR > > 41 15.504450 BIPOLAR > > 42 17.921970 BIPOLAR > > 43 15.345650 CONTROL > > 44 18.075290 CONTROL > > 45 15.594800 CONTROL > > 46 19.674200 CONTROL > > 47 14.789870 MAJOR > > 48 20.054760 MAJOR > > 49 14.787130 MAJOR > > 50 19.868580 MAJOR > > > > > > On Fri, 2010-07-23 at 11:52 -0700, Charles C. Berry wrote: > >> On Fri, 23 Jul 2010, Christopher David Desjardins wrote: > >> > >>> Hi, > >>> I am trying to fit the following model: > >>> > >>> sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), > >>> data=bip.surv) > >> > >> Next time include a reproducible example. i.e. something we can run. > >> > >> Now, Google "Hauck Donner Effect" to understand why > >> > >>anova(sr.reg.s4.nore) > >> > >> is preferred. > >> > >> Chuck > >> > >> > >>> > >>> Where age_sym4 is the age that a subject develops clinical thought > >>> problems; sym4 is whether they develop clinical thoughts problems (0 or > >>> 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or > >>> CONTROL. > >>> > >>> I am interested in whether or not survival differs by this covariate. > >>> > >>> When I run my model, I am getting the following output: > >>> > summary(sr.reg.s4.nore) > >>> > >>> Call: > >>> survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), > >>>data = bip.surv) > >>> Value Std. Error z p > >>> (Intercept)4.037 0.455 8.86643 > >>> 0.00755 > >>> as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 > >>> 0.997484052845082791450 > >>> as.factor(lifedxm)MAJOR0.706 0.447 1.58037 > >>> 0.114022774867277756905 > >>> Log(scale)-0.290 0.267 -1.08493 > >>> 0.277952437474223823521 > >>> > >>> Scale= 0.748 > >>> > >>> Weibull distribution > >>> Loglik(model)= -76.3 Loglik(intercept only)= -82.6 > >>> Chisq= 12.73 on 2 degrees of freedom, p= 0.0017 > >>> Number of Newton-Raphson Iterations: 21 > >>> n=186 (6 observations deleted due to missingness) > >>> > >>> > >>> I am concerned about the p-value of 0.997 and the SE of 4707. I am > >>> curious if it has to do with the f
Re: [R] Survival analysis MLE gives NA or enormous standard errors
On Fri, 23 Jul 2010, Christopher David Desjardins wrote: Sorry. I should have included some data. I've attached a subset of my data (50/192) cases in a Rdata file and have pasted it below. Running anova I get the following: anova(sr.reg.s4.nore) Df Deviance Resid. Df-2*LL P(>|Chi|) NULL NA NA45 33.89752NA as.factor(lifedxm) 2 2.43821143 31.45931 0.2954943 That would just be an omnibus test right and should that first NULL NA line be worrisome? What if I want to test specifically that CONTROL and BIPOLAR were different and that MAJOR DEPRESSION and BIPOLAR were different? Construct a likelikehood ratio test for each hypothesis by fitting three models - two containing each term and one containing both - and comparing each simpler model to the fuller model. I'll look at Hauck-Donner effect. Thanks, Chris bip.surv.s age_sym4 sym4 lifedxm 1 16.128680 MAJOR 2 19.326490 MAJOR 3 16.550310 CONTROL 4 19.367560 CONTROL 5 16.090350 MAJOR 6 21.505820 MAJOR 7 16.361400 MAJOR 8 20.572210 MAJOR 9 16.457220 CONTROL 10 19.945240 CONTROL 11 15.791920 MAJOR 12 20.766600 MAJOR 13 16.150580 BIPOLAR 14 19.258040 BIPOLAR 15 17.363450 MAJOR 16 21.180010 MAJOR 17 NA0 BIPOLAR 18 NA0 BIPOLAR 19 16.317591 MAJOR 20 18.297060 MAJOR 21 16.407940 MAJOR 22 19.137580 MAJOR 23 16.194390 CONTROL 24 21.368930 CONTROL 25 15.890490 CONTROL 26 18.997950 CONTROL 27 NA0 BIPOLAR 28 18.904860 BIPOLAR 29 16.364130 MAJOR 30 20.427100 MAJOR 31 16.659820 MAJOR 32 19.457910 MAJOR 33 16.643390 CONTROL 34 19.400410 CONTROL 35 15.373031 BIPOLAR 36 19.838470 BIPOLAR 37 15.422311 MAJOR 38 19.370290 MAJOR 39 15.069130 MAJOR 40 17.815200 MAJOR 41 15.504450 BIPOLAR 42 17.921970 BIPOLAR 43 15.345650 CONTROL 44 18.075290 CONTROL 45 15.594800 CONTROL 46 19.674200 CONTROL 47 14.789870 MAJOR 48 20.054760 MAJOR 49 14.787130 MAJOR 50 19.868580 MAJOR On Fri, 2010-07-23 at 11:52 -0700, Charles C. Berry wrote: On Fri, 23 Jul 2010, Christopher David Desjardins wrote: Hi, I am trying to fit the following model: sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), data=bip.surv) Next time include a reproducible example. i.e. something we can run. Now, Google "Hauck Donner Effect" to understand why anova(sr.reg.s4.nore) is preferred. Chuck Where age_sym4 is the age that a subject develops clinical thought problems; sym4 is whether they develop clinical thoughts problems (0 or 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or CONTROL. I am interested in whether or not survival differs by this covariate. When I run my model, I am getting the following output: summary(sr.reg.s4.nore) Call: survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), data = bip.surv) Value Std. Error z p (Intercept)4.037 0.455 8.86643 0.00755 as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 0.997484052845082791450 as.factor(lifedxm)MAJOR0.706 0.447 1.58037 0.114022774867277756905 Log(scale)-0.290 0.267 -1.08493 0.277952437474223823521 Scale= 0.748 Weibull distribution Loglik(model)= -76.3 Loglik(intercept only)= -82.6 Chisq= 12.73 on 2 degrees of freedom, p= 0.0017 Number of Newton-Raphson Iterations: 21 n=186 (6 observations deleted due to missingness) I am concerned about the p-value of 0.997 and the SE of 4707. I am curious if it has to do with the fact that the CONTROL group doesn't have a mixed response, meaning that all my subjects do not develop clinical levels of thought problems and subsequently 'survive'. table(bip.surv$sym4,bip.surv$lifedxm) BIPOLAR CONTROL MAJOR 0 41 6078 1 7 0 6 Is there some sort of way that I can overcome this? Is my model misspecified? Is this better suited to be run as a Bayesian model using priors to overcome the lack of a mixed response? Also, please cc me on an email as I am a digest subscriber. Thanks, Chris -- Christopher David Desjardins PhD student, Quantitative Methods in Education MS student, Statistics University of Minnesota 192 Education Sciences Building http://cddesjardins.wordpress.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://fampre
Re: [R] Survival analysis MLE gives NA or enormous standard errors
Sorry. I should have included some data. I've attached a subset of my data (50/192) cases in a Rdata file and have pasted it below. Running anova I get the following: > anova(sr.reg.s4.nore) Df Deviance Resid. Df-2*LL P(>|Chi|) NULL NA NA45 33.89752NA as.factor(lifedxm) 2 2.43821143 31.45931 0.2954943 That would just be an omnibus test right and should that first NULL NA line be worrisome? What if I want to test specifically that CONTROL and BIPOLAR were different and that MAJOR DEPRESSION and BIPOLAR were different? I'll look at Hauck-Donner effect. Thanks, Chris > bip.surv.s age_sym4 sym4 lifedxm 1 16.128680 MAJOR 2 19.326490 MAJOR 3 16.550310 CONTROL 4 19.367560 CONTROL 5 16.090350 MAJOR 6 21.505820 MAJOR 7 16.361400 MAJOR 8 20.572210 MAJOR 9 16.457220 CONTROL 10 19.945240 CONTROL 11 15.791920 MAJOR 12 20.766600 MAJOR 13 16.150580 BIPOLAR 14 19.258040 BIPOLAR 15 17.363450 MAJOR 16 21.180010 MAJOR 17 NA0 BIPOLAR 18 NA0 BIPOLAR 19 16.317591 MAJOR 20 18.297060 MAJOR 21 16.407940 MAJOR 22 19.137580 MAJOR 23 16.194390 CONTROL 24 21.368930 CONTROL 25 15.890490 CONTROL 26 18.997950 CONTROL 27 NA0 BIPOLAR 28 18.904860 BIPOLAR 29 16.364130 MAJOR 30 20.427100 MAJOR 31 16.659820 MAJOR 32 19.457910 MAJOR 33 16.643390 CONTROL 34 19.400410 CONTROL 35 15.373031 BIPOLAR 36 19.838470 BIPOLAR 37 15.422311 MAJOR 38 19.370290 MAJOR 39 15.069130 MAJOR 40 17.815200 MAJOR 41 15.504450 BIPOLAR 42 17.921970 BIPOLAR 43 15.345650 CONTROL 44 18.075290 CONTROL 45 15.594800 CONTROL 46 19.674200 CONTROL 47 14.789870 MAJOR 48 20.054760 MAJOR 49 14.787130 MAJOR 50 19.868580 MAJOR On Fri, 2010-07-23 at 11:52 -0700, Charles C. Berry wrote: > On Fri, 23 Jul 2010, Christopher David Desjardins wrote: > > > Hi, > > I am trying to fit the following model: > > > > sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), > > data=bip.surv) > > Next time include a reproducible example. i.e. something we can run. > > Now, Google "Hauck Donner Effect" to understand why > > anova(sr.reg.s4.nore) > > is preferred. > > Chuck > > > > > > Where age_sym4 is the age that a subject develops clinical thought > > problems; sym4 is whether they develop clinical thoughts problems (0 or > > 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or > > CONTROL. > > > > I am interested in whether or not survival differs by this covariate. > > > > When I run my model, I am getting the following output: > > > >> summary(sr.reg.s4.nore) > > > > Call: > > survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), > >data = bip.surv) > > Value Std. Error z p > > (Intercept)4.037 0.455 8.86643 > > 0.00755 > > as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 > > 0.997484052845082791450 > > as.factor(lifedxm)MAJOR0.706 0.447 1.58037 > > 0.114022774867277756905 > > Log(scale)-0.290 0.267 -1.08493 > > 0.277952437474223823521 > > > > Scale= 0.748 > > > > Weibull distribution > > Loglik(model)= -76.3 Loglik(intercept only)= -82.6 > > Chisq= 12.73 on 2 degrees of freedom, p= 0.0017 > > Number of Newton-Raphson Iterations: 21 > > n=186 (6 observations deleted due to missingness) > > > > > > I am concerned about the p-value of 0.997 and the SE of 4707. I am > > curious if it has to do with the fact that the CONTROL group doesn't > > have a mixed response, meaning that all my subjects do not develop > > clinical levels of thought problems and subsequently 'survive'. > > > >> table(bip.surv$sym4,bip.surv$lifedxm) > > > >BIPOLAR CONTROL MAJOR > > 0 41 6078 > > 1 7 0 6 > > > > Is there some sort of way that I can overcome this? Is my model > > misspecified? Is this better suited to be run as a Bayesian model using > > priors to overcome the lack of a mixed response? > > > > Also, please cc me on an email as I am a digest subscriber. > > Thanks, > > Chris > > > > > > -- > > Christopher David Desjardins > > PhD student, Quantitative Methods in Education > > MS student, Statistics > > University of Minnesota > > 192 Education Sciences Building > > http://cddesjardins.wordpress.com > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > Charles C. Berry(858) 534-2098 > Dept of Family/Preventive > Medicine > E mailto:cbe...@tajo.u
Re: [R] Survival analysis MLE gives NA or enormous standard errors
On Fri, 23 Jul 2010, Christopher David Desjardins wrote: Hi, I am trying to fit the following model: sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), data=bip.surv) Next time include a reproducible example. i.e. something we can run. Now, Google "Hauck Donner Effect" to understand why anova(sr.reg.s4.nore) is preferred. Chuck Where age_sym4 is the age that a subject develops clinical thought problems; sym4 is whether they develop clinical thoughts problems (0 or 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or CONTROL. I am interested in whether or not survival differs by this covariate. When I run my model, I am getting the following output: summary(sr.reg.s4.nore) Call: survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), data = bip.surv) Value Std. Error z p (Intercept)4.037 0.455 8.86643 0.00755 as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 0.997484052845082791450 as.factor(lifedxm)MAJOR0.706 0.447 1.58037 0.114022774867277756905 Log(scale)-0.290 0.267 -1.08493 0.277952437474223823521 Scale= 0.748 Weibull distribution Loglik(model)= -76.3 Loglik(intercept only)= -82.6 Chisq= 12.73 on 2 degrees of freedom, p= 0.0017 Number of Newton-Raphson Iterations: 21 n=186 (6 observations deleted due to missingness) I am concerned about the p-value of 0.997 and the SE of 4707. I am curious if it has to do with the fact that the CONTROL group doesn't have a mixed response, meaning that all my subjects do not develop clinical levels of thought problems and subsequently 'survive'. table(bip.surv$sym4,bip.surv$lifedxm) BIPOLAR CONTROL MAJOR 0 41 6078 1 7 0 6 Is there some sort of way that I can overcome this? Is my model misspecified? Is this better suited to be run as a Bayesian model using priors to overcome the lack of a mixed response? Also, please cc me on an email as I am a digest subscriber. Thanks, Chris -- Christopher David Desjardins PhD student, Quantitative Methods in Education MS student, Statistics University of Minnesota 192 Education Sciences Building http://cddesjardins.wordpress.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Survival analysis MLE gives NA or enormous standard errors
Hi, I am trying to fit the following model: sr.reg.s4.nore <- survreg(Surv(age_sym4,sym4), as.factor(lifedxm), data=bip.surv) Where age_sym4 is the age that a subject develops clinical thought problems; sym4 is whether they develop clinical thoughts problems (0 or 1); and lifedxm is mother's diagnosis: BIPOLAR, MAJOR DEPRESSION, or CONTROL. I am interested in whether or not survival differs by this covariate. When I run my model, I am getting the following output: > summary(sr.reg.s4.nore) Call: survreg(formula = Surv(age_sym4, sym4) ~ as.factor(lifedxm), data = bip.surv) Value Std. Error z p (Intercept)4.037 0.455 8.86643 0.00755 as.factor(lifedxm)CONTROL 14.844 4707.383 0.00315 0.997484052845082791450 as.factor(lifedxm)MAJOR0.706 0.447 1.58037 0.114022774867277756905 Log(scale)-0.290 0.267 -1.08493 0.277952437474223823521 Scale= 0.748 Weibull distribution Loglik(model)= -76.3 Loglik(intercept only)= -82.6 Chisq= 12.73 on 2 degrees of freedom, p= 0.0017 Number of Newton-Raphson Iterations: 21 n=186 (6 observations deleted due to missingness) I am concerned about the p-value of 0.997 and the SE of 4707. I am curious if it has to do with the fact that the CONTROL group doesn't have a mixed response, meaning that all my subjects do not develop clinical levels of thought problems and subsequently 'survive'. > table(bip.surv$sym4,bip.surv$lifedxm) BIPOLAR CONTROL MAJOR 0 41 6078 1 7 0 6 Is there some sort of way that I can overcome this? Is my model misspecified? Is this better suited to be run as a Bayesian model using priors to overcome the lack of a mixed response? Also, please cc me on an email as I am a digest subscriber. Thanks, Chris -- Christopher David Desjardins PhD student, Quantitative Methods in Education MS student, Statistics University of Minnesota 192 Education Sciences Building http://cddesjardins.wordpress.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.