Re: [R] Seeking help with LOGIT model
You should look up the Hauck-Donner phenomenon, which shows that with binomial GLMs the standard error can grow faster than the effect size, so a very large estimate can end up with a very small z value. Complete separation results, for example, when one predictor (or a combination of several predictors) perfectly predicts the response. Something like this seems to be happening for variables 4 and 5.

You could try the brglm function from the package of the same name, which fits the model with bias reduction. Compare (after coercing your Data to a data frame):

summary(glm(Y ~ ., binomial, Data))

Call:
glm(formula = Y ~ ., family = binomial, data = Data)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.00979       0.0       0.6   0.27987   1.82302

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  10.99326   20.77336   0.529   0.5967
`X 1`         0.01943    0.01040   1.868   0.0617 .
`X 2`        10.61013    5.65409   1.877   0.0606 .
`X 3`        -0.66763    0.47668  -1.401   0.1613
`X 4`        70.98785   36.41181   1.950   0.0512 .
`X 5`        17.33126 2872.17069   0.006   0.9952

summary(brglm(Y ~ ., binomial, Data))

Call:
brglm(formula = Y ~ ., family = binomial, data = Data)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  12.017791  14.337183   0.838   0.4019
`X 1`         0.014898   0.008263   1.803   0.0714 .
`X 2`         8.307941   4.010792   2.071   0.0383 *
`X 3`        -0.576309   0.352097  -1.637   0.1017
`X 4`        35.627644  16.638766   2.141   0.0323 *
`X 5`         2.134544   2.570756   0.830   0.4064

Good luck.

Ken

Quoting Christofer Bogaso:

> I understand something is not normal with the 5th explanatory
> variable (se: 2872.17069!). However, I could not understand what you
> mean by "You seem to be getting complete separation on X5".
> Could you please elaborate?

--
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
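One way to see the separation described above is to cross-tabulate the response against the binary predictor. The following is only a sketch: the Y and X5 vectors are copied from the first and last columns of the Data matrix posted at the start of this thread, and the use of table() is an illustration added here, not something run in the thread.

```r
# Y (column 1) and "X 5" (column 6), copied from the posted Data matrix.
Y <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0,
       1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0)
X5 <- c(1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0)

# Cross-tabulate response against the binary predictor.
tab <- table(Y = Y, X5 = X5)
print(tab)

# An empty cell means one level of X5 always goes with the same response,
# which is the situation that pushes the coefficient towards infinity.
has_empty_cell <- any(tab == 0)
print(has_empty_cell)
```

In these data every observation with X5 = 1 has Y = 1, while X5 = 0 occurs with both outcomes, which is consistent with the very large standard error reported for X 5. The same check does not apply directly to a continuous predictor such as X 4.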
Re: [R] Seeking help with LOGIT model
Thanks Ken for your reply. No doubt your English is quite tough!!

I understand something is not normal with the 5th explanatory variable (se: 2872.17069!). However, I could not understand what you mean by "You seem to be getting complete separation on X5". Could you please elaborate?

Thanks,

On Thu, Apr 12, 2012 at 4:06 PM, ken knoblauch wrote:
> Look at the output of summary, especially the standard errors.
> You seem to be getting complete separation on X5, and X4 doesn't
> look so hot either.
>
> Ken
Re: [R] Seeking help with LOGIT model
Christofer Bogaso writes:

> Dear all, I am fitting a LOGIT model on this Data...

<< snip >>

> glm(Data[,1] ~ Data[,-1], binomial(link = logit))
>
> Call:  glm(formula = Data[, 1] ~ Data[, -1], family = binomial(link = logit))
>
> Coefficients:
>   (Intercept)  Data[, -1]X 1  Data[, -1]X 2  Data[, -1]X 3  Data[, -1]X 4  Data[, -1]X 5
>      10.99326        0.01943       10.61013       -0.66763       70.98785       17.33126
>
> Degrees of Freedom: 43 Total (i.e. Null);  38 Residual
> Null Deviance:     44.58
> Residual Deviance: 17.46    AIC: 29.46
> Warning message:
> glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> However, I am getting the warning message "fitted probabilities
> numerically 0 or 1 occurred". My question is: have I made a mistake
> in the implementation above? Or is it just because I have too few
> '0's in my response variable?

Look at the output of summary, especially the standard errors. You seem to be getting complete separation on X5, and X4 doesn't look so hot either.

Ken
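To make the advice about standard errors concrete, here is a minimal sketch on hypothetical toy data (not the data from this thread) in which a single predictor separates the response perfectly. The variable names and the warning-collecting handler are assumptions made for illustration only.

```r
# Hypothetical toy data: every x below 4.5 has y = 0, every x above has y = 1.
x <- 1:8
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# Collect warnings instead of printing them, so they can be inspected.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Warnings raised while fitting the separated data.
print(msgs)

# The coefficient table: look at the "Std. Error" column for x.
coef_table <- coef(summary(fit))
print(coef_table)
se_x <- coef_table["x", "Std. Error"]
```

With separated data like this, glm() typically warns and the standard error of the slope is far larger than the estimate itself, so the Wald z value is close to zero even though x predicts y perfectly.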
[R] Seeking help with LOGIT model
Dear all,

I am fitting a LOGIT model on this Data...

Data <- structure(c(
  1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0,
  47, 58, 82, 100, 222, 164, 161, 70, 219, 81, 209, 182, 185, 104, 126,
  192, 95, 245, 97, 177, 125, 56, 85, 199, 298, 145, 78, 144, 178, 146,
  132, 98, 120, 148, 123, 282, 79, 34, 104, 91, 199, 101, 109, 117,
  1.1, 0.92, 1.72, 2.18, 1.75, 2.26, 2.07, 1.43, 1.92, 1.82, 2.34,
  2.12, 1.81, 1.35, 1.26, 2.07, 2.04, 1.55, 1.89, 1.68, 0.76, 1.96,
  1.29, 1.81, 1.72, 2.39, 1.68, 2.29, 2.34, 2.21, 1.42, 1.97, 2.12,
  1.9, 1.15, 1.7, 1.24, 1.55, 2.04, 1.59, 2.07, 2, 1.84, 2.04,
  51.2, 48.5, 50.8, 54.4, 52.4, 56.7, 54.6, 52.7, 52.3, 53, 55.4,
  53.5, 51.6, 48.5, 49.3, 53.9, 55.7, 51.2, 54, 52.2, 51.1, 54, 55,
  52.9, 53.7, 55.8, 50.4, 58.8, 54.5, 53.5, 48.8, 54.5, 52.1, 56,
  56.2, 53.3, 50.9, 53.2, 51.7, 54.3, 53.7, 54.7, 47, 56.9,
  0.321, 0.224, 0.127, 0.063, 0.021, 0.027, 0.139, 0.218, 0.008, 0.012,
  0.076, 0.299, 0.04, 0.069, 0.33, 0.017, 0.166, 0.003, 0.01, 0.076,
  0.454, 0.032, 0.266, 0.018, 0.038, 0.067, 0.075, 0.064, 0.065, 0.065,
  0.09, 0.016, 0.061, 0.019, 0.389, 0.037, 0.161, 0.127, 0.017, 0.222,
  0.026, 0.012, 0.057, 0.022,
  1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0,
  1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0),
  .Dim = c(44L, 6L),
  .Dimnames = list(
    c("Obs 1", "Obs 2", "Obs 3", "Obs 4", "Obs 5", "Obs 6", "Obs 7",
      "Obs 8", "Obs 9", "Obs 10", "Obs 11", "Obs 12", "Obs 13", "Obs 14",
      "Obs 15", "Obs 16", "Obs 17", "Obs 18", "Obs 19", "Obs 20", "Obs 21",
      "Obs 22", "Obs 23", "Obs 24", "Obs 25", "Obs 26", "Obs 27", "Obs 28",
      "Obs 29", "Obs 30", "Obs 31", "Obs 32", "Obs 33", "Obs 34", "Obs 35",
      "Obs 36", "Obs 37", "Obs 38", "Obs 39", "Obs 40", "Obs 41", "Obs 42",
      "Obs 43", "Obs 44"),
    c("Y", "X 1", "X 2", "X 3", "X 4", "X 5")))

glm(Data[,1] ~ Data[,-1], binomial(link = logit))

Call:  glm(formula = Data[, 1] ~ Data[, -1], family = binomial(link = logit))

Coefficients:
  (Intercept)  Data[, -1]X 1  Data[, -1]X 2  Data[, -1]X 3  Data[, -1]X 4  Data[, -1]X 5
     10.99326        0.01943       10.61013       -0.66763       70.98785       17.33126

Degrees of Freedom: 43 Total (i.e. Null);  38 Residual
Null Deviance:     44.58
Residual Deviance: 17.46    AIC: 29.46
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

However, I am getting the warning message "fitted probabilities numerically 0 or 1 occurred". My question is: have I made a mistake in the implementation above? Or is it just because I have too few '0's in my response variable?

Thanks for your help.
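On the implementation question: the matrix-indexing formula runs, but fitting from a data frame with a named formula, as suggested elsewhere in this thread, gives readable coefficient names and lets predict() find variables in newdata. The sketch below assumes a small simulated matrix with the same layout as Data (response in column 1, column names containing spaces); M, df and newdat are hypothetical names, and the simulated values are not the posted data. It does not address the separation warning itself.

```r
# Hypothetical stand-in for the posted matrix: same layout, simulated values.
set.seed(1)
n <- 60
M <- cbind(Y = rbinom(n, 1, 0.5), "X 1" = rnorm(n), "X 2" = rnorm(n))

# as.data.frame() on a matrix keeps the column names, spaces included.
df <- as.data.frame(M)
print(names(df))

# "Y ~ ." uses every other column of df as a predictor.
fit <- glm(Y ~ ., family = binomial, data = df)
print(names(coef(fit)))

# Because the terms are real column names, predict() can use new rows.
# check.names = FALSE stops data.frame() from rewriting "X 1" as "X.1".
newdat <- data.frame(`X 1` = 0, `X 2` = 1, check.names = FALSE)
p <- predict(fit, newdata = newdat, type = "response")
print(p)
```

With the posted matrix the equivalent call would be glm(Y ~ ., family = binomial, data = as.data.frame(Data)), which yields the same estimates as the original call but labels them by column name.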