[R] Stepwise logistic regression with significance testing - stepAIC
Hello R-Users,

I have one binary dependent variable and a set of independent variables (glm(formula, …, family="binomial")) and I am using the function stepAIC ("MASS") for choosing an optimal model. However, I am not sure if stepAIC considers significance properties like the likelihood ratio test and the Wald test (see example below).

y <- rbinom(30,1,0.4)
x1 <- rnorm(30)
x2 <- rnorm(30)
x3 <- rnorm(30)
xdata <- data.frame(x1,x2,x3)
fit1 <- glm(y~ . ,family="binomial",data=xdata)
stepAIC(fit1,trace=FALSE)

Call:  glm(formula = y ~ x3, family = binomial, data = xdata)

Coefficients:
(Intercept)           x3
    -0.3556       0.8404

Degrees of Freedom: 29 Total (i.e. Null);  28 Residual
Null Deviance:     40.38
Residual Deviance: 37.86       AIC: 41.86

fit <- glm( stepAIC(fit1,trace=FALSE)$formula ,family="binomial")
my.summ <- summary(fit)
# Wald Test
print(my.summ$coeff[,4])
(Intercept)         x3
  0.3609638  0.1395215
my.anova <- anova(fit,test="Chisq")
# LR Test
print(my.anova$P[2])
[1] 0.1121783

Is there an alternative function or a possible way of checking if the added variable and the new model are significant within the regression steps?

Thanks in advance for your help

Regards
Peter-Heinz Fox

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
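For readers following the example: stepAIC ranks candidate models by AIC only and does not apply a Wald or likelihood ratio cut-off. A minimal sketch, assuming the same kind of simulated data as in the post (the seed and the explicit y column in xdata are assumptions added here for repeatability), shows how the likelihood ratio test for a single step can be inspected with drop1() from base R, and how the path taken by stepAIC can be read from its anova component. The p-values are shown for inspection only; they do not account for the selection that has taken place.

```r
library(MASS)  # provides stepAIC()

set.seed(1)    # assumed seed, only so the sketch is repeatable
n <- 30
xdata <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
xdata$y <- rbinom(n, 1, 0.4)   # response is pure noise, as in the post

fit1 <- glm(y ~ x1 + x2 + x3, family = binomial, data = xdata)

# Likelihood ratio test for dropping each term from the current model:
# one backward step, viewed through p-values instead of AIC.
lr_table <- drop1(fit1, test = "Chisq")
print(lr_table)

# stepAIC() returns the selected model; its 'anova' component lists
# the steps taken, each judged by AIC alone.
sel <- stepAIC(fit1, trace = FALSE)
print(sel$anova)
```

The columns LRT and Pr(>Chi) in the drop1() table are the per-term likelihood ratio statistics and p-values for that one step.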
Re: [R] Stepwise logistic regression with significance testing - stepAIC
There is not a meaningful alternative way, since the way you propose is not meaningful. The Wald tests have some known problems even in the well-defined cases. Both types of tests are designed to test a predefined hypothesis, not a hypothesis conditional on the stepwise procedure.

It is best to use approaches other than stepwise selection (which has been shown to give biased results), such as the lasso. If you need to use stepwise, then you should bootstrap the entire selection process to get better estimates/standard errors. Frank Harrell's book and package go into more detail on this and provide some tools to help (as do the other packages that can be used).

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter-Heinz Fox
Sent: Tuesday, May 05, 2009 8:02 AM
To: r-help@r-project.org
Subject: [R] Stepwise logistic regression with significance testing - stepAIC
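As an illustration of the lasso alternative mentioned in the reply, here is a minimal sketch. It assumes the glmnet package is installed (it is not part of base R), and the simulated data, the seed, and the true effect of x3 are all made up for the example rather than taken from the thread.

```r
library(glmnet)  # assumed to be installed; not part of base R

set.seed(7)      # assumed seed, for repeatability only
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
# Hypothetical truth: only x3 affects the response.
y <- rbinom(n, 1, plogis(-0.3 + 0.8 * x[, "x3"]))

# alpha = 1 selects the lasso penalty; the penalty strength lambda
# is chosen by cross-validation (10 folds by default).
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1)

# Coefficients at the cross-validated lambda; terms shrunk exactly
# to zero are effectively dropped from the model.
lasso_coef <- coef(cv_fit, s = "lambda.min")
print(lasso_coef)
```

Selection and shrinkage happen in one penalised fit here, rather than through a sequence of tests on nested models.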
Re: [R] Stepwise logistic regression with significance testing - stepAIC
Greg Snow wrote:
> There is not a meaningful alternative way, since the way you propose is not meaningful. The Wald tests have some known problems even in the well-defined cases. Both types of tests are designed to test a predefined hypothesis, not a hypothesis conditional on the stepwise procedure.
>
> It is best to use approaches other than stepwise selection (which has been shown to give biased results), such as the lasso. If you need to use stepwise, then you should bootstrap the entire selection process to get better estimates/standard errors.

For bootstrapping the stepAIC procedure you may have a look at package bootStepAIC.

Best,
Dimitris

> Frank Harrell's book and package go into more detail on this and provide some tools to help (as do the other packages that can be used).
>
> Hope this helps,

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
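To make "bootstrap the entire selection process" concrete, here is a rough hand-rolled sketch using only MASS. It is not the bootStepAIC implementation; the simulated data, the seed, the number of resamples B, and the names candidates, selected and inclusion are assumptions made for the illustration. Each resample is refitted and passed through stepAIC from scratch, and the share of resamples in which each variable survives is recorded.

```r
library(MASS)  # provides stepAIC()

set.seed(42)   # assumed seed, for repeatability only
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# Hypothetical truth: only x3 affects the response.
dat$y <- rbinom(n, 1, plogis(-0.3 + 0.8 * dat$x3))

candidates <- c("x1", "x2", "x3")
B <- 100       # number of bootstrap resamples (kept small for speed)

selected <- matrix(FALSE, nrow = B, ncol = length(candidates),
                   dimnames = list(NULL, candidates))

for (b in seq_len(B)) {
  # Resample rows with replacement, then repeat the WHOLE procedure:
  # fit the full model and run the stepwise search again.
  boot_dat <- dat[sample(n, replace = TRUE), ]
  full <- glm(y ~ x1 + x2 + x3, family = binomial, data = boot_dat)
  picked <- stepAIC(full, trace = FALSE)
  selected[b, ] <- candidates %in% attr(terms(picked), "term.labels")
}

# Proportion of resamples in which each variable was retained.
inclusion <- colMeans(selected)
print(inclusion)
```

Variables that are retained in only a modest share of resamples are ones whose selection in a single run of stepAIC should not be trusted much.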