Re: [R] Logistic regression with multiple imputation

Rafael Björk Wed, 30 Jun 2010 05:52:00 -0700

In addition to the tips above, you may want to check out:
http://www.stat.columbia.edu/~gelman/arm/missing.pdf


2010/6/30 Chuck Cleland <cclel...@optonline.net>

> On 6/30/2010 1:14 AM, Daniel Chen wrote:
> > Hi,
> >
> > I am a long time SPSS user but new to R, so please bear with me if my
> > questions seem to be too basic for you guys.
> >
> > I am trying to figure out how to analyze survey data using logistic
> > regression with multiple imputation.
> >
> > I have survey data with about 200,000 cases and I am trying to predict
> > the odds ratio of a dependent variable using 6 categorical independent
> > variables (dummy-coded). Approximately 10% of the cases (~20,000) have
> > missing data in one or more of the independent variables. The percentage
> > of missing values ranges from 0.01% to 10% across the independent
> > variables.
> >
> > My current thinking is to conduct a logistic regression with multiple
> > imputation, but I don't know how to do it in R. I searched the web but
> > couldn't find instructions or examples on how to do this. Since SPSS is
> > hopeless with missing data, I have to learn to do this in R. I am new to
> > R, so I would really appreciate it if someone could show me some examples
> > or tell me where to find resources.
>
>   Here is an example using the Amelia package to generate imputations
> and the mitools and mix packages to make the pooled inferences.
>
> titanic <- read.table(
>   "http://lib.stat.cmu.edu/S/Harrell/data/ascii/titanic.txt",
>   sep=',', header=TRUE)
>
> set.seed(4321)
>
> titanic$sex[sample(nrow(titanic), 10)] <- NA
> titanic$pclass[sample(nrow(titanic), 10)] <- NA
> titanic$survived[sample(nrow(titanic), 10)] <- NA
>
> library(Amelia) # generate multiple imputations
> library(mitools) # for MIextract()
> library(mix) # for mi.inference()
>
> # m = 10 imputed data sets; survived, pclass and sex are treated as nominal
> titanic.amelia <- amelia(
>   subset(titanic, select=c('survived','pclass','sex','age')),
>   m=10, noms=c('survived','pclass','sex'), emburn=c(500,500))
>
> # fit the same logistic regression to each imputed data set
> allimplogreg <- lapply(titanic.amelia$imputations,
>   function(x){glm(survived ~ pclass + sex + age, family=binomial, data = x)})
>
> # per-imputation coefficients and standard errors for mi.inference()
> mice.betas.glm <- MIextract(allimplogreg, fun=function(x){coef(x)})
> mice.se.glm <- MIextract(allimplogreg,
>   fun=function(x){sqrt(diag(vcov(x)))})
>
> as.data.frame(mi.inference(mice.betas.glm, mice.se.glm))
>
> # Or using only mitools for pooled inference
>
> betas <- MIextract(allimplogreg, fun=coef)
> vars <- MIextract(allimplogreg, fun=vcov)
> summary(MIcombine(betas,vars))
>
> > Thank you!
> >
> > Daniel
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc. (www.ndri.org)
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>

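For readers wondering what the pooling step in the quoted example does: both mi.inference() and MIcombine() combine the per-imputation results using Rubin's rules. The following is only a minimal base-R sketch of those rules, not the code of either package. The helper name pool_rubin and the toy numbers are made up for illustration, and the degrees of freedom use the classic Rubin (1987) formula without any small-sample adjustment.

```r
# Hypothetical helper (not part of mitools or mix): pool m sets of
# coefficient estimates and standard errors with Rubin's rules.
pool_rubin <- function(betas, ses) {
  m    <- length(betas)              # number of imputed data sets
  Q    <- do.call(rbind, betas)      # m x p matrix of estimates
  U    <- do.call(rbind, ses)^2      # m x p matrix of squared standard errors
  qbar <- colMeans(Q)                # pooled point estimate
  ubar <- colMeans(U)                # average within-imputation variance
  b    <- apply(Q, 2, var)           # between-imputation variance
  tot  <- ubar + (1 + 1/m) * b       # total variance
  df   <- (m - 1) * (1 + ubar / ((1 + 1/m) * b))^2  # Rubin (1987) df
  data.frame(est = qbar, se = sqrt(tot), df = df)
}

# Toy input (made-up numbers): two coefficients from m = 3 imputed fits
toy_betas <- list(c(a = 1.0, b = 2.0), c(a = 1.2, b = 2.2), c(a = 0.8, b = 1.8))
toy_ses   <- list(c(a = 0.5, b = 0.4), c(a = 0.5, b = 0.4), c(a = 0.5, b = 0.4))
res <- pool_rubin(toy_betas, toy_ses)
print(res)
```

With the objects from the quoted example, pool_rubin(mice.betas.glm, mice.se.glm) should give estimates and standard errors comparable to the mi.inference() output, assuming MIextract() returned a plain list of named vectors as it does there.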
