Hello!

If I understand this listserve correctly, I can email this address to get help when I am struggling with code. If this is inaccurate, please let me know, and I will unsubscribe.

I have been struggling with the same error message for a while, and I can't seem to get past it. Here is the issue: I am using a data set that uses -1:-9 to indicate various kinds of missing data. I changed all of these to NA, regardless of the cause of the missing data. I am trying to do propensity score matching with this data, but it will not calculate the propensity scores, regardless of which method I have tried. I have tried the following methods:

1. Optimal propensity score matching, using the MatchIt library:

    m.out<-matchit(assignment~totalexp + yrschool+new+cert+age+STratio +
        percminority+urbanicity+povproblem+numthreats+numbattack+weight,
        data = data, distance="logit", method = "optimal", ratio = 1)

2. Nearest neighbor propensity score matching, using the MatchIt library:

    mout<-matchit(assignment~totalexp + yrschool+new+cert+age+STratio+
        percminority+urbanicity+povproblem+numthreats+numbattack,
        distance = "logit", replace = T, data = data, method = "nearest",
        m.order="largest", caliper = 0.10)

3. Just calculating the propensity scores using the glm function:

    ps.model = glm(assignment~totalexp + yrschool+new+cert+age+STratio+
        percminority+urbanicity+povproblem+numthreats+numbattack,
        family = "binomial", data = data)
    data$propensityscores = fitted(ps.model)
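For context, the recoding of -1 to -9 into NA mentioned above, along with a check of how many NAs each column has, could be sketched roughly as follows. This is only an illustration under assumptions: the toy data frame and the helper name recode_missing are made up for this example and are not my actual data or code.

```r
# Toy data standing in for the real data set (all values are made up).
toy <- data.frame(
  assignment = c(1, 0, 1, 0, 1),
  totalexp   = c(12, -9, 3, 20, -1),
  age        = c(45, 38, -8, 51, 29)
)

# recode_missing() is a hypothetical helper: it replaces the
# missing-data codes -1 through -9 with NA in every numeric column.
recode_missing <- function(df) {
  df[] <- lapply(df, function(x) {
    if (is.numeric(x)) x[x %in% -1:-9] <- NA
    x
  })
  df
}

toy <- recode_missing(toy)

# Number of NAs in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Number of rows with no NA in any column
n_complete <- sum(complete.cases(toy))
print(n_complete)
```

Running the same two checks on the real data before and after imputation should show which columns still contain NAs and how many complete rows are left.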
In each case, I have tried running the code after having performed zero imputations, 1 imputation, and 5 imputations. A colleague looked at my code and assured me that I was doing the imputations correctly. However, even after performing the imputation, one of the continuous variables still has NAs. This is the code that I am using for 5 imputations:

    library(mice)
    #Remove weights
    data$weight<-NULL
    #perform the imputation
    imputed.data = mice(data, m = 5, diagnostics = F)
    #reinsert the weights
    imputed.data.final=complete(imputed.data)
    imputed.data.final$weight=lbdata$weight
    #rename the imputed dataset "data"
    data = imputed.data.final

When I perform optimal propensity score matching or nearest neighbor matching (regardless of how many imputations I perform), I get the following error:

    Error in matchit(assignment ~ totalexp + yrschool + new + cert + age + :
      Missing values exist in the data

I tried running these with just two of the categorical covariates, but I still got this error, even though there is no missing data for those variables.

When I perform the glm function to get the propensity scores, I get this error, indicating that, for some reason, it is reducing the number of rows in my data set, which makes me think that it is doing list-wise deletion:

    Error in `$<-.data.frame`(`*tmp*`, "propensityscores", value = c(0.116801691392172, :
      replacement has 15934 rows, data has 16844

However, this method works if I remove the covariate that has missing data.

So, I guess my question is, how do I get the code to impute for the variable that it is not imputing? Or, do I just need to chuck this variable? And, if I just need to chuck this variable, how do I get the optimal propensity score method to work? Currently it doesn't work even when I chuck this variable.

Thank you for any help or advice!
Liz