I'm trying to get predicted probabilities out of a logistic regression model, but I am having trouble with the "newdata" argument of the predict() function. Suppose I have a model with two independent variables, like this:

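set.seed(1)              # arbitrary seed, added so the simulated data are reproducible
y=rbinom(100, 1, .3)     # binary outcome
x1=rbinom(100, 1, .5)    # binary predictor
x2=rnorm(100, 3, 2)      # continuous predictor
mod=glm(y ~ x1 + x2, family=binomial)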

I can then get the predicted probabilities for the two values of x1, holding x2 constant at 0 like this:

p2=predict(mod, type="response", newdata=as.data.frame(cbind(x1, x2=0)))
unique(p2)
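
As a cross-check on the step above, here is a minimal sketch that asks for exactly two predictions (one per value of x1) instead of calling unique() on 100 of them, and compares them with the inverse logit of the linear predictor computed by hand. The seed and the object names nd, p_two, and p_hand are my own choices for illustration.

```r
# Self-contained sketch; the seed is arbitrary and only makes the run reproducible.
set.seed(1)
y  <- rbinom(100, 1, .3)
x1 <- rbinom(100, 1, .5)
x2 <- rnorm(100, 3, 2)
mod <- glm(y ~ x1 + x2, family = binomial)

# One row per scenario: x1 = 0 and x1 = 1, with x2 held at 0.
nd <- data.frame(x1 = c(0, 1), x2 = 0)
p_two <- predict(mod, type = "response", newdata = nd)

# The same probabilities by hand: inverse logit of the linear predictor.
# With x2 = 0, only the intercept and the x1 coefficient contribute.
b <- coef(mod)
p_hand <- plogis(c(b[["(Intercept)"]], b[["(Intercept)"]] + b[["x1"]]))

all.equal(unname(p_two), p_hand)
```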

However, I am running regressions as part of a function I wrote, which feeds in the independent variables to the regression in matrix form, like this:

dat=cbind(x1, x2)
mod2=glm(y ~ dat, family=binomial)

The coefficient estimates are the same as in mod. Yet I cannot figure out what to pass to the "newdata" argument of predict() in order to generate the same predicted probabilities as above. The same code as above does not work:

p2a=predict(mod2, type="response", newdata=as.data.frame(cbind(x1, x2=0)))
unique(p2a)

Nor does creating a data frame that has the names "datx1" and "datx2," which is how the variables appear if you run summary() on mod2. Looking at the model frame of mod2 shows that it contains only two variables, the dependent variable y and one independent variable called "dat." It is as if my two variables x1 and x2 have become two levels in a factor variable called "dat."

names(mod2$model)
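
To look a little closer at what is stored, the following sketch inspects the "dat" entry of the model frame. As far as I can tell it is kept as a single two-column matrix rather than as a factor, and the coefficient names are just "dat" pasted onto the matrix's column names. The seed and the name mf are assumptions added for illustration.

```r
# Self-contained sketch; the seed is arbitrary.
set.seed(1)
y  <- rbinom(100, 1, .3)
x1 <- rbinom(100, 1, .5)
x2 <- rnorm(100, 3, 2)
dat  <- cbind(x1, x2)
mod2 <- glm(y ~ dat, family = binomial)

mf <- mod2$model        # the model frame stored with the fit
names(mf)               # the variables in the model frame
is.factor(mf$dat)       # is "dat" a factor?
is.matrix(mf$dat)       # or a matrix column?
dim(mf$dat)             # rows and columns of that entry
colnames(mf$dat)        # column names carried over from cbind()
names(coef(mod2))       # coefficient names built from "dat" + column name
```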

My question is this: if I have a fitted model like mod2, how do I use the "newdata" argument of predict() so that I can get the predicted values I am after? That is, how do I create a data frame with one variable called "dat" whose two levels represent my (modified) variables x1 and x2?
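
One possibility, offered as a sketch rather than a confirmed answer: since "dat" seems to be stored as a matrix, "newdata" could be a data frame whose single column "dat" is itself a two-column matrix, protected with I() so that data.frame() does not split it apart. The seed and the names new_mat, nd, p_ref, and p2b are hypothetical and used only for illustration.

```r
# Self-contained sketch; the seed is arbitrary.
set.seed(1)
y  <- rbinom(100, 1, .3)
x1 <- rbinom(100, 1, .5)
x2 <- rnorm(100, 3, 2)
mod  <- glm(y ~ x1 + x2, family = binomial)
dat  <- cbind(x1, x2)
mod2 <- glm(y ~ dat, family = binomial)

# New predictor values as a matrix with the same column order as dat:
# x1 = 0 and x1 = 1, with x2 held at 0.
new_mat <- cbind(x1 = c(0, 1), x2 = 0)

# I() keeps the matrix as one data-frame column named "dat".
nd <- data.frame(dat = I(new_mat))
p2b <- predict(mod2, type = "response", newdata = nd)

# Reference predictions from the model fitted with separate variables.
p_ref <- predict(mod, type = "response",
                 newdata = data.frame(x1 = c(0, 1), x2 = 0))

all.equal(unname(p_ref), unname(p2b))
```

The columns of the new matrix presumably need to be in the same order as in the original dat, because the coefficients are matched to columns by position.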

Thanks in advance!

Andrew Miles
Department of Sociology
Duke University

