Re: [R] formatting data for predict()

2010-09-26 Thread Ista Zahn
Hi Andrew,
My inclination would be to put all the variables in a data.frame
instead of putting the predictors in a matrix. But if you want to
continue down this road, you need to have a column named dat in the
data.frame that contains a matrix. I couldn't figure out how to do
such a thing in a single call, so I had to create it in a separate
step:

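# Start with a placeholder column so the data.frame has one row per value of x1
newdat <- data.frame(y = rep(NA, length(unique(x1))))
# Add a single column named dat that holds a two-column matrix
newdat$dat <- cbind(x1 = unique(x1), x2 = 0)
p2a <- predict(mod2, type = "response", newdata = newdat)
p2a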
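
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same idea. The seed, the explicit c(0, 1) values for x1 (instead of unique(x1)), and the names ref, newmat and newdat1 are my own additions for illustration. The last part uses I() to build the matrix column in a single data.frame() call; treat that as a suggestion to try rather than something verified in the original exchange.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
y  <- rbinom(100, 1, 0.3)
x1 <- rbinom(100, 1, 0.5)
x2 <- rnorm(100, 3, 2)

mod  <- glm(y ~ x1 + x2, family = binomial)
dat  <- cbind(x1, x2)
mod2 <- glm(y ~ dat, family = binomial)

# Reference predictions from the ordinary model, with x2 held at 0
ref <- predict(mod, type = "response",
               newdata = data.frame(x1 = c(0, 1), x2 = 0))

# newdata for mod2: one column named dat that holds a two-column matrix
newmat <- cbind(x1 = c(0, 1), x2 = 0)
newdat <- data.frame(row = seq_len(nrow(newmat)))  # placeholder column
newdat$dat <- newmat
p2a <- predict(mod2, type = "response", newdata = newdat)

# Possible single-call variant: I() keeps the matrix as one column
newdat1 <- data.frame(dat = I(newmat))
p2b <- predict(mod2, type = "response", newdata = newdat1)

# Both matrix-based predictions should agree with the reference
all.equal(unname(ref), unname(p2a))
all.equal(unname(ref), unname(p2b))
```

If both comparisons come back TRUE, the matrix-column newdata is reproducing the predictions from mod.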

Hope it helps,
Ista

On Sun, Sep 26, 2010 at 4:38 AM, Andrew Miles rstuff.mi...@gmail.com wrote:
 I'm trying to get predicted probabilities out of a regression model, but am
 having trouble with the newdata option in the predict() function.  Suppose
 I have a model with two independent variables, like this:

 y=rbinom(100, 1, .3)
 x1=rbinom(100, 1, .5)
 x2=rnorm(100, 3, 2)
 mod=glm(y ~ x1 + x2, family=binomial)

 I can then get the predicted probabilities for the two values of x1, holding
 x2 constant at 0 like this:

 p2=predict(mod, type="response", newdata=as.data.frame(cbind(x1, x2=0)))
 unique(p2)

 However, I am running regressions as part of a function I wrote, which feeds
 in the independent variables to the regression in matrix form, like this:

 dat=cbind(x1, x2)
 mod2=glm(y ~ dat, family=binomial)

 The results are the same as in mod.  Yet I cannot figure out how to input
 information into the newdata option of predict() in order to generate the
 same predicted probabilities as above.  The same code as above does not
 work:

 p2a=predict(mod2, type="response", newdata=as.data.frame(cbind(x1, x2=0)))
 unique(p2a)

 Nor does creating a data frame that has the names datx1 and datx2, which
 is how the variables appear if you run a summary() on mod2.  Looking at the
 model frame of mod2 shows that the fitted model holds only two variables,
 the dependent variable y and one independent variable called dat.  It is
 as if my two variables x1 and x2 have become two levels in a factor variable
 called dat.

 names(mod2$model)

 My question is this:  if I have a fitted model like mod2, how do I use the
 newdata option in the predict function so that I can get the predicted
 values I am after?  I.E. how do I recreate a data frame with one variable
 called dat that contains two levels which represent my (modified)
 variables x1 and x2?

 Thanks in advance!

 Andrew Miles
 Department of Sociology
 Duke University

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



[R] formatting data for predict()

2010-09-25 Thread Andrew Miles
I'm trying to get predicted probabilities out of a regression model,  
but am having trouble with the newdata option in the predict()  
function.  Suppose I have a model with two independent variables, like  
this:


y=rbinom(100, 1, .3)
x1=rbinom(100, 1, .5)
x2=rnorm(100, 3, 2)
mod=glm(y ~ x1 + x2, family=binomial)

I can then get the predicted probabilities for the two values of x1,  
holding x2 constant at 0 like this:


p2=predict(mod, type="response", newdata=as.data.frame(cbind(x1, x2=0)))
unique(p2)

However, I am running regressions as part of a function I wrote, which  
feeds in the independent variables to the regression in matrix form,  
like this:


dat=cbind(x1, x2)
mod2=glm(y ~ dat, family=binomial)
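
One way to keep a matrix-based interface while avoiding the newdata problem is to convert the matrix to data-frame columns inside the function before fitting. The sketch below assumes the matrix has column names; fit_from_matrix, mod3 and p3 are hypothetical names, not the poster's actual function.

```r
# Hypothetical helper: fit a logistic regression from a response vector and
# a predictor matrix, storing the predictors as ordinary data-frame columns.
fit_from_matrix <- function(y, X) {
  stopifnot(is.matrix(X), !is.null(colnames(X)), length(y) == nrow(X))
  df <- data.frame(y = y, X)          # matrix columns become x1, x2, ...
  glm(y ~ ., data = df, family = binomial)
}

set.seed(1)  # assumption: seed added only to make the sketch reproducible
y  <- rbinom(100, 1, 0.3)
x1 <- rbinom(100, 1, 0.5)
x2 <- rnorm(100, 3, 2)

mod3 <- fit_from_matrix(y, cbind(x1, x2))
names(coef(mod3))   # "(Intercept)" "x1" "x2"

# newdata can now be an ordinary data frame with columns x1 and x2
p3 <- predict(mod3, type = "response",
              newdata = data.frame(x1 = c(0, 1), x2 = 0))
p3
```

Because the coefficients keep the original column names, predict() can match newdata by name in the usual way.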

The results are the same as in mod.  Yet I cannot figure out how to  
input information into the newdata option of predict() in order to  
generate the same predicted probabilities as above.  The same code as  
above does not work:


p2a=predict(mod2, type="response", newdata=as.data.frame(cbind(x1, x2=0)))

unique(p2a)

Nor does creating a data frame that has the names datx1 and datx2,
which is how the variables appear if you run a summary() on mod2.
Looking at the model frame of mod2 shows that the fitted model holds
only two variables, the dependent variable y and one independent
variable called dat.  It is as if my two variables x1 and x2 have
become two levels in a factor variable called dat.


names(mod2$model)
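
A few extra checks on the stored model frame can show what dat actually is. This is only a sketch with a seed added for reproducibility; it suggests that dat is kept as a single two-column numeric matrix rather than a factor, which is why newdata needs a matrix column with that same name.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
y  <- rbinom(100, 1, 0.3)
x1 <- rbinom(100, 1, 0.5)
x2 <- rnorm(100, 3, 2)
dat  <- cbind(x1, x2)
mod2 <- glm(y ~ dat, family = binomial)

names(mod2$model)                  # "y" "dat"
is.matrix(mod2$model$dat)          # TRUE: dat is a matrix column
is.factor(mod2$model$dat)          # FALSE: it is not a factor
dim(mod2$model$dat)                # 100 rows, 2 columns
attr(terms(mod2), "dataClasses")   # dat is recorded as "nmatrix.2"
colnames(model.matrix(mod2))       # "(Intercept)" "datx1" "datx2"
```

The datx1 and datx2 labels seen in summary() are the term name dat pasted onto the matrix column names, not the names of separate variables.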

My question is this:  if I have a fitted model like mod2, how do I use  
the newdata option in the predict function so that I can get the  
predicted values I am after?  I.E. how do I recreate a data frame with  
one variable called dat that contains two levels which represent my  
(modified) variables x1 and x2?


Thanks in advance!

Andrew Miles
Department of Sociology
Duke University
