Hey all,

The code below creates a partial dependence plot for the variable x1 in the
linear model y ~ x1 + x1^2 + x2.

I have noticed that the for loop in the code takes a long time to run if the
size of the data is increased.  Is there a way to change the for loop into
an apply statement?  The tricky part is that I need to change the values of
x1 in each step of the loop to build the dataset to make predictions on:
cbind(m[,-match("x1",names(m))],x1=a[1,i+1]).  If I try to add the 1112
columns to the dataset beforehand and use apply, the code won't work
because the predict function needs a column labeled "x1".  I
realize I could just grab the form of the linear function and use that
instead of predict(), but I don't want to do that because I want to make
this code applicable to generic model fits.

#create fake data and fit a simple linear regression model
x1 <- rep(c(1,3,4),100)
x2 <- rep(c(1:6),50)
y <- 40+2*x1^.5 - 6*x2 + rnorm(300,0,2)  #one noise draw per row (300 rows)
m <- as.data.frame(cbind(y,x1,x2))
lm1 <- lm(y~x1+I(x1^2)+x2,data=m)

#super small version of R code for partial dependence plot
#row 1 of a: grid of 1112 x1 values; row 2: placeholder for the averaged predictions
a <- rbind(c(0:1111)*(max(m$x1)-min(m$x1))/1111 + min(m$x1),
           c(0:1111)*0-99999)
for(i in c(0:1111))
{
  a[2,i+1] <- mean(predict(lm1,cbind(m[,-match("x1",names(m))],x1=a[1,i+1])))
}
plot(a[1,],a[2,],xlab="x1",ylab="Response",type="l",
     main="Partial Dependence Plot")
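
For reference, here is a rough sketch of what an apply-style version of the
loop above might look like.  It is only one possible approach, not a tested
general solution: the names grid and pd_at are hypothetical helpers made up
for this sketch, and set.seed(1) is added only so the example is
reproducible.  The idea is to overwrite the x1 column inside a function, so
the column keeps its name and predict() still finds it.

```r
# Simulated data and model, mirroring the setup in this post
set.seed(1)  # assumption: added only for reproducibility
x1 <- rep(c(1, 3, 4), 100)
x2 <- rep(1:6, 50)
y  <- 40 + 2 * x1^0.5 - 6 * x2 + rnorm(300, 0, 2)
m  <- data.frame(y, x1, x2)
lm1 <- lm(y ~ x1 + I(x1^2) + x2, data = m)

# Grid of 1112 evenly spaced x1 values (hypothetical name: grid)
grid <- seq(min(m$x1), max(m$x1), length.out = 1112)

# Hypothetical helper: set column `var` to the constant v in a local
# copy of the data, then average the model predictions over all rows
pd_at <- function(v, fit, data, var) {
  data[[var]] <- rep(v, nrow(data))
  mean(predict(fit, newdata = data))
}

# One averaged prediction per grid value
pd <- vapply(grid, pd_at, numeric(1), fit = lm1, data = m, var = "x1")

plot(grid, pd, type = "l", xlab = "x1", ylab = "Response",
     main = "Partial Dependence Plot")
```

Because the column is replaced by name, the same helper should work for any
fitted model whose predict() method accepts newdata.  Note that this still
calls predict() once per grid value, so it mainly tidies the loop and is not
guaranteed to run faster on large data.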

Many thanks,

Mike
