Hey all,

The code below creates a partial dependence plot for the variable x1 in the linear model y ~ x1 + x1^2 + x2.
I have noticed that the for loop in the code takes a long time to run as the size of the data increases. Is there a way to change the for loop into an apply statement? The tricky part is that I need to change the value of x1 in each step of the loop to get the appropriate dataset to make predictions on:

    cbind(m[,-match("x1",names(m))],x1=a[1,i+1])

If I try to add the 1112 columns to the dataset a priori and use apply, the code won't work because the predict function needs a column labeled "x1". I realize I could just grab the form of the linear function and use that instead of predict(), but I don't want to do that because I want this code to be applicable to generic model fits.

#create fake data and fit a simple linear regression model
x1 <- rep(c(1,3,4),100)
x2 <- rep(c(1:6),50)
#one noise draw per observation (rnorm(100,...) would be recycled across the 300 rows)
y <- 40+2*x1^.5 - 6*x2 + rnorm(length(x1),0,2)
m <- as.data.frame(cbind(y,x1,x2))
lm1 <- lm(y~x1+I(x1^2)+x2,data=m)

#super small version of R code for partial dependence plot
#row 1 of a: grid of 1112 x1 values from min to max; row 2: placeholder for the results
a <- rbind(c(0:1111)*(max(m$x1)-min(m$x1))/1111 + min(m$x1),c(0:1111)*0-99999)
for(i in c(0:1111))
{
  #replace x1 by the i-th grid value in every row and average the predictions
  a[2,i+1] <- mean(predict(lm1,cbind(m[,-match("x1",names(m))],x1=a[1,i+1])))
}
plot(a[1,],a[2,],xlab="x1",ylab="Response",type="l",main="Partial Dependence Plot")

Many thanks,
Mike
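
For illustration, here is a minimal sketch of how the loop described above might be written with sapply() while still going through predict(), so that it is not tied to the linear form. The helper name partial_dependence and its arguments are hypothetical, not part of the original code, and the sketch assumes the fitted model's predict() method accepts a newdata data frame. It overwrites the x1 column in a local copy of the data for each grid value, so predict() always sees a column named "x1". Whether this is noticeably faster than the for loop would need to be timed, since most of the cost is likely in the repeated predict() calls.

```r
# Fake data as in the post; set.seed() and the per-row noise draw are assumptions
set.seed(1)
x1 <- rep(c(1, 3, 4), 100)
x2 <- rep(1:6, 50)
y  <- 40 + 2 * x1^0.5 - 6 * x2 + rnorm(length(x1), 0, 2)
m  <- data.frame(y, x1, x2)
lm1 <- lm(y ~ x1 + I(x1^2) + x2, data = m)

# Hypothetical helper: for each grid value, set `var` to that value in every
# row of a local copy of `data`, then average the model's predictions.
partial_dependence <- function(fit, data, var, grid) {
  sapply(grid, function(v) {
    data[[var]] <- v                      # changes only the copy inside this function
    mean(predict(fit, newdata = data))
  })
}

# Same 1112-point grid as the original loop, from min(x1) to max(x1)
grid <- seq(min(m$x1), max(m$x1), length.out = 1112)
pd   <- partial_dependence(lm1, m, "x1", grid)

plot(grid, pd, xlab = "x1", ylab = "Response", type = "l",
     main = "Partial Dependence Plot")
```

Because the helper only relies on predict(fit, newdata = ...), the same call should in principle work for other model classes whose predict() method returns a numeric vector by default.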