Simple question. For a simple linear regression, I obtained the "standard error of predicted means" for both a confidence interval and a prediction interval:

    x <- 1:15
    y <- x + rnorm(n = 15)
    model <- lm(y ~ x)

    predict.lm(model, newdata = data.frame(x = c(10, 20)), se.fit = TRUE, interval = "confidence")$se.fit
            1         2
    0.2708064 0.7254615

    predict.lm(model, newdata = data.frame(x = c(10, 20)), se.fit = TRUE, interval = "prediction")$se.fit
            1         2
    0.2708064 0.7254615

I was surprised to find that the standard errors returned in both cases were in fact the standard errors of the sampling distribution of Y_hat:

    sqrt(MSE * (1/n + (x - x_bar)^2 / SS_x))

not the standard errors of Y_new (the predicted value):

    sqrt(MSE * (1 + 1/n + (x - x_bar)^2 / SS_x))

Is there a reason this quantity is called the "standard error of predicted means" if it doesn't relate to the prediction distribution?

Turning to Neter et al.'s Applied Linear Statistical Models, I note that if we have multiple new observations, then the standard error of the mean of the predicted values:

    sqrt(MSE * (1/m + 1/n + (x - x_bar)^2 / SS_x))

reverts to the standard error of the sampling distribution of Y_hat as m, the number of new observations, gets large. Still, this doesn't explain the result for small sample sizes.

Using R 2.1 for Windows
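
For reference, here is a minimal sketch of how the two standard errors above could be checked by hand against what predict() returns. The seed and the names new_x, se_mean, se_pred and se_pred_from_fit are my own additions for illustration; the sketch assumes the usual textbook formulas quoted above and relies on the se.fit, residual.scale and df components that predict.lm returns when se.fit = TRUE.

```r
# Simulated data as in the post; the seed is an added assumption for reproducibility.
set.seed(1)
x <- 1:15
y <- x + rnorm(n = 15)
model <- lm(y ~ x)
new_x <- c(10, 20)

p <- predict(model, newdata = data.frame(x = new_x),
             se.fit = TRUE, interval = "prediction")

# Hand computation of the two standard errors from the textbook formulas.
n    <- length(x)
mse  <- sum(residuals(model)^2) / (n - 2)
ss_x <- sum((x - mean(x))^2)
se_mean <- sqrt(mse * (1/n + (new_x - mean(x))^2 / ss_x))      # SE of Y_hat
se_pred <- sqrt(mse * (1 + 1/n + (new_x - mean(x))^2 / ss_x))  # SE of Y_new

# se.fit should match the standard error of the estimated mean response.
print(all.equal(unname(p$se.fit), se_mean))

# The standard error of a new observation adds the residual variance to se.fit^2.
se_pred_from_fit <- sqrt(p$se.fit^2 + p$residual.scale^2)
print(all.equal(unname(se_pred_from_fit), se_pred))

# The lower prediction limit should equal fit minus the t quantile times se_pred.
t_crit <- qt(0.975, df = p$df)
print(all.equal(unname(p$fit[, "lwr"]),
                unname(p$fit[, "fit"]) - t_crit * se_pred))
```

If the three comparisons hold, it would suggest that se.fit is always the standard error of the fitted mean, and that the interval argument only changes how the lwr and upr limits are built from it.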