Simple question.

For a simple linear regression, I obtained the "standard error of
predicted means" twice, requesting first a confidence interval and then a
prediction interval:

x <- 1:15
y <- x + rnorm(n = 15)
model <- lm(y ~ x)
predict.lm(model, newdata = data.frame(x = c(10, 20)), se.fit = TRUE, interval = "confidence")$se.fit
        1         2
0.2708064 0.7254615

predict.lm(model, newdata = data.frame(x = c(10, 20)), se.fit = TRUE, interval = "prediction")$se.fit
        1         2
0.2708064 0.7254615


I was surprised to find that the standard errors returned were in fact the
standard errors of the sampling distribution of Y_hat:

sqrt(MSE * (1/n + (x - x_bar)^2 / SS_x)),

not the standard errors of Y_new (predicted value):

sqrt(MSE * (1 + 1/n + (x - x_bar)^2 / SS_x)).

Is there a reason this quantity is called the "standard error of predicted
means" if it doesn't relate to the prediction distribution?
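
A minimal sketch of how the two quantities can be checked numerically is
given here. It assumes the same toy data as in the example (with an arbitrary
seed added only for reproducibility, so the numbers will differ from those
printed earlier) and relies on the se.fit, residual.scale and df components
that predict() returns when se.fit = TRUE. The names se_mean_hand,
se_pred_hand and se_pred_R are made up for illustration.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- 1:15
y <- x + rnorm(n = 15)
model <- lm(y ~ x)
new_x <- c(10, 20)

n    <- length(x)
mse  <- sum(residuals(model)^2) / (n - 2)   # residual mean square
ss_x <- sum((x - mean(x))^2)

# Hand-computed standard errors from the two textbook formulas
se_mean_hand <- sqrt(mse * (1/n + (new_x - mean(x))^2 / ss_x))      # SE of Y_hat
se_pred_hand <- sqrt(mse * (1 + 1/n + (new_x - mean(x))^2 / ss_x))  # SE for Y_new

p <- predict(model, newdata = data.frame(x = new_x),
             se.fit = TRUE, interval = "prediction")

# se.fit is the SE of Y_hat, whichever interval type was requested
print(all.equal(unname(p$se.fit), se_mean_hand))

# The SE for a new observation can be rebuilt from the returned pieces
se_pred_R <- sqrt(p$se.fit^2 + p$residual.scale^2)
print(all.equal(unname(se_pred_R), se_pred_hand))

# The prediction interval itself is based on the larger SE
half_width <- unname(p$fit[, "upr"] - p$fit[, "lwr"]) / 2
print(all.equal(half_width, qt(0.975, df = p$df) * se_pred_hand))
```

If all three comparisons print TRUE, the prediction interval limits are built
from the Y_new standard error even though se.fit reports only the Y_hat
standard error.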

Turning to Neter et al.'s Applied Linear Statistical Models, I note that
if we predict the mean of m new observations at a given x, then the
standard error of that predicted mean:

sqrt(MSE * (1/m + 1/n + (x - x_bar)^2 / SS_x)),

reverts to the standard error of the sampling distribution of Y_hat as m,
the number of new observations, gets large. Still, this doesn't explain the
result for small m.
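
To illustrate the limiting behaviour, here is a rough sketch that evaluates
the formula for the mean of m new observations at x = 10 for a few values of
m. The helper se_mean_of_m is hypothetical (it is not part of R), and the
data and seed are the same arbitrary choices as before.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- 1:15
y <- x + rnorm(n = 15)
model <- lm(y ~ x)

p <- predict(model, newdata = data.frame(x = 10), se.fit = TRUE)
se_fit <- unname(p$se.fit)        # SE of Y_hat at x = 10
mse    <- p$residual.scale^2      # residual mean square

# Hypothetical helper: SE of the mean of m new observations at x = 10,
# i.e. sqrt(MSE * (1/m + 1/n + (x - x_bar)^2 / SS_x)), vectorised over m
se_mean_of_m <- function(m) sqrt(mse / m + se_fit^2)

m  <- c(1, 5, 100, 1e6)
se <- se_mean_of_m(m)
print(cbind(m, se))
print(se_fit)
```

With m = 1 this is the usual standard error for a single new observation, and
the values shrink towards se_fit as m grows.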

Using R.2.1 for Windows

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html