On Sun, 2006-10-29 at 12:54 -0500, Kevin Middleton wrote: > Based on some recent r-help discussions, I have been trying out > plotting confidence intervals using predict and matplot. Matplot > appeared to not be plotting the linear regression when using the > default column names generated by read.table (V1, V2, etc). On > further inspection, the error seemed to be with predict and vector > names (V1, V2) rather than with matplot. I was using some textbook > data sets that were read with read.table, so my data.frame columns > were named by default. The problem seems to be related to the name of > the vector only (though I may be wrong). > > The example below, based on that in ?predict.lm, better describes the > problem. Using predict with V2~V1 & y~x produces identical output > (when x<-V1 and y<-V2). However, using predict with > interval="confidence" results in different output from the same data. > That with y~x is correct, and V2~V1 is "incorrect".
This is because you have been caught out by the way newdata is handled in predict. This is your problem: new <- data.frame(x = seq(min(V1), max(V1), length.out = length(V1))) names(new) You have created new, a 1 column data frame with the colname "x". As "x" doesn't exist in the model formula (V2 ~ V1), what the predict line actually means, is to predict value for the observed data used to fit them model, i.e. the fitted values: pred.w.clim <- predict(lm(V2 ~ V1), new, interval="confidence") fit <- fitted(lm(V2 ~ V1)) all.equal(fit, pred.w.clim[, 1]) When you do predict(lm(y ~ x), new, interval="confidence"), "x" is now in the model formula *and* in "new" so you get predictions for the x values in "new". If you'd done this: new <- data.frame(V1 = seq(min(V1), max(V1), length.out = length(V1))) predict(lm(V2 ~ V1), new, interval="confidence") Then you'd have gotten what you wanted. HTH G > > This may be related to a previous post on r-help: > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/56982.html > > I can't figure out why there would be a difference in predict when > only the vector name seemingly changes. Granted, I could just read > the data.frame is as "x" and "y." > > Thanks > Kevin > > ###################### > > set.seed(10) > > # For example purposes, plot side by side > par(mfrow=c(1,2)) > > V1 <- rnorm(15) > V2 <- V1 + rnorm(15) > > new <- data.frame(x = seq(min(V1), max(V1), length.out = length(V1))) > > pred.w.clim <- predict(lm(V2 ~ V1), new, interval="confidence") > matplot(new$x, pred.w.clim, > lty=c(1,2,2), type="l", > col=c("black", "red", "red"), > ylab="predicted y") > points(V1,V2) > > > # Create x & y equal to V1 & V2 > x <- V1 > y <- V2 > > pred.w.clim2 <- predict(lm(y ~ x), new, interval="confidence") > matplot(new$x, pred.w.clim2, > lty=c(1,2,2), type="l", > col=c("black", "red", "red"), > ylab="predicted y") > points(x,y) > > # Test if V1=x, V2=y > all.equal(x,V1) > all.equal(y,V2) > > # Same output with predict > predict(lm(V2 ~ V1)) > predict(lm(y ~ x)) > all.equal(predict(lm(V2 ~ V1)), predict(lm(y ~ x))) > > # Different output with interval="confidence" > pred.w.clim > pred.w.clim2 > all.equal(pred.w.clim, pred.w.clim2) > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC [f] +44 (0)20 7679 0565 UCL Department of Geography Pearson Building [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street London, UK [w] http://www.ucl.ac.uk/~ucfagls/ WC1E 6BT [w] http://www.freshwaters.org.uk/ %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.