Radford Neal:
> I've encountered suggestions to use bootstrap in circumstances such as
> this before, but I've never understood them. The bootstrap samples
> will clearly violate the assumption of independent residuals that
> underlies the usual regression model. The bootstrap samples will
> have less diverse values for the predictor variables. So it seems to
> me that the bootstrap results will NOT be a good guide to what is
> going on with the actual sample.
It depends on what you want to do. The bootstrap is ideally suited if your objective is to investigate the distribution of the parameters. In a way, the resulting parameter distribution is very similar in meaning to what you would obtain analytically via Bayesian methods (see also de Finetti's theorem on exchangeability). Hastie, Tibshirani & Friedman elaborate on this similarity in their book.

When you're interested in the performance of a regression model, on the other hand, Breiman uses a scheme where you create a bootstrap sample for training and use the cases that weren't selected for testing. But I too tend to prefer a cross-validation scheme, ideally 2-fold with many replications. In my experience, leave-one-out does not penalize overfitting enough, and the resulting residual distributions are quite useless.

--
mag. Aleks Jakulin  http://ai.fri.uni-lj.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science, University of Ljubljana.

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
http://jse.stat.ncsu.edu/
=================================================================
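The first use described above — bootstrapping to approximate the parameter distribution — can be sketched as follows. This is a minimal illustration with synthetic data (all names and numbers are my own, not from the post): resample the (x, y) cases with replacement, refit the regression each time, and treat the collection of refitted coefficients as an approximation to the parameter distribution.

```python
# Sketch: case-resampling bootstrap of regression coefficients.
# Synthetic data and all variable names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2 + 3x + noise
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, 1.5, n)

def fit_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

B = 2000
boot = np.empty((B, 2))
for i in range(B):
    idx = rng.integers(0, n, n)          # resample cases with replacement
    boot[i] = fit_ols(x[idx], y[idx])

# The bootstrap replications approximate the parameter distribution;
# e.g. a 95% percentile interval for the slope:
lo, hi = np.percentile(boot[:, 1], [2.5, 97.5])
print(f"slope mean {boot[:, 1].mean():.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

Note this resamples whole cases rather than residuals, so it does not condition on the observed predictor values — which is exactly the property Neal objects to above.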
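The two performance-estimation schemes mentioned — Breiman's train-on-bootstrap / test-on-the-unselected-cases idea, and repeated 2-fold cross-validation — can be sketched side by side. This is an illustrative comparison on synthetic data, not code from any of the authors cited:

```python
# Sketch: out-of-bag vs. repeated 2-fold CV error for a linear model.
# Synthetic data; all names are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + rng.normal(0, 1.0, n)   # noise variance = 1

def fit_predict(x_tr, y_tr, x_te):
    """Fit y = a + b*x on the training cases, predict the test cases."""
    X = np.column_stack([np.ones_like(x_tr), x_tr])
    a, b = np.linalg.lstsq(X, y_tr, rcond=None)[0]
    return a + b * x_te

# Breiman-style scheme: train on a bootstrap sample, test on the
# cases that were not selected into it.
oob_mse = []
for _ in range(200):
    bag = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), bag)
    resid = y[oob] - fit_predict(x[bag], y[bag], x[oob])
    oob_mse.append(np.mean(resid ** 2))

# Repeated 2-fold CV: split in half, each half trains once and tests once.
cv_mse = []
for _ in range(100):
    perm = rng.permutation(n)
    first, second = perm[: n // 2], perm[n // 2:]
    for tr, te in ((first, second), (second, first)):
        resid = y[te] - fit_predict(x[tr], y[tr], x[te])
        cv_mse.append(np.mean(resid ** 2))

print(f"out-of-bag MSE:      {np.mean(oob_mse):.3f}")
print(f"repeated 2-fold MSE: {np.mean(cv_mse):.3f}")
```

Both estimates should land near the noise variance here. The 2-fold split trains on only half the data, which makes the fit harsher on flexible models — the property the post relies on when it says leave-one-out does not penalize overfitting enough.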
