Radford Neal:
> I've encountered suggestions to use bootstrap in circumstances such
> as this before, but I've never understood them.  The bootstrap
> samples will clearly violate the assumption of independent residuals
> that underlies the usual regression model.  The bootstrap samples
> will have less diverse values for the predictor variables.  So it
> seems to me that the bootstrap results will NOT be a good guide to
> what is going on with the actual sample.

It depends on what you want to do. Bootstrap is ideally suited if your
objective is to investigate the PDF of the parameters. In a way, the
resulting parameter distribution is very similar in meaning to what
you would obtain analytically via Bayesian methods (see also de
Finetti's theorem on exchangeability). Hastie, Tibshirani & Friedman
elaborate on this similarity in The Elements of Statistical Learning.
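To make the first point concrete, here is a minimal sketch (with a
hypothetical toy data set, not anyone's real data) of using the
bootstrap to approximate the distribution of a regression slope:
resample the (x, y) pairs with replacement, refit, and collect the
estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 2x + noise
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 * x + rng.normal(0, 1, n)

# Bootstrap: resample rows with replacement, refit, collect slopes.
B = 1000
slopes = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)              # sample n rows with replacement
    slope, intercept = np.polyfit(x[idx], y[idx], 1)
    slopes[b] = slope

# The empirical distribution of `slopes` approximates the slope's PDF;
# e.g., a 95% percentile interval:
lo, hi = np.percentile(slopes, [2.5, 97.5])
```

The histogram of `slopes` plays the same role as a Bayesian posterior
for the slope would, which is the similarity alluded to above.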

But when you're interested in the predictive performance of a
regression model, the picture changes. Breiman uses a scheme where you
train on a bootstrap sample and test on the observations that were not
selected (the "out-of-bag" samples). I myself tend to prefer a
cross-validation scheme, ideally 2-fold with many replications. In my
experience, leave-one-out does not penalize overfitting enough, and
the resulting residual distributions are quite useless.
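Both evaluation schemes can be sketched in a few lines (again on a
hypothetical toy data set; `fit_predict` is just a stand-in for
whatever model you are assessing):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: y = 2x + noise with unit variance
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 * x + rng.normal(0, 1, n)

def fit_predict(xtr, ytr, xte):
    slope, intercept = np.polyfit(xtr, ytr, 1)
    return slope * xte + intercept

# Breiman-style out-of-bag estimate: train on a bootstrap sample,
# test on the rows the resampling left out (~37% of them).
B = 200
oob_mse = []
for b in range(B):
    idx = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), idx)    # rows not drawn
    pred = fit_predict(x[idx], y[idx], x[oob])
    oob_mse.append(np.mean((y[oob] - pred) ** 2))

# Repeated 2-fold CV: split in half, train on each half, test on the other.
R = 100
cv_mse = []
for r in range(R):
    perm = rng.permutation(n)
    for half in (perm[: n // 2], perm[n // 2:]):
        rest = np.setdiff1d(perm, half)
        pred = fit_predict(x[rest], y[rest], x[half])
        cv_mse.append(np.mean((y[half] - pred) ** 2))
```

With a well-specified model both estimates should land near the noise
variance; the interesting comparisons arise when the model overfits.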

-- 
mag. Aleks Jakulin
http://ai.fri.uni-lj.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science, University of Ljubljana.



=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
