hadley wickham wrote:
> On Mon, Nov 10, 2008 at 9:22 AM, Mike Dunbar <[EMAIL PROTECTED]> wrote:
>> (apologies - I should have written coast * MBL not ML)
>>
>> I'm not sure of my ground here, but surely you do lose something -
>> you wouldn't retain coast:MBL if it's not significant, as you lose
>> degrees of freedom, and this gets worse the more terms and the more
>> interactions you consider.
>
> But if you drop the term you are effectively spending your degrees of
> freedom twice - once to estimate the effect that you drop, and then
> again in the new model. Another way to see the problem is to think
> about the null distribution of the p-values - if you only include
> significant p values in your model, the standard null hypothesis is
> clearly not appropriate.
>
> I think there's a good discussion of this in Frank Harrell's
> Regression Modeling Strategies, but unfortunately I don't have a copy
> on hand to point you to the exact location.
>
> Hadley

See e.g. sections 4.2 through 4.4 (pp. 56-60) of Harrell.

The discussion above does not mean that overfitted models are good, or
that there is no penalty for overspecifying a model (otherwise one
would always throw everything into the model), only that data-driven
model selection has some very fundamental problems ...

  cheers
    Ben Bolker

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
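
A minimal sketch of the point quoted above about the null distribution of
p-values after selection. It is a simplified stand-in, not the coast * MBL
model from this thread: the response is pure noise, several unrelated
candidate predictors are screened, and only the "best" one is reported. The
function names, sample sizes, and the Fisher z approximation used for the
p-value are all assumptions made for illustration (Python, standard library
only).

```python
import math
import random
from statistics import NormalDist


def association_p_value(x, y):
    """Approximate two-sided p-value for 'no linear association' between
    x and y, using the Fisher z-transform of the sample correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate(n_sims=2000, n_obs=50, n_candidates=5, alpha=0.05, seed=1):
    """Return the rejection rates under a true null for
    (a) one predictor chosen in advance, and
    (b) the predictor with the smallest p-value, chosen after looking."""
    rng = random.Random(seed)
    fixed_hits = 0
    selected_hits = 0
    for _ in range(n_sims):
        # The response is noise: no candidate has any real effect.
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values = []
        for _ in range(n_candidates):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            p_values.append(association_p_value(x, y))
        fixed_hits += p_values[0] < alpha      # term fixed before seeing data
        selected_hits += min(p_values) < alpha  # term kept because it "won"
    return fixed_hits / n_sims, selected_hits / n_sims


fixed_rate, selected_rate = simulate()
print(f"pre-specified term:  rejection rate {fixed_rate:.3f}")
print(f"data-selected term:  rejection rate {selected_rate:.3f}")
```

With a term fixed in advance the rejection rate should sit near alpha. For
the data-selected term it should be much higher (close to 1 - (1 - alpha)^k
for k independent candidates), which is one way to see that the usual null
distribution no longer applies once significance decides what stays in the
model.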