(Near) non-identifiability (especially in nonlinear models, which include linear mixed effects models, Bayesian hierarchical models, etc.) is typically a strong clue to over-fitting; it is usually indicated by software complaints (convergence failures, running up against iteration limits, etc.).
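To make "near non-identifiability" concrete, here is a minimal sketch, in plain Python rather than R so that it runs with no packages installed. It is a toy linear case, not one of the nonlinear models mentioned above, and the function names, sample size, jitter and noise level are all made up for illustration. Two predictors are almost copies of each other, so the data pin down only their sum: the individual coefficients swing from one simulated data set to the next and the cross-product matrix has a very large condition number. A closed-form solve like this one does not complain; it is an iterative fitter facing the same flat direction that tends to stall or hit its iteration limit.

```python
import math
import random


def fit_two_predictors(x1, x2, y):
    """Least squares for y ~ b1*x1 + b2*x2 (no intercept) via the 2x2 normal equations.

    Returns (b1, b2, cond), where cond is the condition number of X'X.
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))

    # Determinant of X'X; it approaches zero as x1 and x2 become collinear.
    det = s11 * s22 - s12 * s12

    # Cramer's rule for the two coefficients.
    b1 = (t1 * s22 - t2 * s12) / det
    b2 = (t2 * s11 - t1 * s12) / det

    # Eigenvalues of the symmetric 2x2 matrix X'X; the smaller one is
    # computed as det / lam_max to avoid subtracting two nearly equal numbers.
    tr = s11 + s22
    lam_max = (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0
    lam_min = det / lam_max
    return b1, b2, lam_max / lam_min


def simulate(seed, n=50, jitter=1e-3, noise_sd=0.1):
    """Simulate y = x1 + noise, with x2 a near-copy of x1, then fit both predictors."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [a + jitter * rng.gauss(0, 1) for a in x1]  # nearly collinear with x1
    y = [a + noise_sd * rng.gauss(0, 1) for a in x1]
    return fit_two_predictors(x1, x2, y)


if __name__ == "__main__":
    for seed in range(5):
        b1, b2, cond = simulate(seed)
        print(f"seed={seed}  b1={b1:9.2f}  b2={b2:9.2f}  "
              f"b1+b2={b1 + b2:6.3f}  cond={cond:.1e}")
```

In the printed table, compare the b1 and b2 columns with the b1+b2 column: the sum is the identifiable quantity and should stay near 1, while the separate coefficients are essentially arbitrary.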
However, non-identifiability is sufficient-ish, not necessary: "over-fitting" frequently occurs even without such overt complaints. It should also be said that, except for identifiability, "over-fitting" is not a well-defined statistical term: it depends on the scientific context.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Steve Lianoglou
Sent: Sunday, May 09, 2010 6:13 PM
To: David Winsemius
Cc: r-help@r-project.org; bbslover
Subject: Re: [R] How to estimate whether overfitting?

On Sun, May 9, 2010 at 11:53 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On May 9, 2010, at 9:20 AM, bbslover wrote:
>
>> 1. is there some criterion to estimate overfitting? e.g. R2 and Q2 in the
>> training set, as well as R2 in the test set, when means overfitting. for
>> example, in my data, I have R2=0.94 for the training set and for the
>> test set R2=0.70, is overfitting?
>>
>> 2. in this scatter, can one say this overfitting?
>>
>> 3. my result is obtained by svm, and the sample are 156 and 52 for the
>> training and test sets, and predictors are 96, In this case, can svm be
>> employed to perform prediction? whether the number of the predictors are
>> too many ?
>
> I think you need to buy a copy of Hastie, Tibshirani, and Friedman and do
> some self-study of chapters 7 and 12.

And you don't even have to buy it before you can start studying since
the PDF is available here:
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Having a hard cover is always handy, tho ..
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.