"It gets curiouser and curiouser," said Alice. -- Bert
On Tue, May 8, 2012 at 9:07 PM, array chip <arrayprof...@yahoo.com> wrote:
> Paul, thanks for your thoughts. Blunt, not at all....
>
> If I understand correctly, it doesn't help to speculate about whether
> additional variables might exist. Given the current variables in the
> model, it's perfectly fine to draw conclusions based on significant
> coefficients, regardless of whether R-squared is high or low.
>
> Gary King's article is interesting...
>
> John
>
> ________________________________
> From: Paul Johnson <pauljoh...@gmail.com>
> Cc: peter dalgaard <pda...@gmail.com>; "r-help@r-project.org"
> <r-help@r-project.org>
> Sent: Tuesday, May 8, 2012 8:23 PM
> Subject: Re: [R] low R square value from ANCOVA model
>
>> Thanks again Peter. What about the argument that because a low R square
>> (e.g. R^2=0.2) indicates the model variance was not sufficiently
>> explained by the factors in the model, there might be additional
>> factors that should be identified and included in the model? And if
>> these additional factors were indeed included, it might change the
>> significance of the factor of interest that previously showed a
>> significant coefficient. In other words, if R square is low, the
>> significant coefficient observed is not trustworthy.
>>
>> What's your opinion on this argument?
>
> I think that argument is silly. I'm sorry if that is too blunt. It's
> just plain superficial. It reflects a poor understanding of what the
> linear model is all about. If you have other variables that might
> "belong" in the model, run them and test. The R-square, either low or
> high, does not have anything direct to say about whether those other
> variables exist.
>
> Here's my authority.
>
> Arthur Goldberger (A Course in Econometrics, 1991, p.177)
> “Nothing in the CR (Classical Regression) model requires that R^2 be high.
> Hence, a high R^2 is not evidence in favor of the model, and a low R^2
> is not evidence against it.”
>
> I found that reference in Anders Skrondal and Sophia Rabe-Hesketh,
> Generalized Latent Variable Modeling: Multilevel, Longitudinal,
> and Structural Equation Models, Boca Raton, FL: Chapman and Hall/CRC, 2004.
>
> From Section 8.5.2:
>
> "Furthermore, how badly the baseline model fits the data depends greatly
> on the magnitude of the parameters of the true model. For instance,
> consider estimating a simple parallel measurement model. If the true
> model is a congeneric measurement model (with considerable variation in
> factor loadings and measurement error variances between items), the fit
> index could be high simply because the null model fits very poorly, i.e.
> because the reliabilities of the items are high. However, if the true
> model is a parallel measurement model with low reliabilities the fit
> index could be low although we are estimating the correct model.
> Similarly, estimating a simple linear regression model can yield a high
> R^2 if the relationship is actually quadratic with a considerable linear
> trend and a low R^2 when the model is true but with a small slope
> (relative to the overall variance)."
>
> For a detailed explanation of the argument that the R-square is not
> a way to decide if a model is "good" or "bad", see
>
> King, Gary. (1986). How Not to Lie with Statistics: Avoiding Common
> Mistakes in Quantitative Political Science. American Journal of
> Political Science, 30(3), 666–687. doi:10.2307/2111095
>
> pj
> --
> Paul E. Johnson
> Professor, Political Science    Assoc. Director
> 1541 Lilac Lane, Room 504       Center for Research Methods
> University of Kansas            University of Kansas
> http://pj.freefaculty.org       http://quant.ku.edu
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
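
The regression point quoted from Section 8.5.2 can be checked with a small simulation. What follows is only a sketch under stated assumptions, not anything posted in the thread: it is written in Python with the standard library rather than R, and the sample size, coefficients, noise level, seed, and the helper name fit_simple_ols are all made up for illustration. Case 1 fits a straight line to data that really are linear with a small slope; case 2 fits a straight line to data that are really quadratic.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b*x.

    Returns (intercept, slope, r_squared, t_slope), where t_slope is the
    usual t statistic for the null hypothesis that the slope is zero.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - rss / tss
    se_slope = (rss / (n - 2) / sxx) ** 0.5
    return intercept, slope, r_squared, slope / se_slope


# All values below are illustrative assumptions.
rng = random.Random(2012)
n = 2000
x = [rng.uniform(0.0, 10.0) for _ in range(n)]

# Case 1: the straight-line model is the true model, but the slope is
# small relative to the noise.
y_linear = [1.0 + 0.2 * xi + rng.gauss(0.0, 2.0) for xi in x]

# Case 2: the true relationship is quadratic with a strong linear trend,
# so a straight-line fit is misspecified.
y_quadratic = [1.0 + 2.0 * xi + 0.5 * xi ** 2 + rng.gauss(0.0, 2.0) for xi in x]

_, slope_linear, r2_linear, t_linear = fit_simple_ols(x, y_linear)
_, slope_quad, r2_quad, t_quad = fit_simple_ols(x, y_quadratic)

print("correct model, small slope: R^2 = %.3f, slope t = %.1f" % (r2_linear, t_linear))
print("wrong model (truth is quadratic): R^2 = %.3f, slope t = %.1f" % (r2_quad, t_quad))
```

With these made-up settings the correctly specified model should show an R^2 well under 0.2 together with a clearly significant slope, while the misspecified straight line should show an R^2 above 0.9. Neither number says whether the model is the right one, which is the point of the Goldberger quote.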