I would like to say 'on the contrary,' but I'm not contradicting Eric's and the others' comments re: "you don't calculate R^2 with insufficient data."

If the number of observations equals the number of terms in the (regression) model, you do have a perfect fit, with no df, etc.  This data is, nonetheless, information.  To use the Bayesian terminology, you must have prior information (i.e., a strong belief) that the statistical error (variance) is smaller than some of the observed effects.  Then you can use a few techniques, such as probability plots or raw rank selection, to pick out the larger effects.  Where possible, one can then go back, run some confirmation trials, and establish whether these effects are 'real.'  Where confirmation trials are not possible, the accepted approach :) is to assert loudly which effects you believe in, and go with them.
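If a concrete picture of the probability-plot idea helps, here is a minimal sketch in Python. The effect names and values are invented for illustration, not from any real study: a half-normal plot of the estimated effects from a saturated 2^3 design, where effects that fall well off the straight line through the small ones are the candidates to believe.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical effect estimates from a saturated 2^3 design (8 runs, 7 effects).
effects = {"A": 11.2, "B": -0.8, "C": 6.9, "AB": 0.5,
           "AC": -1.1, "BC": 0.3, "ABC": -0.6}

names = np.array(list(effects.keys()))
abs_eff = np.abs(np.array(list(effects.values())))
order = np.argsort(abs_eff)            # order |effects| from smallest to largest

# Plotting positions and half-normal quantiles for the ordered |effects|.
n = len(abs_eff)
p = (np.arange(1, n + 1) - 0.5) / n
q = stats.halfnorm.ppf(p)

plt.scatter(q, abs_eff[order])
for qi, ei, label in zip(q, abs_eff[order], names[order]):
    plt.annotate(label, (qi, ei))
plt.xlabel("half-normal quantile")
plt.ylabel("|estimated effect|")
plt.title("Effects well off the line are the ones to pursue")
plt.show()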

You did the study to make a prediction and pick a path to give you an advantage, yes?  Believing the large effects is the way to bet.  What odds you give this path depends in part on how much data you have.

One qualifier to the above:  I do most of my work with designed experiments - near-orthogonal arrays.  If you have significant confounding in your design, you need to be aware of what it is, so that you are not led down the proverbial garden path by your own data.
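A small sketch of what that caveat means in practice, using a hypothetical 2^(3-1) half-fraction (not from any real study):

import numpy as np

# Half-fraction generated with C = AB (defining relation I = ABC).
A = np.array([-1, +1, -1, +1])
B = np.array([-1, -1, +1, +1])
C = A * B                          # the generator: C's column is literally the AB column

print(np.array_equal(C, A * B))    # True -- C is completely confounded with AB
# Whatever you estimate from the C column is really (C + AB); you have to know
# that before deciding which "large effect" to believe.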

Jay

Eric Bohlman wrote:

In sci.stat.consult Graeme Byrne <[EMAIL PROTECTED]> wrote:
> In short, you don't. If the number of terms in the model equals the number
> of observations, you have much bigger problems than not being able to compute
> adjusted R^2. It should always be the case that the number of observations
> exceeds the number of terms in the model; otherwise you cannot calculate any
> of the standard regression diagnostics (F-stats, t-stats, etc.). My advice is
> to get more data or remove terms from the model. If neither of these is an
> option, you are stuck.

It's worse than not being able to calculate regression diagnostics.  You
can't make *any* inferences beyond your observed data.  Consider the
degenerate case of trying to fit a bivariate regression line when you have
only two observations.  You'll *always* get a perfect fit because two
points mathematically define a line.  But that perfect fit will tell you
absolutely nothing about the underlying relationship between the two
variables.  It's consistent with *any* possible relationship, including
complete independence.  You can't tell how far off your model is from the
observations because there simply isn't any room for it to be off ("room
for the model to be off" otherwise being known as "degrees of freedom").

A model with as many parameters as observations is equivalent to the
observations themselves, and therefore testing such a model against the
observations is the same thing as asking if the observations are equal to
themselves, which is circular reasoning.
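To make the two-point case concrete, here is a tiny sketch with arbitrary made-up numbers: whatever y-values you pick, the fitted line passes through both points exactly, so the "perfect fit" tells you nothing about the underlying relationship.

import numpy as np

x = np.array([1.0, 2.0])
for y in (np.array([3.0, 5.0]), np.array([10.0, -4.0])):  # two arbitrary data sets
    slope, intercept = np.polyfit(x, y, 1)                # the line through both points
    residuals = y - (slope * x + intercept)
    print(slope, intercept, residuals)                    # residuals ~ 0 every time: R^2 = 1, 0 df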


--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?