On Sat, 27 Mar 2004 21:54:48 -0500, Rajarshi Guha
<[EMAIL PROTECTED]> wrote:

> Hello,
>   I'm considering  a problem where I would like to classify the
> predictions made by a multiple linear regression model as 'good' or 'bad'.
> 
> I have considered a number of ways to go about it - the most obvious being
> the use of outlier diagnostics and classifying outliers as 'bad'
> predictions.

If you take 'the obvious' about outliers one step further, you can
rank the outliers, numbering them from 1 to n.  When you look at
the absolute magnitudes, do you see any discontinuities?
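
A minimal sketch of that ranking, in Python.  The residual values are
made up for illustration, and the function name is hypothetical; the
idea is only to sort by absolute size and look at the drop from each
rank to the next.

```python
def rank_abs_residuals(residuals):
    """Rank cases by |residual| (rank 1 = largest) and report the gaps."""
    # Pair each residual with its original index, then sort by magnitude.
    ranked = sorted(enumerate(residuals), key=lambda p: abs(p[1]), reverse=True)
    order = [i for i, _ in ranked]           # original indices, in rank order
    mags = [abs(r) for _, r in ranked]       # sorted absolute magnitudes
    # Drop in magnitude from rank k to rank k+1.
    gaps = [a - b for a, b in zip(mags, mags[1:])]
    return order, mags, gaps


# Illustrative residuals only (not data from this thread).
resid = [0.2, -0.1, 3.5, 0.4, -0.3, 0.1, -2.9, 0.25]
order, mags, gaps = rank_abs_residuals(resid)

# 0-based position of the largest drop between neighbouring ranks.
biggest_break = max(range(len(gaps)), key=gaps.__getitem__)
print("largest break after rank", biggest_break + 1)
print("cases above the break:", order[: biggest_break + 1])
```

With these numbers the largest break falls after rank 2, so two cases
stand apart from the rest.  Whether a break that size means anything
is still a judgment call.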

Are you concerned solely with classifying predictions from
one model, or is there a comparison, somewhere, with the
result of other models? - that raises additional problems.

> 
> However I was also wondering whether it would make sense to use the
> standard deviation of the observations (or the predictions) as a criterion.
> That is, if a prediction lies outside, say, 1 standard deviation, it would
> be 'bad'. The problem is '1 standard deviation of *what* ?'
> 
There are only two SDs that come immediately to mind.
The SD of the residuals is going to describe them in
rather fixed proportions, but the residual plot is what
is useful for showing an overall problem.
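
One way to read 'fixed proportions': if the residuals are roughly
normal (an assumption, not something established here), a cutoff at
k residual SDs flags about the same fraction of cases for any model,
good or poor.  A small Python sketch of those fractions:

```python
from statistics import NormalDist


def fraction_outside(k):
    """Two-sided fraction of a normal distribution beyond k SDs."""
    return 2 * (1 - NormalDist().cdf(k))


for k in (1, 2, 3):
    print(f"beyond {k} SD: {fraction_outside(k):.4f}")
```

So roughly a third of the predictions would be called 'bad' at 1 SD,
however well the model fits.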

The SD of the sample is related to it by way of the R-squared
of prediction.  It might be meaningful for some applications
to talk about how close the predictions are.  Or, it might not
gain anything over mentioning the R-squared and adjusted
R-squared.  Is the N large enough that you are safe
against overfitting?
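
To make the link between the two SDs concrete, here is a small Python
sketch.  It assumes a single predictor (p = 1) to keep the arithmetic
short, and the data are invented; with degrees-of-freedom corrections
the residual SD equals the sample SD times sqrt(1 - adjusted R-squared).

```python
import math
import statistics


def fit_summary(x, y):
    """Least-squares fit of y on one predictor; returns R2, adjusted R2, SDs."""
    n, p = len(y), 1
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and total sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / (n - p - 1)) / (sst / (n - 1))
    sd_y = math.sqrt(sst / (n - 1))            # SD of the sample
    sd_resid = math.sqrt(sse / (n - p - 1))    # SD of the residuals
    return r2, adj_r2, sd_y, sd_resid


# Invented data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1]
r2, adj_r2, sd_y, sd_resid = fit_summary(x, y)
print(f"R2={r2:.4f}  adjusted R2={adj_r2:.4f}")
print(f"SD(y)={sd_y:.4f}  SD(resid)={sd_resid:.4f}")
print(f"SD(y)*sqrt(1-adjR2)={sd_y * math.sqrt(1 - adj_r2):.4f}")
```

The gap between R-squared and adjusted R-squared widens as the number
of predictors grows relative to N, which is one rough signal for the
overfitting question.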



> I'm not sure that it seems to make sense to say 1 standard deviation
> beyond the mean of the observation. 
> 
> Is it possible to calculate confidence intervals for the predictions and
> then say that if an observation lies outside the calculated confidence
> interval it would be 'bad'?
> 
> Or should I simply stick to outlier diagnostics?
> 

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
 - I need a new job, after March 31.  Openings? -