On Sat, 27 Mar 2004 21:54:48 -0500, Rajarshi Guha <[EMAIL PROTECTED]> wrote:
> Hello,
> I'm considering a problem where I would like to classify the
> predictions made by a multiple linear regression model as 'good' or 'bad'.
>
> I have considered a number of ways to go about it - the most obvious being
> the use of outlier diagnostics and classifying outliers as 'bad'
> predictions.

If you take 'the obvious' about outliers one step further, you can
rank the outliers, numbering them from 1 to n. When you look at the
absolute magnitudes, do you see any discontinuities?

Are you concerned solely with classifying predictions from one model,
or is there a comparison, somewhere, with the results of other models?
That raises additional problems.

>
> However I was also wondering whether it would make sense to use the
> standard deviation of the observations (or the predictions) as a criterion.
> That is, if a prediction lies outside, say, 1 standard deviation, it would
> be 'bad'. The problem is '1 standard deviation of *what* ?'
>

There are only two SDs that come immediately to mind.

The SD of the residuals is going to describe them in rather fixed
proportions -- but the residual plot is what is useful for showing
an overall problem.

The SD of the sample is related by way of the R-squared of prediction.
It might be meaningful, for some applications, to talk about how close
the predictions are. Or, it might not gain anything over mentioning
the R-squared and adjusted R-squared. Is the N large enough that you
are safe against overfitting?

> I'm not sure that it seems to make sense to say 1 standard deviation
> beyond the mean of the observation.
>
> Is it possible to calculate confidence intervals for the predictions and
> then say that if an observation lies outside the calculated confidence
> interval it would be 'bad'?
>
> Or should I simply stick to outlier diagnostics?
>

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
- I need a new job, after March 31. Openings? -
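
The suggestion to rank the outliers and look for discontinuities in the absolute magnitudes can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: the function names `rank_residuals` and `largest_gap` are hypothetical, the data are made up, and it works on raw residuals (observed minus predicted) rather than on any particular outlier diagnostic.

```python
def rank_residuals(observed, predicted):
    """Return (rank, index, abs_residual) tuples, largest residual first."""
    residuals = [y - yhat for y, yhat in zip(observed, predicted)]
    # Order the cases by absolute residual, biggest first.
    order = sorted(range(len(residuals)),
                   key=lambda i: abs(residuals[i]), reverse=True)
    return [(rank, i, abs(residuals[i]))
            for rank, i in enumerate(order, start=1)]


def largest_gap(ranked):
    """Find the biggest drop between consecutive ranked magnitudes.

    Returns (rank_above_gap, gap_size): cases ranked at or above
    rank_above_gap sit on the 'large' side of the discontinuity.
    """
    gaps = [(ranked[k][2] - ranked[k + 1][2], ranked[k][0])
            for k in range(len(ranked) - 1)]
    gap, rank = max(gaps)
    return rank, gap


# Made-up illustrative data (an assumption, not from the thread).
observed = [2.1, 3.9, 6.2, 8.0, 9.7, 15.0]
predicted = [2.0, 4.0, 6.0, 8.1, 10.0, 12.0]

ranked = rank_residuals(observed, predicted)
cut_rank, gap = largest_gap(ranked)
for rank, index, magnitude in ranked:
    print(rank, index, round(magnitude, 3))
print("largest gap follows rank", cut_rank, "size", round(gap, 3))
```

A clear gap only hints at where 'bad' might begin; whether the cases above it really are bad still depends on the residual plot and on whether N is large enough to guard against overfitting.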
