Read Mosteller and Tukey's "Data Analysis and Regression."

SChiu <[EMAIL PROTECTED]> wrote:
Dear all,

I have a question regarding outliers and the number of data points.

For instance, I want to use regression to calculate the slope over 3
years, i.e. 36 data points, one point for each month. So I use the
following method:

1. calculate the median value
2. find the standard deviation
3. set the threshold = median value + std dev * constant (e.g.
constant = 10)
4. outliers are the data points which are greater than the threshold.
5. replace an outlier with the mean of its neighbor data points.
6. regression
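
The six steps above can be sketched in Python roughly as follows. This is
only an illustration under stated assumptions, not the poster's actual code:
the function names replace_outliers and slope are hypothetical, "neighbor
data points" is taken to mean the adjacent original values (an endpoint has
only one), the regression is ordinary least squares against the month index,
and the example uses constant = 3 rather than 10. At least two data points
are required.

```python
from statistics import median, stdev


def replace_outliers(values, constant=10.0):
    """Steps 1-5: flag points above median + constant * std dev and
    replace each flagged point with the mean of its neighbours."""
    threshold = median(values) + constant * stdev(values)
    cleaned = list(values)
    for i, v in enumerate(values):
        if v > threshold:
            # Assumption: neighbours are the adjacent original points;
            # a point at either end of the series has only one neighbour.
            neighbours = [values[j] for j in (i - 1, i + 1)
                          if 0 <= j < len(values)]
            cleaned[i] = sum(neighbours) / len(neighbours)
    return cleaned


def slope(values):
    """Step 6: least-squares slope of the values against index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Made-up data: 36 monthly values on a straight line with one spike.
monthly = [float(i) for i in range(36)]
monthly[20] = 500.0
cleaned = replace_outliers(monthly, constant=3.0)
print(cleaned[20], slope(cleaned))
```

Only values above the threshold are flagged, as in step 4, so unusually low
points pass through unchanged.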

However, I also want to find the slope for each year using the same
method. As I may not have all 12 data points for each calendar
year (e.g. Feb 01 - Jan 04, 36 data points in total, 11 data points
for the 1st year and 1 data point for the last year), I found the
above-mentioned method didn't work very well to detect the outliers.
I'm thinking about making the constant smaller for fewer data points.
Any ideas?
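
One way to see how the number of points interacts with the constant is the
small sketch below. It assumes an idealized series (n - 1 zeros and a single
spike of 1, with n >= 3) and a hypothetical helper named spike_score; real
data will behave differently. For this series the spike sits exactly sqrt(n)
sample standard deviations above the median, so it can only exceed the
threshold if the constant is below sqrt(n).

```python
from math import sqrt
from statistics import median, stdev


def spike_score(n):
    """Number of sample standard deviations by which a lone spike exceeds
    the median, for an idealized series of n - 1 zeros and a single 1."""
    series = [0.0] * (n - 1) + [1.0]
    return (1.0 - median(series)) / stdev(series)


# Compare the score with sqrt(n) for a full 3-year window and shorter years.
for n in (36, 12, 11):
    print(n, round(spike_score(n), 3), round(sqrt(n), 3))
```

With a single data point the standard deviation is undefined, so the last
"year" in the example above cannot be screened or fitted this way at all.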

Thanks,
SChiu
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
. http://jse.stat.ncsu.edu/ .
=================================================================


Phillip Good
http.ms//www.statistician.usa
"Never trust anything that can think for itself if you can't see where it keeps its brain."  JKR

