Read Mosteller and Tukey's "Data Analysis and Regression."
SChiu <[EMAIL PROTECTED]> wrote:
Dear all,
I have a question regarding outliers and the number of data points.
For instance, I want to use regression to calculate the slope over 3
years, i.e. 36 data points, one point for each month. So I use the
following method:
1. calculate the median value
2. find the standard deviation
3. set the threshold = median value + std dev * constant (e.g.
constant = 10)
4. outliers are the data points which are greater than the threshold.
5. replace an outlier with the mean of its neighbor data points.
6. regression
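
The six steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: the function names replace_outliers and slope are hypothetical, the sample standard deviation is assumed, "neighbor data points" is read as the nearest unflagged point on each side, and the regression is taken to be ordinary least squares against the month index.

```python
from statistics import median, stdev


def replace_outliers(values, constant=10.0):
    """Steps 1-5: flag points above median + constant * std dev and
    replace each with the mean of its nearest unflagged neighbours.
    (Hypothetical helper; needs at least two data points.)"""
    threshold = median(values) + constant * stdev(values)
    flagged = [v > threshold for v in values]
    cleaned = list(values)
    for i, is_outlier in enumerate(flagged):
        if not is_outlier:
            continue
        # Nearest unflagged neighbour on each side, if there is one.
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if not flagged[j]), None)
        right = next((values[j] for j in range(i + 1, len(values))
                      if not flagged[j]), None)
        neighbours = [v for v in (left, right) if v is not None]
        if neighbours:
            cleaned[i] = sum(neighbours) / len(neighbours)
    return cleaned, flagged


def slope(values):
    """Step 6: least-squares slope against the index 0, 1, 2, ...
    (Hypothetical helper; needs at least two data points.)"""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Made-up data: 36 monthly points on a straight line, one spike.
data = [float(i) for i in range(36)]
data[10] = 1000.0
cleaned, flagged = replace_outliers(data, constant=2.0)
print(sum(flagged), slope(cleaned))
```

Both helpers need at least two data points, so a year with a single observation cannot be handled this way. Note also that an outlier inflates the standard deviation it is compared against: in the made-up data above, the spike is flagged with constant = 2 but not with constant = 10.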
However, I also want to find the slope for each year using the same
method. As I may not have all the 12 data points for each calendar
year (e.g. Feb 01 - Jan 04, 36 data points in total, 11 data points
for the 1st year and 1 data point for the last year), I found that the
above-mentioned method didn't work very well for detecting the outliers.
I'm thinking about making the constant smaller for fewer data points.
Any ideas?
Thanks,
SChiu
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
. http://jse.stat.ncsu.edu/ .
=================================================================
Phillip Good
http.ms//www.statistician.usa
"Never trust anything that can think for itself if you can't see where it keeps its brain." JKR
