I have summary statistics from many sets (tens of thousands) of near-normal 
continuous data.  From previously generated Q-Q plots of these data I can see 
that most of the sets are normal and a few are not.  I have the raw data for a 
small subset (700) of these sets.  I have applied several tests of normality, 
skew, and kurtosis to these 700 sets to see which test yields a statistic 
that identifies the sets which are visibly non-normal on the Q-Q plot.  My 
conclusion thus far is that the skew is the best indicator of non-normality 
for these particular data.

Given that I do not have ready access to the raw data for the tens of 
thousands of sets, only to summary statistics which have been calculated on 
them, is there a method by which I may estimate the skew from the following 
summary statistics?
quantiles at 0.1% 1% 5% 10% 25% 75% 90% 95% 99% 99.9%, plus mean, median, N, sigma

N is usually about 900, so I would discount the 0.1%, 1%, 99%, and 99.9% 
quantiles as unreliable: with so few points in the tails, those estimates 
are noisy.
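
To put a rough number on that noise, here is a minimal sketch using the 
textbook asymptotic standard error of a sample quantile, 
sqrt(p(1-p)/N) / f(x_p), evaluated for a standard normal.  It assumes the 
data really are normal, and the approximation is itself poor at p = 0.1% with 
N = 900 (fewer than one expected point beyond the quantile), so treat the 
result as an order-of-magnitude guide only.  The function name is made up for 
illustration.

```python
from math import sqrt
from statistics import NormalDist


def quantile_se_in_sigmas(p, n):
    """Asymptotic standard error of the sample p-quantile of normal data.

    The result is expressed in units of sigma.  Assumes an underlying
    normal distribution and a sample size large enough for the
    asymptotic formula to be meaningful.
    """
    std_normal = NormalDist()          # mean 0, sigma 1
    z = std_normal.inv_cdf(p)          # theoretical p-quantile
    density = std_normal.pdf(z)        # normal density at that quantile
    return sqrt(p * (1.0 - p) / n) / density


if __name__ == "__main__":
    n = 900
    for p in (0.001, 0.01, 0.05, 0.10, 0.25):
        se = quantile_se_in_sigmas(p, n)
        print(f"p = {p:5.3f}   approx. SE = {se:.3f} sigma")
```

The error grows quickly toward the tails, which is the reason for leaning on 
the 5% to 95% quantiles rather than the outermost ones.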

I know that, for instance, there are general rules for calculating sigma of a 
normal distribution from its quantiles, and so am wondering if there are any 
general rules for calculating skew given a set of quantiles, mean, and sigma.  
I am currently thinking of trying polynomial fits on the Q-Q plot using the raw 
data I have and then empirically trying to derive a relationship between the 
quantiles and the skew.
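
For illustration, a few standard skewness indicators can be computed from 
exactly these summary statistics: Bowley's quartile skewness 
(Q75 + Q25 - 2*median) / (Q75 - Q25), the same form applied to the 10%/90% 
quantiles (Kelly's measure) or the 5%/95% quantiles, and Pearson's second 
skewness coefficient 3*(mean - median) / sigma.  None of these is on the same 
scale as the moment-based skew, so this is only a sketch of candidate 
indicators; any mapping to the moment skew would still have to be calibrated 
empirically, for example on the 700 sets with raw data.  The dictionary keys 
and the numbers in the example are made up for illustration.

```python
def quantile_skew(q_low, median, q_high):
    """Quantile-based skewness: (q_high + q_low - 2*median) / (q_high - q_low).

    With the 25%/75% quantiles this is Bowley's measure; with 10%/90% it is
    Kelly's.  It is 0 for a symmetric distribution and lies in [-1, 1].
    """
    spread = q_high - q_low
    if spread <= 0:
        raise ValueError("q_high must be greater than q_low")
    return (q_high + q_low - 2.0 * median) / spread


def pearson_median_skew(mean, median, sigma):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 3.0 * (mean - median) / sigma


def skew_indicators(s):
    """Compute all indicators from one set's summary statistics.

    `s` is a dict with hypothetical keys: q05, q10, q25, median, q75,
    q90, q95, mean, sigma.
    """
    return {
        "bowley_25_75": quantile_skew(s["q25"], s["median"], s["q75"]),
        "kelly_10_90": quantile_skew(s["q10"], s["median"], s["q90"]),
        "quantile_05_95": quantile_skew(s["q05"], s["median"], s["q95"]),
        "pearson_median": pearson_median_skew(s["mean"], s["median"], s["sigma"]),
    }


if __name__ == "__main__":
    # Made-up, right-skewed summary statistics for illustration only.
    example = {
        "q05": 0.2, "q10": 0.3, "q25": 0.5, "median": 1.0,
        "q75": 2.0, "q90": 3.5, "q95": 5.0, "mean": 1.5, "sigma": 1.6,
    }
    for name, value in skew_indicators(example).items():
        print(f"{name:>15s}: {value:+.3f}")
```

A positive value from each indicator points to a longer right tail and a 
negative value to a longer left tail.  Comparing each indicator against the 
moment skew on the sets with raw data would show which one tracks it best.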

Thank you for any ideas.

Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
[EMAIL PROTECTED]
