Hi all, I have two very large samples of data (10000+ data points each) and would like to perform normality tests on them. I know that p < .05 means that a data set is considered not normal with either of the two tests (Shapiro-Wilk and Kolmogorov-Smirnov). I am also aware that large samples tend to be more likely to give normal results (Andy Field, 2005).
I have a few questions to make sure that I am using them correctly.

1) The Shapiro-Wilk test requires providing a mean and sd. Is it correct to supply the mean and sd of the data itself (since I am comparing against a normal distribution with the same parameters)?

    mySD <- sd(mydata$myfield)
    myMean <- mean(mydata$myfield)
    shapiro.test(rnorm(100, mean = myMean, sd = mySD))

2) If I just want to test each distribution individually, I assume that I am doing a one-sample Kolmogorov-Smirnov test. Is that correct?

3) If I simply want to know whether the data are normal or not, what should I put for the parameter 'alternative'? Does it actually matter?

    alternative = c("two.sided", "less", "greater")

Thank you,
Ralf

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
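For reference, here is a minimal sketch of how the two tests in the questions above are usually called on the data itself. It is an illustration under assumptions, not a definitive recipe: the vector x is simulated and only stands in for mydata$myfield, and the seed and sample sizes are arbitrary. Note that shapiro.test() takes just the data vector (it has no mean or sd arguments) and accepts between 3 and 5000 values, so a random subsample is used here. For the one-sample ks.test(), "two.sided" is the default alternative and the usual choice when the only question is whether the data depart from normality.

```r
set.seed(42)                           # arbitrary seed, for reproducibility only
x <- rnorm(10000, mean = 50, sd = 10)  # hypothetical stand-in for mydata$myfield

# Shapiro-Wilk: pass the data directly. The function is limited to
# 3..5000 observations, so draw a random subsample of the large data set.
sw <- shapiro.test(sample(x, 5000))

# One-sample Kolmogorov-Smirnov test against a normal distribution.
# Using the sample mean and sd as parameters makes the reported p-value
# only approximate, because they were estimated from the same data.
ks <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x),
              alternative = "two.sided")

sw$p.value
ks$p.value
```

With samples this large, a Q-Q plot via qqnorm(x) and qqline(x) is often a useful complement to either p-value.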