Thanks Ted and Greg. I had actually tried pnorm and, after having problems, thought maybe I was misunderstanding the role of dnorm as an argument to ks.test because I was over- (more likely under-) thinking it. I'm assuming now that ks.test will consider my data in cumulative form (makes sense now that I think about it, but I didn't want to assume any steps that R's version of the K-S test takes). I plan to explore the ideas and run the simulations you sent in full over the weekend.
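A minimal sketch of what "cumulative form" means here: the one-sample K-S statistic is the largest gap between the empirical CDF of the data and the hypothesized CDF, which is why ks.test needs pnorm (a CDF) and not dnorm (a density). The sample is simulated, the sample size of 68 is borrowed from the original question, and the names D_manual and D_kstest are illustrative rather than anything from the thread.

```r
# Sketch: reproduce the one-sample KS statistic by hand (assumed example data).
set.seed(1)                              # arbitrary seed, for reproducibility only
x <- rnorm(68, mean = 100, sd = 3)       # simulated stand-in for one variable

n  <- length(x)
xs <- sort(x)
F0 <- pnorm(xs, mean = 100, sd = 3)      # hypothesized CDF at the sorted data

# The ECDF jumps from (i - 1)/n to i/n at the i-th sorted value, so the
# largest gap has to be checked on both sides of every jump.
i <- seq_len(n)
D_manual <- max(pmax(i / n - F0, F0 - (i - 1) / n))

# The same statistic as reported by ks.test()
D_kstest <- unname(ks.test(x, "pnorm", mean = 100, sd = 3)$statistic)
all.equal(D_manual, D_kstest)

# dnorm returns density heights, not cumulative probabilities; with its
# default mean = 0 and sd = 1 they are essentially zero for data near 100.
max(dnorm(xs))
```

The two statistics should agree, which suggests that ks.test does the "cumulative" step itself and only needs the raw data plus a CDF.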
Thanks again!
Kerry

On Nov 11, 12:05 pm, Greg Snow <greg.s...@imail.org> wrote:
> Consider the following simulations (also fixing the pnorm instead of dnorm
> that Ted pointed out and I missed):
>
> out1 <- replicate(10000, {
>   x <- rnorm(1000, 100, 3);
>   ks.test( x, pnorm, mean=100, sd=3 )$p.value
> } )
>
> out2 <- replicate(10000, {
>   x <- rnorm(1000, 100, 3);
>   ks.test( x, pnorm, mean=mean(x), sd=sd(x) )$p.value
> } )
>
> par(mfrow=c(2,1))
> hist(out1)
> hist(out2)
>
> mean(out1 <= 0.05 )
> mean(out2 <= 0.05 )
>
> In both cases the null hypothesis is true (or at least a meaningful
> approximation to true) so the p-values should follow a uniform distribution.
> In the case of out1 where the mean and sd are specified as part of the null
> the p-values are reasonably uniform and the rejection rate is close to alpha
> (should asymptotically approach alpha as the number of simulations
> increases). However looking at out2, where the parameters are set not by
> outside knowledge or tests, but rather from the observed data, the p-values
> are clearly not uniform and the rejection rate is far from alpha.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kerry
> > Sent: Thursday, November 11, 2010 12:02 AM
> > To: r-h...@r-project.org
> > Subject: Re: [R] Kolmogorov Smirnov Test
> >
> > Thanks for the feedback. My goal is to run a simple test to show that
> > the data cannot be rejected as either normally or uniformly
> > distributed (depending on the variable), which is what a previous K-S
> > test run using SPSS had shown. The actual distribution I compare to my
> > sample only matters in that it would be rejected were my data multi-modal.
> > This way I can suggest the data is from the same population. I
> > later run PCA and cluster analyses to confirm this but I want an easy
> > stat to start with for the individual variables.
> >
> > I didn't think I was comparing my data against itself, but rather
> > against a normal distribution with the same mean and standard deviation.
> > Using the mean seems necessary, so is it incorrect to have the same
> > standard deviation too? I need to go back and read up on the K-S test to
> > see what the appropriate constraints are before bothering anyone for
> > more help. Sorry, I thought I had it.
> >
> > Thanks again,
> > kbrownk
> >
> > On Nov 11, 12:40 am, Greg Snow <greg.s...@imail.org> wrote:
> > > The way you are running the test the null hypothesis is that the data
> > > comes from a normal distribution with mean=0 and standard deviation =
> > > 1. If your minimum data value is 0, then it seems very unlikely that
> > > the mean is 0. So the test is being strongly influenced by the mean
> > > and standard deviation not just the shape of the distribution.
> > >
> > > Note that the KS test was not designed to test against a distribution
> > > with parameters estimated from the same data (you can do the test, but
> > > it makes the p-value inaccurate). You can do a little better by
> > > simulating the process and comparing the KS statistic to the
> > > simulations rather than looking at the computed p-value.
> > >
> > > However you should ask yourself why you are doing the normality tests
> > > in the first place. The common reasons that people do this don't match
> > > with what the tests actually test (see the fortunes on normality).
> > >
> > > --
> > > Gregory (Greg) L. Snow Ph.D.
> > > Statistical Data Center
> > > Intermountain Healthcare
> > > greg.s...@imail.org
> > > 801.408.8111
> > >
> > > > -----Original Message-----
> > > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kerry
> > > > Sent: Wednesday, November 10, 2010 9:23 PM
> > > > To: r-h...@r-project.org
> > > > Subject: [R] Kolmogorov Smirnov Test
> > > >
> > > > I'm using ks.test (mydata, dnorm) on my data. I know some of my
> > > > different variable samples (mydata1, mydata2, etc) must be normally
> > > > distributed but the p value is always < 2.0^-16 (the 2.0 can change
> > > > but not the exponent).
> > > >
> > > > I want to test mydata against a normal distribution. What could I be
> > > > doing wrong?
> > > >
> > > > I tried instead using rnorm to create a normal distribution: y = rnorm
> > > > (68,mean=mydata, sd=mydata), where 68 is the sample size of mydata.
> > > > Then I ran the k-s: ks.test (mydata,y). Should this work?
> > > >
> > > > One issue I had was that some of my data has a minimum value of 0, but
> > > > rnorm run as I have it above will potentially create negative numbers.
> > > >
> > > > Also some of my variables will likely be better tested against
> > > > non-normal distributions (uniform etc.), but I figure I should learn
> > > > how to even use ks.test first.
> > > >
> > > > I used to use SPSS but am really trying to jump into R instead, but I
> > > > find the help assumes too much statistical knowledge.
> > > >
> > > > I'm guessing I have a long road before I get this, so any bits of
> > > > information that may help me get a bit further will be appreciated!
> > > >
> > > > Thanks,
> > > > kbrownk
> > > >
> > > > ______________________________________________
> > > > r-h...@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
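The calls discussed in the original question can be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual analysis: mydata is simulated here, and the parameter values passed to "pnorm" and "punif" are treated as known in advance rather than estimated from the data.

```r
# Sketch with hypothetical data; replace mydata with a real variable.
set.seed(42)                                   # arbitrary seed
mydata <- runif(68, min = 0, max = 10)         # assumed data, bounded below by 0

# One-sample tests: name the CDF and give its parameters explicitly.
# Without parameters, "pnorm" defaults to mean = 0 and sd = 1.
res_norm <- ks.test(mydata, "pnorm", mean = 5, sd = 2)
res_unif <- ks.test(mydata, "punif", min = 0, max = 10)
res_norm$p.value
res_unif$p.value

# Two-sample variant from the question. rnorm() is vectorized over mean and
# sd, so mean = mydata would draw every value from a different normal;
# pass single-number summaries instead.
y <- rnorm(length(mydata), mean = mean(mydata), sd = sd(mydata))
res_two <- ks.test(mydata, y)
res_two$p.value
```

The two-sample version compares the data with one random draw, so its result changes from run to run; the one-sample form against a named CDF avoids that extra noise. Plugging in mean(mydata) and sd(mydata) still runs into the estimated-parameter problem raised in the replies.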
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
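The suggestion in the thread to simulate the process and compare the K-S statistic to the simulations, rather than trusting the computed p-value, might look something like the sketch below. The function names ks_stat_fitted and sim_ks_pvalue, the default of 2000 simulations, and the example data are all assumptions made for illustration; the idea is similar in spirit to a Lilliefors-type test, not a reproduction of any particular package.

```r
# Sketch: Monte Carlo p-value for the KS statistic when the mean and sd
# are estimated from the same data (hypothetical helper functions).
ks_stat_fitted <- function(x) {
  # KS distance between the data and a normal fitted to that same data
  unname(ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$statistic)
}

sim_ks_pvalue <- function(x, n_sim = 2000) {
  d_obs <- ks_stat_fitted(x)
  # Null distribution: simulate normal samples of the same size and
  # re-estimate the parameters on every one of them, as was done for x.
  d_sim <- replicate(n_sim,
                     ks_stat_fitted(rnorm(length(x), mean(x), sd(x))))
  # Share of simulated statistics at least as large as the observed one
  mean(d_sim >= d_obs)
}

set.seed(2010)                        # arbitrary seed
x <- rnorm(68, mean = 100, sd = 3)    # assumed example data
sim_ks_pvalue(x)
```

Because the parameters are re-estimated inside every simulated sample, the reference distribution reflects the same fitting step applied to the real data, which is what the ordinary ks.test p-value ignores.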