On 31-Jul-09 13:38:10, tedzzx wrote:
> Dear R users,
> I have got two samples:
> sample A with 223 observations:
> sample A has five categories: 1,2,3,4,5 (I use the numbers
> 1,2,3,4,5 to define the five different categories)
> there are 5 observations in category 1; 81 observations in
> category 2; 110 observations in category 3; 27 observations
> in category 4; 0 observations in category 5;
> To represent the sample in R: a<-rep(1:5, c(5,81,110,27,0))
>
> sample B with 504 observations:
> sample B also has the same five categories: 1,2,3,4,5
> there are 6 observations in category 1; 127 observations in
> category 2; 297 observations in category 3; 72 observations
> in category 4; 2 observations in category 5;
> To represent the sample in R: b<-rep(1:5, c(6,127,297,72,2))
>
> I want to test whether these two samples differ significantly
> in distribution (i.e. a test for two independent samples).
>
> I found a website:
> http://faculty.chass.ncsu.edu/garson/PA765/mann.htm
>
> This page shows four nonparametric tests, but I can only find the
> Kolmogorov-Smirnov Z test:
> res<-ks.test(a,b)
>
> Can anyone tell me which package has the other 3 tests? Or is there
> any other test for my question?
> Thanks in advance
> Ted

If your "1,2,3,4,5" are simply nominal codes for the categories,
then you may be satisfied with a Fisher test or simply a chi-squared
test (using simulated P-values in view of the low frequencies in
some cells). Taking your data:

  A<-c(5,81,110,27,0)
  B<-c(6,127,297,72,2)
  M<-cbind(A,B)
  D<-colSums(M)
  P<-M%*%(diag(1/D))
  P
  #            [,1]        [,2]
  # [1,] 0.02242152 0.011904762
  # [2,] 0.36322870 0.251984127  ## So the main differences between
  # [3,] 0.49327354 0.589285714  ## A and B are in these two categories
  # [4,] 0.12107623 0.142857143
  # [5,] 0.00000000 0.003968254

  fisher.test(M,simulate.p.value = TRUE,B=100000)
  # Fisher's Exact Test for Count Data with simulated p-value
  # (based on 1e+05 replicates)
  # data:  M
  # p-value = 0.01594

  chisq.test(M,simulate.p.value=TRUE,B=100000)
  # Pearson's Chi-squared test with simulated p-value
  # (based on 1e+05 replicates)
  # data:  M
  # X-squared = 11.7862, df = NA, p-value = 0.01501

So the P-values are similar in both tests.

(Another) Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 31-Jul-09                                       Time: 17:53:58
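
For anyone starting from the raw vectors a and b in the question rather than from typed-in counts, the same table can be built with table(). This is only a sketch under a few assumptions: the names counts, props and res are made up for illustration, the seed is arbitrary, and B=10000 replicates is chosen just to keep the run short. The factor() call with explicit levels matters because category 5 is empty in sample A and would otherwise be dropped from the table.

```r
# Raw samples exactly as coded in the original question
a <- rep(1:5, c(5, 81, 110, 27, 0))
b <- rep(1:5, c(6, 127, 297, 72, 2))

# Fix the levels so the empty category 5 in sample A is kept as a zero count
counts <- cbind(A = table(factor(a, levels = 1:5)),
                B = table(factor(b, levels = 1:5)))

# Column proportions; same values as M %*% diag(1/D) in the reply
props <- prop.table(counts, margin = 2)
print(round(props, 4))

# Pearson chi-squared test with a Monte Carlo p-value.
# The seed is arbitrary and only makes the simulated p-value repeatable.
set.seed(1)
res <- chisq.test(counts, simulate.p.value = TRUE, B = 10000)
print(res)
```

The X-squared statistic does not depend on the simulation and should match the 11.7862 reported above; the simulated p-value will differ slightly from run to run and with the number of replicates.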