[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284500#comment-14284500 ]
Thomas Neidhart edited comment on MATH-1197 at 1/20/15 10:10 PM:
-----------------------------------------------------------------

The exactP method also seems to disagree with the results from R.

Take this example:

{code}
double[] x = new double[] { 0, 0, 0, 0, 1 };
double[] y = new double[] { 0, 0, 1, 1, 2, 3 };
final KolmogorovSmirnovTest test = new KolmogorovSmirnovTest();
System.out.println("p=" + test.kolmogorovSmirnovTest(x, y, true));
System.out.println("D=" + test.kolmogorovSmirnovStatistic(x, y));
System.out.println("approximateP=" + test.approximateP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length));
System.out.println("exactP=" + test.exactP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length, false));
{code}

returns:

{noformat}
p=0.35714285714285715
D=0.46666666666666673
approximateP=0.5925028311389975
exactP=0.4155844155844156
{noformat}

R computes the following:

{noformat}
data: x and y
D = 0.4667, p-value = 0.5925
alternative hypothesis: two-sided
{noformat}

Edit: the reason seems to be that R cannot compute exact p-values when the samples contain ties, which would explain why its p-value matches approximateP rather than exactP.

was (Author: tn):
The exactP method also seems to have a problem when comparing it with the results from R.
Take this example:

{code}
double[] x = new double[] { 0, 0, 0, 0, 1 };
double[] y = new double[] { 0, 0, 1, 1, 2, 3 };
final KolmogorovSmirnovTest test = new KolmogorovSmirnovTest();
System.out.println("p=" + test.kolmogorovSmirnovTest(x, y, true));
System.out.println("D=" + test.kolmogorovSmirnovStatistic(x, y));
System.out.println("approximateP=" + test.approximateP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length));
System.out.println("exactP=" + test.exactP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length, false));
{code}

returns:

{noformat}
p=0.35714285714285715
D=0.46666666666666673
approximateP=0.5925028311389975
exactP=0.4155844155844156
{noformat}

R computes the following:

{noformat}
data: x and y
D = 0.4667, p-value = 0.5925
alternative hypothesis: two-sided
{noformat}

> Incorrect Kolmogorov–Smirnov Statistic for two samples
> -------------------------------------------------------
>
> Key: MATH-1197
> URL: https://issues.apache.org/jira/browse/MATH-1197
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.4.1
> Environment: Ubuntu 14.04
> Reporter: Danaja Thiyunuwan Maldeniya
> Attachments: MATH-1197.patch
>
>
> kolmogorovSmirnovTest(double[],double[]) against the samples given below
> gives 5.699107852308316E-12 instead of 0.9793 (approx.)
Traced the issue to > kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 > instead of 0.064 (verified with ks.test in R and JDistlib) > double[] x = > {0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 > > ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.202653,2.202653,2.202653 > > ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 > > ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 > > ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 > > ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 > > ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, > 10.720619, 17.726077, 17.726077, 
17.726077, 17.726077
> ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960
> ,30.584960, 30.584960, 30.751808};
> double[] y =
> {0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.202653
> ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420
> ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538,
> 10.178538, 10.178538, 10.178538 };

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
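
As a cross-check on the D value quoted in the comment above, the two-sample statistic can be recomputed directly from the empirical CDFs. The following is only a minimal sketch, not the Commons Math implementation: the class name KsStatisticSketch and the method ksStatistic are hypothetical, and the tie handling (comparing the two CDFs only after every copy of a tied value has been consumed from both samples) is an assumption about how ties should be treated.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Commons Math.
public class KsStatisticSketch {

    /**
     * Two-sample KS statistic D = max over t of |F_x(t) - F_y(t)|.
     * The empirical CDFs are compared only at distinct pooled values,
     * after all copies of that value have been counted in both samples.
     */
    public static double ksStatistic(double[] x, double[] y) {
        double[] sx = x.clone();
        double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);

        int i = 0;
        int j = 0;
        double d = 0.0;
        // Once one sample is exhausted its CDF is 1 and the remaining
        // differences can only shrink, so the loop may stop there.
        while (i < sx.length && j < sy.length) {
            double t = Math.min(sx[i], sy[j]);
            while (i < sx.length && sx[i] == t) {
                i++; // consume every copy of t in x
            }
            while (j < sy.length && sy[j] == t) {
                j++; // consume every copy of t in y
            }
            double fx = (double) i / sx.length;
            double fy = (double) j / sy.length;
            d = Math.max(d, Math.abs(fx - fy));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] x = { 0, 0, 0, 0, 1 };
        double[] y = { 0, 0, 1, 1, 2, 3 };
        System.out.println("D=" + ksStatistic(x, y));
    }
}
```

For the small example from the comment, the largest gap occurs at 0, where the CDFs are 4/5 and 2/6, giving D = 7/15, about 0.4667. That agrees with both kolmogorovSmirnovStatistic and R for that example.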