[jira] [Commented] (LANG-1083) Add (T) casts to get unit tests to pass in old JDK.
[ https://issues.apache.org/jira/browse/LANG-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283783#comment-14283783 ] Jonathan Baker commented on LANG-1083: -- Created PR https://github.com/apache/commons-lang/pull/42 Add (T) casts to get unit tests to pass in old JDK. --- Key: LANG-1083 URL: https://issues.apache.org/jira/browse/LANG-1083 Project: Commons Lang Issue Type: Bug Environment: Maven 3.2.5, Java 1.6.0_18, Fedora 11, AMD 64 (2.6.30.10-105.2.23.fc11.x86_64) Reporter: Jonathan Baker Priority: Trivial This is probably just a quirk of the old JDK that was used. The casts are not necessary on other computers, but they don't seem to hurt either. (Please verify that of course!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (LANG-1083) Add (T) casts to get unit tests to pass in old JDK.
Jonathan Baker created LANG-1083: Summary: Add (T) casts to get unit tests to pass in old JDK. Key: LANG-1083 URL: https://issues.apache.org/jira/browse/LANG-1083 Project: Commons Lang Issue Type: Bug Environment: Maven 3.2.5, Java 1.6.0_18, Fedora 11, AMD 64 (2.6.30.10-105.2.23.fc11.x86_64) Reporter: Jonathan Baker Priority: Trivial This is probably just a quirk of the old JDK that was used. The casts are not necessary on other computers, but they don't seem to hurt either. (Please verify that of course!)
[jira] [Commented] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283756#comment-14283756 ] Thomas Neidhart commented on MATH-1197: --- One observation: the samples contain a lot of equal values. The KS test statistic is implemented using Arrays.binarySearch, but this does not specify which index will be found when looking for a given value in a sorted array that contains duplicates. E.g. if you have samples [0, 0, 0, 0, 0, 1] and you search for 0, you might get any index in the range [0, 4]. As far as I understand the KS statistic, it relies on an empirical distribution function, which computes the cumulative density from how many values are less than or equal to the given observation, and that count is not the result returned by Arrays.binarySearch. Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) 
Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 
10.178538, 10.178538, 10.178538 };
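Editor's illustration of the Arrays.binarySearch observation in the MATH-1197 comment above. This is a minimal sketch, not the Commons Math code: the class EcdfSketch and its methods are hypothetical names. It counts how many values of a sorted sample are less than or equal to an observation with an upper-bound search, which is well defined with ties, whereas the index returned by Arrays.binarySearch is not.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Commons Math.
final class EcdfSketch {

    /** Returns how many elements of the sorted array are <= value (upper-bound search). */
    static int countLessOrEqual(double[] sorted, double value) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= value) {
                lo = mid + 1;   // everything up to and including mid is <= value
            } else {
                hi = mid;       // mid and everything after it is > value
            }
        }
        return lo;
    }

    /** Empirical distribution function of the sorted sample, evaluated at value. */
    static double ecdf(double[] sorted, double value) {
        return (double) countLessOrEqual(sorted, value) / sorted.length;
    }

    public static void main(String[] args) {
        double[] sample = {0, 0, 0, 0, 0, 1};
        // With duplicates, the JDK does not guarantee which matching index is returned.
        System.out.println("binarySearch index: " + Arrays.binarySearch(sample, 0.0));
        System.out.println("count <= 0: " + countLessOrEqual(sample, 0.0));
        System.out.println("ECDF at 0: " + ecdf(sample, 0.0));
    }
}
```

The upper-bound loop keeps the logarithmic cost of a binary search while removing the dependence on which duplicate happens to be hit.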
[jira] [Resolved] (COMPRESS-298) Cleaner way to catch/detect Seven7 files which are password protected
[ https://issues.apache.org/jira/browse/COMPRESS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig resolved COMPRESS-298. - Resolution: Fixed Fix Version/s: 1.10 The way SevenZFile is coded it is difficult to provide a canReadEntryData method like we've got for ZipFile, so right now throwing a special exception seems to be the best solution. svn revision 1653252 Cleaner way to catch/detect Seven7 files which are password protected - Key: COMPRESS-298 URL: https://issues.apache.org/jira/browse/COMPRESS-298 Project: Commons Compress Issue Type: Improvement Components: Archivers Affects Versions: 1.8.1 Reporter: Nick Burch Fix For: 1.10 Currently, if we open a password protected 7z file and call {{getNextEntry()}} on it, it will blow up with an IOException with a specific string: {code} java.io.IOException: Cannot read encrypted files without a password at org.apache.commons.compress.archivers.sevenz.AES256SHA256Decoder$1.init(AES256SHA256Decoder.java:56) at org.apache.commons.compress.archivers.sevenz.AES256SHA256Decoder$1.read(AES256SHA256Decoder.java:112) at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288) at org.tukaani.xz.rangecoder.RangeDecoderFromStream.init(Unknown Source) at org.tukaani.xz.LZMAInputStream.initialize(Unknown Source) at org.tukaani.xz.LZMAInputStream.initialize(Unknown Source) at org.tukaani.xz.LZMAInputStream.init(Unknown Source) at org.apache.commons.compress.archivers.sevenz.Coders$LZMADecoder.decode(Coders.java:113) at org.apache.commons.compress.archivers.sevenz.Coders.addDecoder(Coders.java:77) at org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecoderStack(SevenZFile.java:853) at org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecodingStream(SevenZFile.java:820) at org.apache.commons.compress.archivers.sevenz.SevenZFile.getNextEntry(SevenZFile.java:151) {code} It would be good if either a specific subtype of IOException could be thrown (which could then be 
caught to differentiate this from other kinds of IO problems), or if a method could be added to SevenZFile which could be called to see if a password is needed / given password is correct (If implemented, this would help make the code in Tika dealing with 7z files cleaner)
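Editor's illustration of the "specific subtype of IOException" idea requested in COMPRESS-298. This is a sketch under assumptions: PasswordNeededException and readFirstEntry are hypothetical stand-ins defined here, not the exception or API that Commons Compress ships. It only shows why a dedicated subtype lets callers separate the missing-password case from other I/O failures without matching on the message string.

```java
import java.io.IOException;

// Hypothetical names throughout; this does not use the Commons Compress API.
final class EncryptedArchiveSketch {

    /** Hypothetical dedicated subtype signalling that a password is needed. */
    static class PasswordNeededException extends IOException {
        PasswordNeededException(String archiveName) {
            super("Cannot read encrypted archive without a password: " + archiveName);
        }
    }

    /** Stand-in for an archive reader: fails in a typed way when no password is supplied. */
    static String readFirstEntry(String archiveName, boolean encrypted, byte[] password)
            throws IOException {
        if (encrypted && password == null) {
            throw new PasswordNeededException(archiveName);
        }
        return "entry-of-" + archiveName;
    }

    /** Caller side: the more specific catch clause must come before the general one. */
    static String classify(String archiveName, boolean encrypted) {
        try {
            return "ok: " + readFirstEntry(archiveName, encrypted, null);
        } catch (PasswordNeededException e) {
            return "password required";
        } catch (IOException e) {
            return "other I/O failure";
        }
    }
}
```

Because the subtype still extends IOException, existing callers that catch IOException keep working unchanged.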
[jira] [Updated] (COMPRESS-288) Missing support for 7z ppmd compression format.
[ https://issues.apache.org/jira/browse/COMPRESS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Bodewig updated COMPRESS-288: Fix Version/s: (was: 1.8.1) Missing support for 7z ppmd compression format. --- Key: COMPRESS-288 URL: https://issues.apache.org/jira/browse/COMPRESS-288 Project: Commons Compress Issue Type: New Feature Components: Archivers Affects Versions: 1.8.1 Environment: Tika from trunk build. Reporter: sunxingzhe Labels: 7z When Commons Compress 1.8.1 parses a 7z archive that uses the ppmd compression format, the following error occurs. Caused by: java.io.IOException: Unsupported compression method [3, 4, 1] at org.apache.commons.compress.archivers.sevenz.Coders.addDecoder(Coders.java:74) at org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecoderStack(SevenZFile.java:865) at org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecodingStream(SevenZFile.java:832) at org.apache.commons.compress.archivers.sevenz.SevenZFile.getNextEntry(SevenZFile.java:151) at org.apache.tika.parser.pkg.PackageParser$SevenZWrapper.getNextEntry(PackageParser.java:224) at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:155) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242) ... 5 more Please tell me whether Tika will support the ppmd decoder.
[jira] [Commented] (COMPRESS-294) .Z decompress “Invalid 9 bit code 0x183”
[ https://issues.apache.org/jira/browse/COMPRESS-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283854#comment-14283854 ] Stefan Bodewig commented on COMPRESS-294: - any news? .Z decompress “Invalid 9 bit code 0x183” Key: COMPRESS-294 URL: https://issues.apache.org/jira/browse/COMPRESS-294 Project: Commons Compress Issue Type: Bug Components: Build Affects Versions: 1.9 Reporter: Q Attachments: commons-compress-1.10-SNAPSHOT.jar Trying to decompress a .Z file, I get “Invalid 9 bit code 0x183”. It seems that the .Z file was created under Unix using the default bits value (16 bits). The current implementation seems to support only 9 bits. (I can't provide a sample file since it contains client data, but I will try to get a dummy file from the client.)
[jira] [Commented] (VFS-558) java.lang.UnsupportedOperationException in FtpFileObject
[ https://issues.apache.org/jira/browse/VFS-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283917#comment-14283917 ] L commented on VFS-558: --- Re: Thanks for testing! I really hope to release soon (so you better not find new bugs ) Sorry: VFS-559 java.lang.UnsupportedOperationException in FtpFileObject Key: VFS-558 URL: https://issues.apache.org/jira/browse/VFS-558 Project: Commons VFS Issue Type: Bug Affects Versions: 2.0 Reporter: L Assignee: Bernd Eckenfels Fix For: 2.1 I am getting the following exception in my code: java.lang.UnsupportedOperationException at java.util.Collections$UnmodifiableMap.remove(Collections.java:1345) at org.apache.commons.vfs2.provider.ftp.FtpFileObject.onChildrenChanged(FtpFileObject.java:271) at org.apache.commons.vfs2.provider.AbstractFileObject.childrenChanged(AbstractFileObject.java:240) at org.apache.commons.vfs2.provider.AbstractFileObject.notifyParent(AbstractFileObject.java:1931) at org.apache.commons.vfs2.provider.AbstractFileObject.handleCreate(AbstractFileObject.java:1577) at org.apache.commons.vfs2.provider.AbstractFileObject.moveTo(AbstractFileObject.java:1866) at org.apache.commons.vfs2.impl.DecoratedFileObject.moveTo(DecoratedFileObject.java:241) at org.apache.commons.vfs2.cache.OnCallRefreshFileObject.moveTo(OnCallRefreshFileObject.java:184) ... I guess it is caused by the fact that children field is set to EMPTY_FTP_FILE_MAP at the moment onChildrenChanged() is invoked. I also do not like line 1866 in AbstractFileObject.java. To me it looks like it might be the real cause of the problem: FileObjectUtils.getAbstractFileObject(destFile).handleCreate(getType()); Must it not be destFile.getType()? But even if I am right about AbstractFileObject.java:1866, FtpFileObject.onChildrenChanged() must be corrected as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
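Editor's illustration of the failure mode described in VFS-558. This is a sketch, assuming a children cache that starts out as a shared unmodifiable empty map; ChildrenCacheSketch and its members are hypothetical names, not the FtpFileObject source. It shows that remove() on a map from Collections.unmodifiableMap throws UnsupportedOperationException even when the key is absent, and one possible guard.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an object that caches its children in a map.
final class ChildrenCacheSketch {

    /** Shared read-only placeholder, playing the role of an "empty children" constant. */
    private static final Map<String, String> EMPTY =
            Collections.unmodifiableMap(new HashMap<String, String>());

    private Map<String, String> children = EMPTY;

    /** Mirrors the reported problem: throws while the placeholder is still in use. */
    boolean removeUnguarded(String name) {
        return children.remove(name) != null;
    }

    /** Guarded variant: never mutates the shared placeholder. */
    boolean removeGuarded(String name) {
        if (children == EMPTY) {
            return false; // nothing is cached, so there is nothing to remove
        }
        return children.remove(name) != null;
    }

    /** Swaps in a private mutable map on first write. */
    void put(String name, String value) {
        if (children == EMPTY) {
            children = new HashMap<String, String>();
        }
        children.put(name, value);
    }
}
```

The identity check against the placeholder is cheap and keeps the shared constant safe from accidental mutation attempts.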
[jira] [Updated] (LANG-1083) Add (T) casts to get unit tests to pass in old JDK
[ https://issues.apache.org/jira/browse/LANG-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedikt Ritter updated LANG-1083: -- Summary: Add (T) casts to get unit tests to pass in old JDK (was: Add (T) casts to get unit tests to pass in old JDK.) Add (T) casts to get unit tests to pass in old JDK -- Key: LANG-1083 URL: https://issues.apache.org/jira/browse/LANG-1083 Project: Commons Lang Issue Type: Bug Environment: Maven 3.2.5, Java 1.6.0_18, Fedora 11, AMD 64 (2.6.30.10-105.2.23.fc11.x86_64) Reporter: Jonathan Baker Priority: Trivial This is probably just a quirk of the old JDK that was used. The casts are not necessary on other computers, but they don't seem to hurt either. (Please verify that of course!)
[jira] [Commented] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284003#comment-14284003 ] Phil Steitz commented on MATH-1197: --- Yes, this is a bug. Arrays.binarySearch should not have been used here. Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 
,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 10.178538, 10.178538, 10.178538 }; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (LANG-1083) Add (T) casts to get unit tests to pass in old JDK
[ https://issues.apache.org/jira/browse/LANG-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedikt Ritter resolved LANG-1083. --- Resolution: Fixed
{code}
$ svn ci -m "LANG-1083: Add (T) casts to get unit tests to pass in old JDK. This fixes #42 from github. Thanks to Jonathan Baker."
Sending src/changes/changes.xml
Sending src/main/java/org/apache/commons/lang3/SerializationUtils.java
Sending src/test/java/org/apache/commons/lang3/exception/AbstractExceptionContextTest.java
Transmitting file data ...
Committed revision 1653307.
{code}
Thanks! Add (T) casts to get unit tests to pass in old JDK -- Key: LANG-1083 URL: https://issues.apache.org/jira/browse/LANG-1083 Project: Commons Lang Issue Type: Bug Environment: Maven 3.2.5, Java 1.6.0_18, Fedora 11, AMD 64 (2.6.30.10-105.2.23.fc11.x86_64) Reporter: Jonathan Baker Assignee: Benedikt Ritter Priority: Trivial This is probably just a quirk of the old JDK that was used. The casts are not necessary on other computers, but they don't seem to hurt either. (Please verify that of course!)
[jira] [Created] (VFS-559) FTPClientWrapper is not robust against some failures
L created VFS-559: - Summary: FTPClientWrapper is not robust against some failures Key: VFS-559 URL: https://issues.apache.org/jira/browse/VFS-559 Project: Commons VFS Issue Type: Bug Affects Versions: 2.0 Reporter: L The goal of the class is stated in its javadoc: "A wrapper to the FTPClient to allow automatic reconnect on connection loss." A lot of its methods look like this:
try { do something... }
catch (final IOException e) { disconnect(); try to repeat the operation... }
Unfortunately, disconnect() can fail for the same reason as the original "do something". In my case it was a connection reset. So instead of the original exception I was getting more or less the same exception from getFtpClient().quit(), and the wrapper did not help at all. I guess all the disconnect() invocations must also be inside try/catch, so that even if disconnect() throws, the method goes on to the next step: try to repeat the operation...
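Editor's illustration of the fix suggested in VFS-559. This is a minimal sketch with hypothetical types (ReconnectSketch, Connection, Operation), not the FTPClientWrapper source. It wraps disconnect() in its own try/catch so that a failing disconnect cannot replace the retry of the original operation.

```java
import java.io.IOException;

// Hypothetical types for illustration; they do not mirror the Commons VFS classes.
final class ReconnectSketch {

    /** Minimal view of a connection that can be torn down. */
    interface Connection {
        void disconnect() throws IOException;
    }

    /** An I/O operation that can be attempted more than once. */
    interface Operation<T> {
        T run() throws IOException;
    }

    /**
     * Runs the operation; on an IOException, disconnects on a best-effort basis
     * and retries exactly once. A failure of the retry propagates to the caller.
     */
    static <T> T runWithReconnect(Connection connection, Operation<T> operation)
            throws IOException {
        try {
            return operation.run();
        } catch (IOException first) {
            try {
                connection.disconnect();
            } catch (IOException ignored) {
                // The connection is probably already dead (e.g. reset by peer);
                // swallowing this lets the retry below still happen.
            }
            return operation.run();
        }
    }
}
```

In this sketch the retry is assumed to re-establish the connection itself; a real wrapper might also want to log the swallowed exception rather than drop it silently.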
[jira] [Comment Edited] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284500#comment-14284500 ] Thomas Neidhart edited comment on MATH-1197 at 1/20/15 10:10 PM: - The exactP method also seems to have a problem when comparing it with the results from R. Take this example:
{code}
double[] x = new double[] { 0, 0, 0, 0, 1 };
double[] y = new double[] { 0, 0, 1, 1, 2, 3 };
final KolmogorovSmirnovTest test = new KolmogorovSmirnovTest();
System.out.println("p=" + test.kolmogorovSmirnovTest(x, y, true));
System.out.println("D=" + test.kolmogorovSmirnovStatistic(x, y));
System.out.println("approximateP=" + test.approximateP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length));
System.out.println("exactP=" + test.exactP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length, false));
{code}
returns:
{noformat}
p=0.35714285714285715
D=0.46673
approximateP=0.5925028311389975
exactP=0.4155844155844156
{noformat}
R computes the following:
{noformat}
data: x and y
D = 0.4667, p-value = 0.5925
alternative hypothesis: two-sided
{noformat}
Edit: the reason seems to be that R cannot compute exactP values in case of ties. was (Author: tn): The exactP method also seems to have a problem when comparing it with the results from R. Take this example:
{code}
double[] x = new double[] { 0, 0, 0, 0, 1 };
double[] y = new double[] { 0, 0, 1, 1, 2, 3 };
final KolmogorovSmirnovTest test = new KolmogorovSmirnovTest();
System.out.println("p=" + test.kolmogorovSmirnovTest(x, y, true));
System.out.println("D=" + test.kolmogorovSmirnovStatistic(x, y));
System.out.println("approximateP=" + test.approximateP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length));
System.out.println("exactP=" + test.exactP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length, false));
{code}
returns:
{noformat}
p=0.35714285714285715
D=0.46673
approximateP=0.5925028311389975
exactP=0.4155844155844156
{noformat}
R computes the following:
{noformat}
data: x and y
D = 0.4667, p-value = 0.5925
alternative hypothesis: two-sided
{noformat}
Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya Attachments: MATH-1197.patch kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) 
Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165
[jira] [Commented] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284500#comment-14284500 ] Thomas Neidhart commented on MATH-1197: --- The exactP method also seems to have a problem when comparing it with the results from R. Take this example:
{code}
double[] x = new double[] { 0, 0, 0, 0, 1 };
double[] y = new double[] { 0, 0, 1, 1, 2, 3 };
final KolmogorovSmirnovTest test = new KolmogorovSmirnovTest();
System.out.println("p=" + test.kolmogorovSmirnovTest(x, y, true));
System.out.println("D=" + test.kolmogorovSmirnovStatistic(x, y));
System.out.println("approximateP=" + test.approximateP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length));
System.out.println("exactP=" + test.exactP(test.kolmogorovSmirnovStatistic(x, y), x.length, y.length, false));
{code}
returns:
{noformat}
p=0.35714285714285715
D=0.46673
approximateP=0.5925028311389975
exactP=0.4155844155844156
{noformat}
R computes the following:
{noformat}
data: x and y
D = 0.4667, p-value = 0.5925
alternative hypothesis: two-sided
{noformat}
Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya Attachments: MATH-1197.patch kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) 
Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 
10.178538, 10.178538, 10.178538 };
[jira] [Commented] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284589#comment-14284589 ] Phil Steitz commented on MATH-1197: --- +1 on the patch Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya Attachments: MATH-1197.patch kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 
,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 10.178538, 10.178538, 10.178538 }; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284576#comment-14284576 ] Phil Steitz commented on MATH-1197: --- Assuming whatever bugs in the D computation have been fixed, our exactP should actually be exact. I could not make sense of, or find documentation for, what R does for small samples. Our code computes the exact distribution of the associated D statistic. I suspect that R does some kind of approximation. As you said, R I think also disallows ties. Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya Attachments: MATH-1197.patch kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 
,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 10.178538, 10.178538, 10.178538 }; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MATH-1197) Incorrect Kolmogorov–Smirnov Statistic for two samples
[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Neidhart updated MATH-1197: -- Attachment: MATH-1197.patch The attached patch fixes the two-sample KS test statistic calculation. Actually, I did not fully understand the way it was calculated before, but I followed now the formula from the wiki page and it results in the correct result. After fixing this issue, another problem popped up in test testTwoSampleProductSizeOverflow which relates to a TODO in the ksSum method. Incorrect Kolmogorov–Smirnov Statistic for two samples --- Key: MATH-1197 URL: https://issues.apache.org/jira/browse/MATH-1197 Project: Commons Math Issue Type: Bug Affects Versions: 3.4.1 Environment: Ubuntu 14.04 Reporter: Danaja Thiyunuwan Maldeniya Attachments: MATH-1197.patch kolmogorovSmirnovTest(double[],double[]) against the samples given below gives 5.699107852308316E-12 instead of 0.9793 (approx.) Traced the issue to kolmogorovSmirnovStatistic(double[],double[]) which gives 0.49507389162561577 instead of 0.064 (verified with ks.test in R and JDistlib) double[] x = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 
,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539 ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165 ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004, 10.720619, 17.726077, 17.726077, 17.726077, 17.726077 ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960 ,30.584960, 30.584960, 30.751808}; double[] y = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00 ,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2.202653 ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420 ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538, 10.178538, 10.178538, 10.178538 }; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
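Editor's illustration of a tie-safe two-sample KS statistic, the quantity discussed in the MATH-1197 thread. This is a sketch and not the attached MATH-1197.patch: TwoSampleKsSketch is a hypothetical class that evaluates D as the largest absolute difference between the two empirical distribution functions, walking both sorted samples and consuming all copies of a tied value before comparing. It assumes the inputs contain no NaN values.

```java
import java.util.Arrays;

// Hypothetical helper, not the Commons Math implementation.
final class TwoSampleKsSketch {

    /** Returns D = max over z of |F_x(z) - F_y(z)| for the two empirical distributions. */
    static double statistic(double[] x, double[] y) {
        double[] sx = x.clone();
        double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);

        int i = 0; // number of x values <= current point
        int j = 0; // number of y values <= current point
        double d = 0.0;
        while (i < sx.length && j < sy.length) {
            double z = Math.min(sx[i], sy[j]);
            // Consume every copy of z in both samples before comparing,
            // so ties never produce a spurious intermediate difference.
            while (i < sx.length && sx[i] == z) {
                i++;
            }
            while (j < sy.length && sy[j] == z) {
                j++;
            }
            double diff = Math.abs((double) i / sx.length - (double) j / sy.length);
            d = Math.max(d, diff);
        }
        // Once one sample is exhausted its ECDF is 1 and the other only rises
        // toward 1, so the difference cannot grow any further.
        return d;
    }

    public static void main(String[] args) {
        double[] x = {0, 0, 0, 0, 1};
        double[] y = {0, 0, 1, 1, 2, 3};
        System.out.println("D=" + statistic(x, y));
    }
}
```

For the small example from the thread, the largest gap occurs at 0, where the empirical distributions are 4/5 and 2/6, which agrees with the D = 0.4667 that R reports.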