[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018547#comment-17018547 ] Andy Seaborne commented on CODEC-264: - Hi Alex, I've visually inspected the code in git ([commit f5a61f0c|https://github.com/apache/commons-codec/commit/f5a61f0c]) and it looks correct. Thanks / Andy > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Fix For: 1.14 > > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018468#comment-17018468 ] Alex Herbert commented on CODEC-264: Thanks for the raising this. The effect is that despite creating a new method for the fixed version to maintain behavioural compatibility with the old broken version the code actually fixes the old version and breaks behavioural compatibility. I have added a test to maintain behavioural compatibility and fixed the code as suggested. Please review the current master to check that the fix is correct. > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Fix For: 1.14 > > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018374#comment-17018374 ] Andy Seaborne commented on CODEC-264: - The v1.14 version of {{hash128(byte[], , int seed)}} does now apply the seed mask, contrary to the comments. Line 805 {noformat} @Deprecated public static long[] hash128(final byte[] data, final int offset, final int length, final int seed) { // // Note: This fails to apply masking using 0xL to the seed. // return hash128x64(data, offset, length, seed); } {noformat} It calls {{hash128x86(byte[],, int seed)}} (exact signature match), not {hash128x86(byte[],, long seed)}} (type conversion). {{hash128x86(byte[],, int seed)}} applies the mask (checked by debugger walk through in EclipseIDE). {{hash128(byte[],, int seed)}} should be a call of {{hash128x86(byte[],, long)}} directly. I think that casting at the call site will do that: {noformat} return hash128x64(data, offset, length, (long)seed); {noformat} or for clarity explicitly: {noformat} long seedLong = seed; /* unmasked 32->64 bit extension */ return hash128x64(data, offset, length, seedLong); {noformat} If the private static work function had a different name, then automatic, unmasked conversion would have applied. > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Fix For: 1.14 > > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981049#comment-16981049 ] Claude Warren commented on CODEC-264: - This issues should be closed with the changes that Alex made to master. I am not certain what the codec procedure is. As the person that opened this issue should I mark it as resolved or close it, or just leave it alone for Alex to record? > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979257#comment-16979257 ] Alex Herbert commented on CODEC-264: I am going to clean up MurmurHash3 with full documentation of the current code. At present the code is sparsely documented and contains a number of methods with no equivalent in the c++ source for MurmurHash3. > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument
[ https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979255#comment-16979255 ] Alex Herbert commented on CODEC-264: As discussed on the PR raised by Claude the two sign extension bugs in MurmurHash3 should be fixed in new methods to avoid behavioural changes to the existing released hash32 and hash128 methods. New methods can take inspiration from the c++ source code method names: {noformat} MurmurHash3_x86_32 -> MurmurHash3.hash32x86 MurmurHash3_x64_128 -> MurmurHash3.hash128x64 {noformat} > murmur3.hash128() does not account for unsigned seed argument > - > > Key: CODEC-264 > URL: https://issues.apache.org/jira/browse/CODEC-264 > Project: Commons Codec > Issue Type: Bug >Affects Versions: 1.13 >Reporter: Claude Warren >Assignee: Alex Herbert >Priority: Major > Attachments: YonikMurmur3Tests.java > > > The original murmur3_x64_128 code used unsigned int for seed arguments. > Using the equivalent bit patterns in the commons codec version does not yield > the same results. > I believe this is because the commons version does not account for sign > extension etc. > Yonic Seeley [~yonik] has explains the issue in his implementation > https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java > He provides a test case to show that his code returns the same answers as the > original C/C++ code. I modified that test to call the codec version to show > the error. > I have attached that test case. > Given that the original code is in the wild I am uncertain how to fix this > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)