[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2020-01-18 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018547#comment-17018547
 ] 

Andy Seaborne commented on CODEC-264:
-

Hi Alex,

I've visually inspected the code in git ([commit 
f5a61f0c|https://github.com/apache/commons-codec/commit/f5a61f0c]) and it looks 
correct.

Thanks / Andy

> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Fix For: 1.14
>
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2020-01-17 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018468#comment-17018468
 ] 

Alex Herbert commented on CODEC-264:


Thanks for the raising this.

The effect is that despite creating a new method for the fixed version to 
maintain behavioural compatibility with the old broken version the code 
actually fixes the old version and breaks behavioural compatibility.

I have added a test to maintain behavioural compatibility and fixed the code as 
suggested. Please review the current master to check that the fix is correct.

> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Fix For: 1.14
>
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2020-01-17 Thread Andy Seaborne (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018374#comment-17018374
 ] 

Andy Seaborne commented on CODEC-264:
-

The v1.14 version of {{hash128(byte[], , int seed)}} does now apply the seed 
mask, contrary to the comments.

Line 805
{noformat}
@Deprecated
public static long[] hash128(final byte[] data, final int offset, final int 
length, final int seed) {
// 
// Note: This fails to apply masking using 0xL to the seed.
// 
return hash128x64(data, offset, length, seed);
}
{noformat}

It calls {{hash128x86(byte[],, int seed)}} (exact signature match), not 
{hash128x86(byte[],, long seed)}} (type conversion).

{{hash128x86(byte[],, int seed)}} applies the mask (checked by debugger walk 
through in EclipseIDE).

{{hash128(byte[],, int seed)}} should be a call of {{hash128x86(byte[],, 
long)}} directly.

I think that casting at the call site will do that:
{noformat}
return hash128x64(data, offset, length, (long)seed);
{noformat}
or for clarity explicitly:
{noformat}
long seedLong = seed; /* unmasked 32->64 bit extension */
return hash128x64(data, offset, length, seedLong);
{noformat}

If the private static work function had a different name, then automatic, 
unmasked conversion would have applied.



> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Fix For: 1.14
>
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2019-11-24 Thread Claude Warren (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981049#comment-16981049
 ] 

Claude Warren commented on CODEC-264:
-

This issues should be closed with the changes that Alex made to master.  I am 
not certain what the codec procedure is.  As the person that opened this issue 
should I mark it as resolved or close it, or just leave it alone for Alex to 
record?

> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2019-11-21 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979257#comment-16979257
 ] 

Alex Herbert commented on CODEC-264:


I am going to clean up MurmurHash3 with full documentation of the current code. 
At present the code is sparsely documented and contains a number of methods 
with no equivalent in the c++ source for MurmurHash3.


> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CODEC-264) murmur3.hash128() does not account for unsigned seed argument

2019-11-21 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/CODEC-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979255#comment-16979255
 ] 

Alex Herbert commented on CODEC-264:


As discussed on the PR raised by Claude the two sign extension bugs in 
MurmurHash3 should be fixed in new methods to avoid behavioural changes to the 
existing released hash32 and hash128 methods.

New methods can take inspiration from the c++ source code method names:
{noformat}
MurmurHash3_x86_32 -> MurmurHash3.hash32x86
MurmurHash3_x64_128 -> MurmurHash3.hash128x64
{noformat}


> murmur3.hash128() does not account for unsigned seed argument
> -
>
> Key: CODEC-264
> URL: https://issues.apache.org/jira/browse/CODEC-264
> Project: Commons Codec
>  Issue Type: Bug
>Affects Versions: 1.13
>Reporter: Claude Warren
>Assignee: Alex Herbert
>Priority: Major
> Attachments: YonikMurmur3Tests.java
>
>
> The original murmur3_x64_128 code used unsigned int for seed arguments.  
> Using the equivalent bit patterns in the commons codec version does not yield 
> the same results.
> I believe this is because the commons version does not account for sign 
> extension etc.
> Yonic Seeley [~yonik] has explains the issue in his implementation 
> https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java
> He provides a test case to show that his code returns the same answers as the 
> original C/C++ code.  I modified that test to call the codec version to show 
> the error.
> I have attached that test case.
> Given that the original code is in the wild I am uncertain how to fix this 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)