[ 
https://issues.apache.org/jira/browse/HBASE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633657#action_12633657
 ] 

Andrzej Bialecki  commented on HBASE-875:
-----------------------------------------

Re: deserialization. Sure, hash values can be anything. But the first parameter 
in the old format is the number of hash functions to use, not the hash value, 
so it can't be negative.
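
To illustrate the point, here is a minimal sketch of how a reader could tell the two formats apart, assuming the new format writes a negative version marker first. The class name FilterHeader, the VERSION value, the hash-type constants and the field layout are hypothetical, chosen for illustration only; they are not taken from the attached patch.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical header of a serialized filter (illustration only).
class FilterHeader {
    // Assumed marker for the new format. It is negative, so it can never be
    // confused with a hash-function count, which is always positive.
    static final int VERSION = -1;
    static final int JENKINS_HASH = 0;
    static final int MURMUR_HASH = 1;

    final int nbHash;   // number of hash functions to use
    final int hashType; // which hash implementation was used

    FilterHeader(int nbHash, int hashType) {
        this.nbHash = nbHash;
        this.hashType = hashType;
    }

    // New format: version marker, then the count, then the hash type.
    void write(DataOutputStream out) throws IOException {
        out.writeInt(VERSION);
        out.writeInt(nbHash);
        out.writeByte(hashType);
    }

    static FilterHeader read(DataInputStream in) throws IOException {
        int first = in.readInt();
        if (first > 0) {
            // Old format: the first int is the number of hash functions,
            // and the only hash available back then was Jenkins.
            return new FilterHeader(first, JENKINS_HASH);
        }
        if (first == VERSION) {
            // New format: the count and the hash type follow the marker.
            int nbHash = in.readInt();
            int hashType = in.readByte();
            return new FilterHeader(nbHash, hashType);
        }
        throw new IOException("Unsupported filter header: " + first);
    }
}
```

Because a valid count is never negative, the sign of the first int is enough to distinguish old data from new data without any extra framing.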

Re: configuration. I was of two minds on this, but if we allowed configuring 
the hash function in these cases, then we would have to persist this information 
somewhere in the data, which sounds kind of messy, so I decided against it. 
Perhaps the name of the property should indicate that it affects only 
BloomFilters ... OTOH some day we may want to use this configuration knob in 
other places too.
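
As a rough sketch of what such a knob could look like, the snippet below maps a property value to a hash type. It uses plain java.util.Properties to stay self-contained; the property name "hbase.hash.type", the accepted values, the default and the class name HashTypeConfig are all assumptions made for illustration, not the names used in the patch.

```java
import java.util.Properties;

// Hypothetical helper that resolves the configured hash type.
class HashTypeConfig {
    // Assumed property name; a BloomFilter-specific name would be the
    // alternative discussed above.
    static final String HASH_TYPE_KEY = "hbase.hash.type";
    static final int JENKINS_HASH = 0;
    static final int MURMUR_HASH = 1;

    // Translate a configured name into a hash type constant.
    static int parseHashType(String name) {
        if ("jenkins".equalsIgnoreCase(name)) {
            return JENKINS_HASH;
        }
        if ("murmur".equalsIgnoreCase(name)) {
            return MURMUR_HASH;
        }
        throw new IllegalArgumentException("Unknown hash type: " + name);
    }

    // Read the knob, falling back to an assumed default of "murmur".
    static int getHashType(Properties conf) {
        return parseHashType(conf.getProperty(HASH_TYPE_KEY, "murmur"));
    }
}
```

A value read this way would only be safe to apply where the chosen hash type is also recorded alongside the data, which is the persistence concern raised above.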

> Use MurmurHash instead of JenkinsHash
> -------------------------------------
>
>                 Key: HBASE-875
>                 URL: https://issues.apache.org/jira/browse/HBASE-875
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.19.0
>            Reporter: Andrzej Bialecki 
>         Attachments: murmur.patch
>
>
> I recently ported the MurmurHash (http://murmurhash.googlepages.com/) to 
> Java, and according to my tests it's roughly 5 times faster than the current 
> version of JenkinsHash in the trunk/. According to the author (and other 
> analysts at comp.sci.crypt) this hash has excellent avalanche behavior 
> and a low collision rate. I propose to either replace the JenkinsHash or add 
> this hash as an option to be used in BloomFilter-s and related classes.
> If your opinion is positive, I'll prepare a patch. The Java implementation of 
> the hash can be found here: http://www.getopt.org/murmur/MurmurHash.java

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
