[ https://issues.apache.org/jira/browse/HBASE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633657#action_12633657 ]
Andrzej Bialecki commented on HBASE-875:
-----------------------------------------
Re: deserialization. Sure, hash values can be anything. But the first parameter
in the old format is the number of hash functions to use, not the hash value,
so it can't be negative.
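
To illustrate the point above, here is a minimal sketch of how a reader could
tell the two formats apart, assuming the new format starts with a negative
version marker followed by the number of hash functions and a hash type code.
The class name FilterHeader, the marker value -1, the type codes and the
helper methods are all hypothetical, and the actual patch may do this
differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not the classes touched by murmur.patch.
class FilterHeader {
    static final int VERSION = -1;      // assumed marker; must be negative
    static final int JENKINS_HASH = 0;  // assumed type codes
    static final int MURMUR_HASH = 1;

    final int nbHash;    // number of hash functions
    final int hashType;  // which hash implementation the filter was built with

    FilterHeader(int nbHash, int hashType) {
        this.nbHash = nbHash;
        this.hashType = hashType;
    }

    // New format: version marker, then nbHash, then the hash type.
    void write(DataOutput out) throws IOException {
        out.writeInt(VERSION);
        out.writeInt(nbHash);
        out.writeByte(hashType);
    }

    static FilterHeader read(DataInput in) throws IOException {
        int first = in.readInt();
        if (first > 0) {
            // Old format: the first int is nbHash, which is never negative,
            // and the only hash available back then was JenkinsHash.
            return new FilterHeader(first, JENKINS_HASH);
        }
        if (first == VERSION) {
            int nbHash = in.readInt();
            int hashType = in.readByte();
            return new FilterHeader(nbHash, hashType);
        }
        throw new IOException("Unsupported filter version: " + first);
    }

    // Convenience wrappers so callers need not handle checked exceptions.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static FilterHeader fromBytes(byte[] data) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout, data written in the old format (a positive count first)
still reads back as a Jenkins-hashed filter, while newly written data carries
its own hash type.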
Re: configuration. I was of two minds on this, but if we allowed configuring
the hash function in these cases, then we would have to persist this
information somewhere in the data, which sounds kind of messy, so I decided
against it. Perhaps the name of the property should indicate that it affects
only BloomFilters ... OTOH some day we may want to use this configuration knob
in other places too.
> Use MurmurHash instead of JenkinsHash
> -------------------------------------
>
> Key: HBASE-875
> URL: https://issues.apache.org/jira/browse/HBASE-875
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 0.19.0
> Reporter: Andrzej Bialecki
> Attachments: murmur.patch
>
>
> I recently ported the MurmurHash (http://murmurhash.googlepages.com/) to
> Java, and according to my tests it's roughly 5 times faster than the current
> version of JenkinsHash in the trunk/. According to the author (and other
> analysts at comp.sci.crypt) this hash has excellent avalanche behavior
> and a low collision rate. I propose to either replace the JenkinsHash or add
> this hash as an option to be used in BloomFilter-s and related classes.
> If your opinion is positive, I'll prepare a patch. The Java implementation of
> the hash can be found here: http://www.getopt.org/murmur/MurmurHash.java