[ https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151092#comment-13151092 ]
David Allsopp edited comment on CASSANDRA-2975 at 11/16/11 9:31 AM: -------------------------------------------------------------------- Having slept on it, I realized that the {{switch-case}} is slow because of the {{switch-case}} fallthrough in Java - _i.e. all 15 cases get tested every time_ (http://download.oracle.com/javase/tutorial/java/nutsandbolts/switch.html) Adding a {{break}} to the end of each case fixes this without the verbosity of the binary search approach, i.e: {noformat} switch (length & 15) { case 0: break; case 1: k1 ^= ((long) key.get(offset)); k1 *= c1; k1 = rotl64(k1, 31); k1 *= c2; h1 ^= k1; break; case 2: k1 ^= ((long) key.get(offset + 1)) << 8; break; ...etc... {noformat} Combining this with the inlining almost doubles the speed of hashing for keys with lengths from 1-32 bytes! UPDATE: Argh, this breaks something - the hash output is different. Can't see why yet... was (Author: dallsopp): Having slept on it, I realized that the {{switch-case}} is slow because of the {{switch-case}} fallthrough in Java - _i.e. all 15 cases get tested every time_ (http://download.oracle.com/javase/tutorial/java/nutsandbolts/switch.html) Adding a {{break}} to the end of each case fixes this without the verbosity of the binary search approach, i.e: {noformat} switch (length & 15) { case 0: break; case 1: k1 ^= ((long) key.get(offset)); k1 *= c1; k1 = rotl64(k1, 31); k1 *= c2; h1 ^= k1; break; case 2: k1 ^= ((long) key.get(offset + 1)) << 8; break; ...etc... {noformat} Combining this with the inlining almost doubles the speed of hashing for keys with lengths from 1-32 bytes! > Upgrade MurmurHash to version 3 > ------------------------------- > > Key: CASSANDRA-2975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2975 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Brian Lindauer > Assignee: Brian Lindauer > Priority: Trivial > Labels: lhf > Fix For: 1.1 > > Attachments: > 0001-Convert-BloomFilter-to-use-MurmurHash-v3-instead-of-.patch, > 0002-Backwards-compatibility-with-files-using-Murmur2-blo.patch > > > MurmurHash version 3 was finalized on June 3. It provides an enormous speedup > and increased robustness over version 2, which is implemented in Cassandra. > Information here: > http://code.google.com/p/smhasher/ > The reference implementation is here: > http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136 > I have already done the work to port the (public domain) reference > implementation to Java in the MurmurHash class and updated the BloomFilter > class to use the new implementation: > https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93 > Apart from the faster hash time, the new version only requires one call to > hash() rather than 2, since it returns 128 bits of hash instead of 64. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira