[ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208846#comment-16208846 ]
Jason Brown commented on CASSANDRA-13291: ----------------------------------------- [~beobal] Thanks for doing the hard part of this review! bq. if we want to go back to just using the old thread local here or not I think the {{Hasher}} instance will live so briefly (just the scope of the {{#hashToBigInteger()}} function) it's probably not worth mucking with the {{ThreadLocal}} anymore. trivial/petty nits: - In about 2-3 places in the non-test code, you call {{HashingUtils.CURRENT_HASH_FUNCTION#newHasher()}}. Part of me wants to hide away the {{CURRENT_HASH_FUNCTION}} and just have a function {{HashingUtils#newHasher()}}, which knows to use the {{CURRENT_HASH_FUNCTION}}. wdyt? (not a blocker for +1'ing, btw) - in {{RandomPartitioner}}, {{#hashToBigInteger()}} is called four times from within the same class, and only from this class. Can we a) make the function private, and b) instead of {{RandomPartitioner#hashToBigInteger()}} can we just call {{#hashToBigInteger()}}? - I'm 99% sure {{Validator.CountingHasher#putObject()}} is not being called, and it's just there to fill out the interface. Can you add a quick note that we're not expecting it to be be used, and thus not counting the bytes? - Once again on the "petty, but let's make it correct" front, {{Validator.CountingHasher#putUnencodedChars()}} and {{Validator.CountingHasher#putString()}} both call {{CharSequence#length()}} to get the value to add to the {{count}}. According to the javadoc for {{CharSequence#length()}} reads: {code} Returns the length of this character sequence. The length is the number of 16-bit chars in the sequence. {code} At a minimum we should multiply the output of {{CharSequence#length()}} by 2 to get the number of 8-bit bytes to correctly increment {{count}}; we should also add a small comment, as well. - {{GuidGenerator}} - remove import of {{RandomParitioner}} I think we're almost there! > Replace usages of MessageDigest with Guava's Hasher > --------------------------------------------------- > > Key: CASSANDRA-13291 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13291 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Michael Kjellman > Assignee: Michael Kjellman > Attachments: CASSANDRA-13291-trunk.diff > > > During my profiling of C* I frequently see lots of aggregate time across > threads being spent inside the MD5 MessageDigest implementation. Given that > there are tons of modern alternative hashing functions better than MD5 > available -- both in terms of providing better collision resistance and > actual computational speed -- I wanted to switch out our usage of MD5 for > alternatives (like adler128 or murmur3_128) and test for performance > improvements. > Unfortunately, I found given the fact we use MessageDigest everywhere -- > switching out the hashing function to something like adler128 or murmur3_128 > (for example) -- which don't ship with the JDK -- wasn't straight forward. > The goal of this ticket is to propose switching out usages of MessageDigest > directly in favor of Hasher from Guava. This means going forward we can > change a single line of code to switch the hashing algorithm being used > (assuming there is an implementation in Guava). -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org