[ 
https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208846#comment-16208846
 ] 

Jason Brown commented on CASSANDRA-13291:
-----------------------------------------

[~beobal] Thanks for doing the hard part of this review!

bq.  if we want to go back to just using the old thread local here or not

I think the {{Hasher}} instance will live so briefly (just the scope of the 
{{#hashToBigInteger()}} function) it's probably not worth mucking with the 
{{ThreadLocal}} anymore.

trivial/petty nits:
- In about 2-3 places in the non-test code, you call 
{{HashingUtils.CURRENT_HASH_FUNCTION#newHasher()}}. Part of me wants to hide 
away the {{CURRENT_HASH_FUNCTION}} and just have a function 
{{HashingUtils#newHasher()}}, which knows to use the {{CURRENT_HASH_FUNCTION}}. 
wdyt? (not a blocker for +1'ing, btw)
- in {{RandomPartitioner}}, {{#hashToBigInteger()}} is called four times from 
within the same class, and only from this class. Can we a) make the function 
private, and b) instead of {{RandomPartitioner#hashToBigInteger()}} can we just 
call {{#hashToBigInteger()}}?
- I'm 99% sure {{Validator.CountingHasher#putObject()}} is not being called, 
and it's just there to fill out the interface. Can you add a quick note that 
we're not expecting it to be be used, and thus not counting the bytes?
- Once again on the "petty, but let's make it correct" front, 
{{Validator.CountingHasher#putUnencodedChars()}} and 
{{Validator.CountingHasher#putString()}} both call {{CharSequence#length()}} to 
get the value to add to the {{count}}. According to the javadoc for 
{{CharSequence#length()}} reads:

{code}
Returns the length of this character sequence. The length is the number of 
16-bit chars in the sequence.
{code}

At a minimum we should multiply the output of {{CharSequence#length()}} by 2 to 
get the number of 8-bit bytes to correctly increment {{count}}; we should also 
add a small comment, as well.
- {{GuidGenerator}} - remove import of {{RandomParitioner}}

I think we're almost there!

> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across 
> threads being spent inside the MD5 MessageDigest implementation. Given that 
> there are tons of modern alternative hashing functions better than MD5 
> available -- both in terms of providing better collision resistance and 
> actual computational speed -- I wanted to switch out our usage of MD5 for 
> alternatives (like adler128 or murmur3_128) and test for performance 
> improvements.
> Unfortunately, I found given the fact we use MessageDigest everywhere --  
> switching out the hashing function to something like adler128 or murmur3_128 
> (for example) -- which don't ship with the JDK --  wasn't straight forward.
> The goal of this ticket is to propose switching out usages of MessageDigest 
> directly in favor of Hasher from Guava. This means going forward we can 
> change a single line of code to switch the hashing algorithm being used 
> (assuming there is an implementation in Guava).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to