[ 
https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183552#comment-16183552
 ] 

Michael Kjellman commented on CASSANDRA-13291:
----------------------------------------------

Correct:

{{Hashing.md5()}} internally creates a new {{MessageDigestHashFunction}} (see 
https://github.com/google/guava/blob/master/guava/src/com/google/common/hash/Hashing.java#L174).
 For hash functions that the JDK already implements, Guava delegates to the 
JDK's {{MessageDigest}} classes. If you look at 
https://github.com/google/guava/blob/master/guava/src/com/google/common/hash/MessageDigestHashFunction.java#L77
 you can see that under the hood this ends up making the same call we 
currently make in {{FBUtilities#newMessageDigest}}.
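
For illustration, here is a minimal JDK-only sketch of the call that both paths 
end up making. The class and helper names are hypothetical; the helper only 
mirrors the general shape of {{FBUtilities#newMessageDigest}} and is not copied 
from the Cassandra source.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5DigestSketch {
    // Hypothetical helper: look the algorithm up in the JDK and rethrow the
    // checked exception as an unchecked one.
    static MessageDigest newMessageDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Digest algorithm not available: " + algorithm, e);
        }
    }

    // Hash the input with MD5 and render the 16-byte digest as lowercase hex.
    static String md5Hex(byte[] input) {
        byte[] digest = newMessageDigest("MD5").digest(input);
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```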

The specific goal of this ticket is to keep performance identical to what we 
have today, but pass around the more generic Guava class instead of explicitly 
passing around MD5 digests everywhere. With the plumbing done, we can then make 
a one-line change to switch the hash function implementation to something like 
murmur3_128. My initial profiling shows that could be a pretty big 
low-hanging-fruit win for us overall, since we do this particular operation a 
lot (on every request).
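
A rough sketch of what that plumbing could look like, assuming Guava is on the 
classpath. The class, constant, and method names are hypothetical and are not 
taken from the attached patch; only the Guava calls ({{Hashing.md5()}}, 
{{Hashing.murmur3_128()}}, {{HashFunction#newHasher}}, {{Hasher#putString}}, 
{{Hasher#putBytes}}, {{Hasher#hash}}) are real API.

```java
import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

final class HasherPlumbingSketch {
    // The single line that selects the algorithm. Replacing Hashing.md5() with
    // Hashing.murmur3_128() here changes the hash used by every caller below.
    @SuppressWarnings("deprecation") // Hashing.md5() is marked deprecated in newer Guava releases
    static final HashFunction HASH_FUNCTION = Hashing.md5();

    // Hypothetical call site: it only sees the generic HashFunction and Hasher
    // types, so it does not care which algorithm is behind them.
    static HashCode digestOf(HashFunction function, String key, byte[] value) {
        Hasher hasher = function.newHasher();
        hasher.putString(key, StandardCharsets.UTF_8);
        hasher.putBytes(value);
        return hasher.hash();
    }

    public static void main(String[] args) {
        byte[] value = {1, 2, 3};
        System.out.println(digestOf(HASH_FUNCTION, "some-key", value));
        System.out.println(digestOf(Hashing.murmur3_128(), "some-key", value));
    }
}
```

Both functions produce a 128-bit {{HashCode}}, so callers that consume the 
result through {{HashCode#asBytes}} keep receiving 16 bytes after the switch, 
although the digest values themselves differ.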

> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across 
> threads being spent inside the MD5 MessageDigest implementation. Given that 
> there are tons of modern alternative hashing functions better than MD5 
> available -- both in terms of providing better collision resistance and 
> actual computational speed -- I wanted to switch out our usage of MD5 for 
> alternatives (like adler128 or murmur3_128) and test for performance 
> improvements.
> Unfortunately, because we use MessageDigest everywhere, I found that 
> switching the hashing function to something like adler128 or murmur3_128 
> (neither of which ships with the JDK) wasn't straightforward.
> The goal of this ticket is to propose switching out usages of MessageDigest 
> directly in favor of Hasher from Guava. This means going forward we can 
> change a single line of code to switch the hashing algorithm being used 
> (assuming there is an implementation in Guava).


