[ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183505#comment-16183505 ]

Jason Brown commented on CASSANDRA-13291:
-----------------------------------------

A [slightly fairer|https://github.com/jasobrown/cassandra/commit/e57bc21903687dfed573ff427ad4eeededac41a9] 
comparison, wherein I call {{MessageDigest#clone()}} on a prototype instance 
each time, instead of creating a fresh instance via 
{{MessageDigest.getInstance("MD5")}}.
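As a rough illustration of the two approaches being compared, here is a minimal sketch using only the JDK. It assumes "prototype instance" means a shared {{MessageDigest}} that is looked up once and then cloned per hash. The class and method names ({{PrototypeDigest}}, {{hashWithClone}}, {{hashWithFreshInstance}}) are hypothetical and are not taken from the linked commit.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not the code from the linked commit.
final class PrototypeDigest {
    // Prototype instance: looked up once and never updated directly.
    private static final MessageDigest MD5_PROTOTYPE = newMd5();

    private static MessageDigest newMd5() {
        try {
            // Provider lookup happens on every call to getInstance.
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Approach 1: clone the prototype, then digest with the private copy.
    static byte[] hashWithClone(byte[] input) {
        try {
            MessageDigest md = (MessageDigest) MD5_PROTOTYPE.clone();
            return md.digest(input);
        } catch (CloneNotSupportedException e) {
            // Not every provider implementation is cloneable; fall back.
            return hashWithFreshInstance(input);
        }
    }

    // Approach 2: obtain a fresh instance for every hash.
    static byte[] hashWithFreshInstance(byte[] input) {
        return newMd5().digest(input);
    }

    public static void main(String[] args) {
        byte[] data = "example input".getBytes(StandardCharsets.UTF_8);
        boolean same = Arrays.equals(hashWithClone(data), hashWithFreshInstance(data));
        System.out.println("digests match: " + same);
    }
}
```

Both methods should produce identical digests; the only difference is how the {{MessageDigest}} instance is obtained, which is the overhead the fairer comparison tries to factor out.
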

Updated results:

{noformat}
     [java] Benchmark                            (bufferSize)  Mode  Cnt     Score     Error  Units
     [java] HashingBench.benchHasherMD5                    31  avgt    5   340.186 ±  54.611  ns/op
     [java] HashingBench.benchHasherMD5                   131  avgt    5   708.117 ±  42.826  ns/op
     [java] HashingBench.benchHasherMD5                   517  avgt    5  1801.402 ±  47.358  ns/op
     [java] HashingBench.benchHasherMD5                  2041  avgt    5  6294.723 ± 518.325  ns/op
     [java] HashingBench.benchHasherMurmur3_128            31  avgt    5   286.312 ±  65.617  ns/op
     [java] HashingBench.benchHasherMurmur3_128           131  avgt    5   429.138 ±  36.589  ns/op
     [java] HashingBench.benchHasherMurmur3_128           517  avgt    5   908.452 ±  27.860  ns/op
     [java] HashingBench.benchHasherMurmur3_128          2041  avgt    5  2830.657 ± 225.470  ns/op
     [java] HashingBench.benchMessageDigestMD5             31  avgt    5   484.350 ± 474.141  ns/op
     [java] HashingBench.benchMessageDigestMD5            131  avgt    5  1059.691 ±  53.677  ns/op
     [java] HashingBench.benchMessageDigestMD5            517  avgt    5  2557.586 ± 319.597  ns/op
     [java] HashingBench.benchMessageDigestMD5           2041  avgt    5  8585.662 ± 135.474  ns/op
{noformat}

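Either way, the Guava MD5 hasher is faster than {{MessageDigest}} at every 
buffer size tested, although the 31-byte {{MessageDigest}} result is noisy 
(± 474.141 ns/op).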

In other news, the Guava MD5 implementation uses {{MessageDigest}} under the 
covers, so I think the hash results from the Guava MD5 and 
{{MessageDigest}} should be the same. [~mkjellman], can you confirm?

> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across 
> threads being spent inside the MD5 MessageDigest implementation. Given that 
> there are tons of modern alternative hashing functions better than MD5 
> available -- both in terms of providing better collision resistance and 
> actual computational speed -- I wanted to switch out our usage of MD5 for 
> alternatives (like adler128 or murmur3_128) and test for performance 
> improvements.
> Unfortunately, because we use MessageDigest everywhere, I found that 
> switching the hashing function to something like adler128 or murmur3_128 
> (neither of which ships with the JDK) wasn't straightforward.
> The goal of this ticket is to propose switching out usages of MessageDigest 
> directly in favor of Hasher from Guava. This means going forward we can 
> change a single line of code to switch the hashing algorithm being used 
> (assuming there is an implementation in Guava).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
