[ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183505#comment-16183505 ]
Jason Brown commented on CASSANDRA-13291:
-----------------------------------------

A [slightly fairer|https://github.com/jasobrown/cassandra/commit/e57bc21903687dfed573ff427ad4eeededac41a9] comparison, wherein I call {{MessageDigest#clone()}} on a prototype instance each time, instead of creating a fresh instance via {{MessageDigest.getInstance("MD5")}}. Updated results:

{noformat}
[java] Benchmark                              (bufferSize)  Mode  Cnt     Score     Error  Units
[java] HashingBench.benchHasherMD5                      31  avgt    5   340.186 ±  54.611  ns/op
[java] HashingBench.benchHasherMD5                     131  avgt    5   708.117 ±  42.826  ns/op
[java] HashingBench.benchHasherMD5                     517  avgt    5  1801.402 ±  47.358  ns/op
[java] HashingBench.benchHasherMD5                    2041  avgt    5  6294.723 ± 518.325  ns/op
[java] HashingBench.benchHasherMurmur3_128              31  avgt    5   286.312 ±  65.617  ns/op
[java] HashingBench.benchHasherMurmur3_128             131  avgt    5   429.138 ±  36.589  ns/op
[java] HashingBench.benchHasherMurmur3_128             517  avgt    5   908.452 ±  27.860  ns/op
[java] HashingBench.benchHasherMurmur3_128            2041  avgt    5  2830.657 ± 225.470  ns/op
[java] HashingBench.benchMessageDigestMD5               31  avgt    5   484.350 ± 474.141  ns/op
[java] HashingBench.benchMessageDigestMD5              131  avgt    5  1059.691 ±  53.677  ns/op
[java] HashingBench.benchMessageDigestMD5              517  avgt    5  2557.586 ± 319.597  ns/op
[java] HashingBench.benchMessageDigestMD5             2041  avgt    5  8585.662 ± 135.474  ns/op
{noformat}

Either way, the Guava hasher is faster.

In other news, the Guava MD5 implementation uses {{MessageDigest}} under the covers, so I think the hash results from the Guava MD5 hasher and from {{MessageDigest}} should be the same. [~mkjellman] can you confirm?

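For readers unfamiliar with the prototype approach, here is a minimal sketch of the two ways of obtaining an MD5 digest that the comparison contrasts: looking up a fresh instance on every call versus copying a shared, never-updated prototype instance. The class and method names ({{DigestCloneSketch}}, {{hashWithFreshInstance}}, {{hashWithClone}}) are hypothetical and are not taken from the linked commit; the sketch only assumes that the JDK's MD5 provider supports cloning, which the standard one does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration; not the code from the benchmark commit.
public class DigestCloneSketch {
    // Shared prototype. It is never updated, so every copy starts from a clean state.
    private static final MessageDigest PROTOTYPE = newMd5();

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Baseline: pays the provider lookup cost on every call.
    public static byte[] hashWithFreshInstance(byte[] input) {
        return newMd5().digest(input);
    }

    // Alternative: copies the prototype instead of going through getInstance().
    public static byte[] hashWithClone(byte[] input) {
        try {
            MessageDigest md = (MessageDigest) PROTOTYPE.clone();
            return md.digest(input);
        } catch (CloneNotSupportedException e) {
            // Only thrown if the provider's implementation is not cloneable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] input = "cassandra".getBytes(StandardCharsets.UTF_8);
        // Both paths should yield the same 16-byte MD5 digest.
        System.out.println(Arrays.equals(hashWithFreshInstance(input), hashWithClone(input)));
    }
}
```

Both methods compute the same digest; the only difference is how the {{MessageDigest}} object is obtained, which is the cost the fairer benchmark tries to factor out.
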
> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across
> threads being spent inside the MD5 MessageDigest implementation. Given that
> there are tons of modern alternative hashing functions better than MD5
> available -- both in terms of providing better collision resistance and
> actual computational speed -- I wanted to switch out our usage of MD5 for
> alternatives (like adler128 or murmur3_128) and test for performance
> improvements.
> Unfortunately, I found given the fact we use MessageDigest everywhere --
> switching out the hashing function to something like adler128 or murmur3_128
> (for example) -- which don't ship with the JDK -- wasn't straight forward.
> The goal of this ticket is to propose switching out usages of MessageDigest
> directly in favor of Hasher from Guava. This means going forward we can
> change a single line of code to switch the hashing algorithm being used
> (assuming there is an implementation in Guava).