[ https://issues.apache.org/jira/browse/CASSANDRA-13292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840218#comment-16840218 ]
Benedict commented on CASSANDRA-13292:
--------------------------------------

https://github.com/rurban/smhasher

> Replace MessagingService usage of MD5 with something more modern
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-13292
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13292
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Legacy/Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>            Priority: Normal
>         Attachments: quorum-concurrency-reads-quorum.svg
>
>
> While profiling C* via multiple profilers, I've consistently seen a
> significant amount of time being spent calculating MD5 digests.
> {code}
> Stack Trace   Sample Count   Percentage(%)
> sun.security.provider.MD5.implCompress(byte[], int)   264   1.566
> sun.security.provider.DigestBase.implCompressMultiBlock(byte[], int, int)   200   1.187
> sun.security.provider.DigestBase.engineUpdate(byte[], int, int)   200   1.187
> java.security.MessageDigestSpi.engineUpdate(ByteBuffer)   200   1.187
> java.security.MessageDigest$Delegate.engineUpdate(ByteBuffer)   200   1.187
> java.security.MessageDigest.update(ByteBuffer)   200   1.187
> org.apache.cassandra.db.Column.updateDigest(MessageDigest)   193   1.145
> org.apache.cassandra.db.ColumnFamily.updateDigest(MessageDigest)   193   1.145
> org.apache.cassandra.db.ColumnFamily.digest(ColumnFamily)   193   1.145
> org.apache.cassandra.service.RowDigestResolver.resolve()   106   0.629
> org.apache.cassandra.service.RowDigestResolver.resolve()   106   0.629
> org.apache.cassandra.service.ReadCallback.get()   88   0.522
> org.apache.cassandra.service.AbstractReadExecutor.get()   88   0.522
> org.apache.cassandra.service.StorageProxy.fetchRows(List, ConsistencyLevel)   88   0.522
> org.apache.cassandra.service.StorageProxy.read(List, ConsistencyLevel)   88   0.522
> org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(int, ConsistencyLevel, boolean)   88   0.522
> org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(int)   88   0.522
> org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(int)   88   0.522
> org.apache.cassandra.cql3.statements.SelectStatement.execute(QueryState, QueryOptions)   88   0.522
> org.apache.cassandra.cql3.statements.SelectStatement.execute(QueryState, QueryOptions)   88   0.522
> org.apache.cassandra.cql3.QueryProcessor.processStatement(CQLStatement, QueryState, QueryOptions)   88   0.522
> org.apache.cassandra.cql3.QueryProcessor.process(String, QueryState, QueryOptions)   88   0.522
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryState)   88   0.522
> org.apache.cassandra.transport.Message$Dispatcher.messageReceived(ChannelHandlerContext, MessageEvent)   88   0.522
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(ChannelHandlerContext, ChannelEvent)   88   0.522
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline$DefaultChannelHandlerContext, ChannelEvent)   88   0.522
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(ChannelEvent)   88   0.522
> org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun()   88   0.522
> org.jboss.netty.handler.execution.ChannelEventRunnable.run()   88   0.522
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)   88   0.522
> java.util.concurrent.ThreadPoolExecutor$Worker.run()   88   0.522
> java.lang.Thread.run()   88   0.522
> {code}
> Pending CASSANDRA-13291, it would be pretty easy to:
> # Switch out the hashing implementation from MD5 to implementations such as
> adler128 and murmur3_128 (but certainly not limited to) and do some profiling
> to compare the net improvement on latencies and CPU usage
> # As we can't switch the algorithm from MD5 without breaking things, we could
> rev the MessagingService protocol version -- like we already do for things
> like switching from Snappy compression -> LZ4, we could switch to the new
> hashing implementation once all peers in the node are upgraded and support
> the new MessagingService version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
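
For illustration only, the version-gated switch proposed in the issue description could look roughly like the sketch below. This is not Cassandra code: the class name, the version constants, and the helper methods are all hypothetical, and the JDK's 32-bit java.util.zip.Adler32 is used purely as a stand-in for a faster non-cryptographic hash, since murmur3_128 is not part of the JDK. The idea shown is only that every node keeps producing MD5 digests until the lowest version among its peers supports the new algorithm.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.Adler32;

// Hypothetical sketch; none of these names come from the Cassandra code base.
class VersionedDigest {
    // Assumed version numbers: peers below VERSION_FAST_HASH only understand MD5.
    static final int VERSION_MD5 = 1;
    static final int VERSION_FAST_HASH = 2;

    // The digest algorithm must be understood by every peer, so use the
    // lowest version any peer reports, capped at what this node supports.
    static int negotiatedVersion(int[] peerVersions) {
        int min = VERSION_FAST_HASH;
        for (int v : peerVersions) {
            min = Math.min(min, v);
        }
        return min;
    }

    // Digest the payload with the algorithm implied by the negotiated version.
    static byte[] digest(byte[] payload, int version) {
        if (version >= VERSION_FAST_HASH) {
            // Stand-in for a faster hash (assumption): Adler-32 from the JDK,
            // widened to 8 bytes because getValue() returns a long.
            Adler32 adler = new Adler32();
            adler.update(payload, 0, payload.length);
            return ByteBuffer.allocate(Long.BYTES).putLong(adler.getValue()).array();
        }
        try {
            // Legacy path: MD5, as in the profiled stack trace.
            return MessageDigest.getInstance("MD5").digest(payload);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With one peer still on the old version, negotiatedVersion returns VERSION_MD5 and all digests stay MD5; only once every peer reports VERSION_FAST_HASH does the faster path get used, so replicas never compare digests produced by different algorithms.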