[ https://issues.apache.org/jira/browse/CASSANDRA-11974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edward Capriolo reassigned CASSANDRA-11974:
-------------------------------------------

    Assignee: Edward Capriolo

> Failed assert causes OutboundTcpConnection to exit
> --------------------------------------------------
>
>                 Key: CASSANDRA-11974
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11974
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Sean Thornton
>            Assignee: Edward Capriolo
>
> I am seeing the following in a client's cluster:
> {noformat}
> ERROR [MessagingService-Outgoing-/10.0.0.1] 2016-06-06 03:38:19,305 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Outgoing-/10.0.0.1,5,main]
> java.lang.AssertionError: 635174
>     at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:290) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.composites.AbstractCType$Serializer.serialize(AbstractCType.java:392) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.composites.AbstractCType$Serializer.serialize(AbstractCType.java:381) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.filter.ColumnSlice$Serializer.serialize(ColumnSlice.java:271) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.filter.ColumnSlice$Serializer.serialize(ColumnSlice.java:259) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.filter.SliceQueryFilter$Serializer.serialize(SliceQueryFilter.java:503) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.filter.SliceQueryFilter$Serializer.serialize(SliceQueryFilter.java:490) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.SliceFromReadCommandSerializer.serialize(SliceFromReadCommand.java:168) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.ReadCommandSerializer.serialize(ReadCommand.java:143) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.db.ReadCommandSerializer.serialize(ReadCommand.java:132) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:121) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:330) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:282) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
>     at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> {noformat}
> Obviously they somehow exceeded a 64K limit (quick and dirty suspects -
> https://docs.datastax.com/en/cql/3.1/cql/cql_reference/refLimits.html) but
> that is neither here nor there.
>
> The problem I see when this happens is
> {{ByteBufferUtil.writeWithShortLength}} can throw a
> {{java.lang.AssertionError}} which is a true {{Error}} that bubbles up and
> totally bypasses the {{catch (Exception e)}} clause in the message processing
> loop in {{OutboundTcpConnection.run()}} _which causes the thread to exit and
> that node to no longer communicate outgoing messages to other nodes_.
>
> At least from my perspective, there are two things I would like to see
> handled differently -
> * In the event of _any_ problem, I would like to see whatever details
> possible be logged about the problem Message - partition key, CF data,
> anything. Right now it can be very difficult to track this down
> * The {{java.lang.Error}} possibility needs to be handled somehow. If it's
> an assertion error, it seems like we could continue the processing loop. But
> shutting down the JVM would be better than what I get now.
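
The failure mode in the report can be reproduced outside Cassandra. What follows is a minimal Java sketch under stated assumptions, not Cassandra's code: the class `SendLoopSketch`, both `drain...` methods, `sampleBacklog`, and the local `writeWithShortLength` stand-in are hypothetical names introduced for illustration. The stand-in throws `AssertionError` explicitly so the demo does not depend on the JVM running with `-ea`. It contrasts a loop that catches only `Exception` with one that also catches `AssertionError`, logs what it knows about the offending message, and carries on, which is one possible reading of the two requests in the report.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Illustrative sketch only; none of these names are Cassandra's.
final class SendLoopSketch {
    static final int MAX_UNSIGNED_SHORT = 0xFFFF;

    // Stand-in for a short-length writer: rejects values over 64K with an
    // AssertionError, thrown explicitly so the demo does not depend on -ea.
    static void writeWithShortLength(byte[] value) {
        if (value.length > MAX_UNSIGNED_SHORT) {
            throw new AssertionError(value.length);
        }
    }

    // Loop shaped like the one in the report: only Exception is caught, so an
    // AssertionError escapes and the rest of the backlog is never sent.
    static int drainCatchingException(Queue<byte[]> backlog) {
        int sent = 0;
        while (!backlog.isEmpty()) {
            byte[] message = backlog.poll();
            try {
                writeWithShortLength(message);
                sent++;
            } catch (Exception e) {
                System.err.println("send failed: " + e);
            }
        }
        return sent;
    }

    // Variant that also catches AssertionError, logs the size of the
    // offending message, drops it, and keeps processing the backlog.
    static int drainCatchingAssertionError(Queue<byte[]> backlog) {
        int sent = 0;
        while (!backlog.isEmpty()) {
            byte[] message = backlog.poll();
            try {
                writeWithShortLength(message);
                sent++;
            } catch (Exception | AssertionError e) {
                System.err.println("dropping message of " + message.length + " bytes: " + e);
            }
        }
        return sent;
    }

    // Two small messages around one oversized message (size taken from the log above).
    static Queue<byte[]> sampleBacklog() {
        return new ArrayDeque<>(Arrays.asList(new byte[10], new byte[635174], new byte[10]));
    }

    public static void main(String[] args) {
        System.out.println("sent = " + drainCatchingAssertionError(sampleBacklog()));
        drainCatchingException(sampleBacklog()); // throws AssertionError out of the loop
    }
}
```

In the first variant the third message is left in the queue when the error escapes, which mirrors the stalled outgoing connection described above. Whether swallowing an `AssertionError` is acceptable, as opposed to shutting down the JVM, is a policy decision the sketch does not settle.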