[ https://issues.apache.org/jira/browse/CASSANDRA-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226377#comment-15226377 ]
Benjamin Lerer commented on CASSANDRA-11393:
--------------------------------------------

Based on the different stack traces, I believe we actually have 2 different scenarios:
# When the assertion is thrown by the {{LegacyReadCommandSerializer}}, the coordinator thought, at the time the message was created, that the replica was on version 2.1, and then discovered before serializing the message that the replica had been upgraded to version 3.0.
# When the assertion is thrown by the {{ReadCommandSerializer}}, the coordinator thought, at the time the message was created, that the replica was on version 3.0, and then discovered before serializing the message that the replica is in fact a 2.1 node. My guess is that this case could happen if the coordinator has just restarted and the message is created just after the endpoint has been added but before its version is set ({{MessagingService}} returns the current version if it has no version associated with the endpoint).
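To make the two scenarios concrete, here is a minimal sketch of the suspected race. It is not the actual Cassandra code: the class {{VersionRaceSketch}}, its methods and the version constants are all hypothetical stand-ins, and the only behaviour taken from the description above is that an endpoint with no recorded version is treated as being on the current version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the coordinator's per-endpoint version tracking.
// None of these names or constants come from the Cassandra code base.
final class VersionRaceSketch {
    static final int VERSION_21 = 21; // illustrative values only
    static final int VERSION_30 = 30;
    static final int CURRENT_VERSION = VERSION_30;

    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    // Assumed behaviour: fall back to the current version when none is known.
    int getVersion(String endpoint) {
        return versions.getOrDefault(endpoint, CURRENT_VERSION);
    }

    void setVersion(String endpoint, int version) {
        versions.put(endpoint, version);
    }

    // A message remembers which wire format was chosen when it was created.
    static final class Message {
        final boolean legacy;
        Message(boolean legacy) { this.legacy = legacy; }
    }

    // First version lookup: happens at message creation.
    Message createMessage(String endpoint) {
        return new Message(getVersion(endpoint) < VERSION_30);
    }

    // Second version lookup: happens just before serialization.
    int serializedSize(Message message, String endpoint) {
        boolean legacyNow = getVersion(endpoint) < VERSION_30;
        if (message.legacy != legacyNow)
            throw new AssertionError("replica version changed between creation and serialization");
        return legacyNow ? 1 : 2; // dummy sizes
    }

    // Scenario 1: replica believed to be on 2.1, then seen as upgraded to 3.0.
    static boolean failsWhenReplicaUpgradedMidFlight() {
        VersionRaceSketch service = new VersionRaceSketch();
        service.setVersion("replica", VERSION_21);
        Message message = service.createMessage("replica");
        service.setVersion("replica", VERSION_30);
        return throwsAssertion(service, message);
    }

    // Scenario 2: version not yet recorded, so 3.0 is assumed; replica is really 2.1.
    static boolean failsWhenVersionNotYetKnown() {
        VersionRaceSketch service = new VersionRaceSketch();
        Message message = service.createMessage("replica");
        service.setVersion("replica", VERSION_21);
        return throwsAssertion(service, message);
    }

    private static boolean throwsAssertion(VersionRaceSketch service, Message message) {
        try {
            service.serializedSize(message, "replica");
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWhenReplicaUpgradedMidFlight());
        System.out.println(failsWhenVersionNotYetKnown());
    }
}
```

In both scenarios the failure comes from looking up the replica version twice and getting two different answers. If the analysis above is right, one possible direction would be to resolve the version once and carry it along with the message, but that is only a suggestion and not a tested fix.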
> dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV3Upgrade_2_1_UpTo_3_0_HEAD.rolling_upgrade_test
> ------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11393
> Project: Cassandra
> Issue Type: Bug
> Reporter: Philip Thompson
> Assignee: Benjamin Lerer
> Labels: dtest
>
> We are seeing a failure in the upgrade tests that go from 2.1 to 3.0
> {code}
> node2: ERROR [SharedPool-Worker-2] 2016-03-10 20:05:17,865 Message.java:611 - Unexpected exception during request; channel = [id: 0xeb79b477, /127.0.0.1:39613 => /127.0.0.2:9042]
> java.lang.AssertionError: null
>     at org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1208) ~[main/:na]
>     at org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1155) ~[main/:na]
>     at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) ~[main/:na]
>     at org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:72) ~[main/:na]
>     at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:609) ~[main/:na]
>     at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:758) ~[main/:na]
>     at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:701) ~[main/:na]
>     at org.apache.cassandra.net.MessagingService.sendRRWithFailure(MessagingService.java:684) ~[main/:na]
>     at org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:110) ~[main/:na]
>     at org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85) ~[main/:na]
>     at org.apache.cassandra.service.AbstractReadExecutor$AlwaysSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:330) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1699) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1654) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1601) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1520) ~[main/:na]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.execute(SinglePartitionReadCommand.java:302) ~[main/:na]
>     at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:67) ~[main/:na]
>     at org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34) ~[main/:na]
>     at org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:297) ~[main/:na]
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:333) ~[main/:na]
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[main/:na]
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76) ~[main/:na]
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206) ~[main/:na]
>     at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:472) ~[main/:na]
>     at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:449) ~[main/:na]
>     at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) ~[main/:na]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [main/:na]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [main/:na]
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_51]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [main/:na]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> example failure:
> http://cassci.datastax.com/job/upgrade_tests-all/24/testReport/upgrade_tests.upgrade_through_versions_test/ProtoV3Upgrade_2_1_UpTo_3_0_HEAD/rolling_upgrade_test
> Failed on CassCI build upgrade_tests-all #24
> The stack trace and context match that of CASSANDRA-10122. It looks like it may be the same issue.