[ https://issues.apache.org/jira/browse/CASSANDRA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-10068:
---------------------------------------
    Fix Version/s: 3.0.0 rc1

> Batchlog replay fails with exception after a node is decommissioned
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-10068
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10068
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Joel Knighton
>            Assignee: Marcus Eriksson
>             Fix For: 3.0.0 rc1
>
>         Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test of materialized views that
> crashes and decommissions nodes throughout the test.
>
> At the conclusion of the test, a batchlog replay is initiated through
> nodetool and hits the following assertion due to a missing host ID:
> https://github.com/apache/cassandra/blob/3413e557b95d9448b0311954e9b4f53eaf4758cd/src/java/org/apache/cassandra/service/StorageProxy.java#L1197
>
> A nodetool status on the node with failed batchlog replay shows the following
> entry for the decommissioned node:
>
> DN  10.0.0.5  ?  256  ?  null  rack1
>
> On the unaffected nodes, there is no entry for the decommissioned node as
> expected.
>
> There are occasional hits of the same assertions for logs in other nodes; it
> looks like the issue might occasionally resolve itself, but one node seems to
> have the errant null entry indefinitely.
>
> In logs for the nodes, this possibly unrelated exception also appears:
>
> java.lang.RuntimeException: Trying to get the view natural endpoint on a non-data replica
>     at org.apache.cassandra.db.view.MaterializedViewUtils.getViewNaturalEndpoint(MaterializedViewUtils.java:91) ~[apache-cassandra-3.0.0-alpha1-SNAPSHOT.jar:3.0.0-alpha1-SNAPSHOT]
>
> I have a running cluster with the issue on my machine; it is also repeatable.
>
> Nothing stands out in the logs of the decommissioned node (n4) for me. The
> logs of each node in the cluster are attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)