[ https://issues.apache.org/jira/browse/CASSANDRA-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981114#comment-14981114 ]
Paulo Motta commented on CASSANDRA-9912:
----------------------------------------

Created a simple [dtest|https://github.com/pauloricardomg/cassandra-dtest/blob/7ef7520206d2ebc355f5ae4d0e64dba64481e057/topology_test.py#L32] reproducing the issue.

Basically, when the node is decommissioned its tokens are wiped from {{TokenMetadata}} but not from the system keyspace, so {{SizeEstimatesRecorder}} tries to fetch the primary ranges for the decommissioned local node's tokens, which are no longer in the ring.

The simple fix is to not run {{SizeEstimatesRecorder}} when the node is not a member of the ring. This generalizes the check added by CASSANDRA-9034, which skips {{SizeEstimatesRecorder}} when the node has never joined the ring. To guarantee that this generalization does not cause a regression of CASSANDRA-9034, I also added a [dtest|https://github.com/pauloricardomg/cassandra-dtest/blob/7ef7520206d2ebc355f5ae4d0e64dba64481e057/topology_test.py#L17] that reproduces that issue.

I also removed some related dead code from {{StorageService}} and {{SystemKeyspace}}.
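To illustrate the failure mode and the guard described above, here is a minimal, self-contained sketch. It is a toy model, not the actual patch: {{ToyRing}}, {{recordEstimates}} and the node names are hypothetical stand-ins for {{TokenMetadata}}, {{SizeEstimatesRecorder}} and the cluster, and the saved token list plays the role of the tokens left behind in the system keyspace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names are Cassandra classes or methods.
public class RingMembershipSketch {

    /** Stand-in for the in-memory ring: maps each token to the node that owns it. */
    static final class ToyRing {
        private final TreeMap<Long, String> tokenToNode = new TreeMap<>();

        void addNode(String node, long... tokens) {
            for (long t : tokens) tokenToNode.put(t, node);
        }

        /** Models decommission: the node's tokens are wiped from the in-memory ring. */
        void removeNode(String node) {
            tokenToNode.values().removeIf(node::equals);
        }

        boolean isMember(String node) {
            return tokenToNode.containsValue(node);
        }

        /** Fails when the token is not in the ring, similar to the reported error. */
        long predecessor(long token) {
            if (!tokenToNode.containsKey(token))
                throw new AssertionError(token + " not found in " + tokenToNode.keySet());
            Long lower = tokenToNode.lowerKey(token);
            // Wrap around to the largest token when there is no smaller one.
            return lower != null ? lower : tokenToNode.lastKey();
        }
    }

    /**
     * Computes the primary ranges for the locally saved tokens as "(start, end]" strings.
     * Returns an empty list when the local node is not a member of the ring.
     */
    static List<String> recordEstimates(ToyRing ring, String localNode, List<Long> savedTokens) {
        List<String> ranges = new ArrayList<>();
        if (!ring.isMember(localNode))
            return ranges; // generalized guard: skip the run entirely
        for (long token : savedTokens)
            ranges.add("(" + ring.predecessor(token) + ", " + token + "]");
        return ranges;
    }

    public static void main(String[] args) {
        ToyRing ring = new ToyRing();
        ring.addNode("node1", Long.MIN_VALUE);
        ring.addNode("node2", -9223372036854775798L);
        ring.addNode("node3", 10L);
        // Tokens still persisted locally after decommission (assumed for the sketch).
        List<Long> saved = Collections.singletonList(Long.MIN_VALUE);

        System.out.println("while a member: " + recordEstimates(ring, "node1", saved));

        ring.removeNode("node1");
        System.out.println("after decommission: " + recordEstimates(ring, "node1", saved));

        try {
            ring.predecessor(Long.MIN_VALUE); // what an unguarded run would hit
        } catch (AssertionError e) {
            System.out.println("unguarded lookup fails: " + e.getMessage());
        }
    }
}
```

A single membership check covers both the never-joined case from CASSANDRA-9034 and the decommissioned case here, since in both the local node has no tokens in the ring.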
Test results will be available shortly:

||2.1||2.2||3.0||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-9912]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-9912]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-9912]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-9912]|[PR|https://github.com/riptano/cassandra-dtest/pull/637]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-9912-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-9912-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-9912-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-9912-testall/lastCompletedBuild/testReport/]|
|[dtests|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-9912-dtest/lastCompletedBuild/testReport/]|[dtests|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-9912-dtest/lastCompletedBuild/testReport/]|[dtests|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-9912-dtest/lastCompletedBuild/testReport/]|[dtests|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-9912-dtest/lastCompletedBuild/testReport/]|

> SizeEstimatesRecorder has assertions after decommission sometimes
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-9912
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9912
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeremiah Jordan
>            Assignee: Paulo Motta
>             Fix For: 2.1.12
>
>
> Doing some testing with 2.1.8 adding and decommissioning nodes. Sometimes after decommissioning the following starts being thrown by the SizeEstimatesRecorder.
> {noformat}
> java.lang.AssertionError: -9223372036854775808 not found in -9223372036854775798, 10
>     at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:683) ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
>     at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:627) ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
>     at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:68) ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
>     at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_40]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_40]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_40]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_40]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)