[ https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949608#comment-15949608 ]
Paulo Motta commented on CASSANDRA-13229:
-----------------------------------------

Nice catch! I'm afraid we can't fall back to splitting the token ranges evenly, given that a single vnode range is expected not to span more than 1 disk (CASSANDRA-6696).

Actually, in this specific case it's the system keyspace, which spans the whole token range, so we could probably split the token ranges evenly (and probably should, for better distribution). But when the {{dontSplitRanges}} flag is passed we should always assign at least 1 vnode range per disk, even if one of the disks becomes unbalanced (cases like this will become very rare after CASSANDRA-7032, but we should still protect against them).

Although this will probably only happen in rare cases, when the token ranges are unbalanced and the vnode-to-disk ratio is low, we can probably tweak the {{splitOwnedRangesNoPartialRanges}} algorithm to only add more ranges to the current disk if the number of remaining tokens is greater than the number of remaining parts.

Does this sound reasonable, or can you think of a simpler/better approach, [~krummas]?
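To make the proposed guard concrete, here is a rough, standalone sketch of the idea. It is not the actual {{Splitter}} code: the class name, the method {{splitWholeRanges}}, the use of plain {{long}} sizes instead of token ranges, and the balancing heuristic are all assumptions made for illustration. The only part taken from the comment above is the rule that the current disk may take another range only while more ranges remain than parts still to be filled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not org.apache.cassandra.dht.Splitter.
final class NoPartialRangeSplitSketch {

    /**
     * Assigns whole ranges (given by their sizes, in ring order) to disks.
     * Returns, for each disk, the index of the last range it owns.
     * Assumption: with fewer ranges than disks, only as many disks as
     * there are ranges receive a range.
     */
    static List<Integer> splitWholeRanges(long[] rangeSizes, int parts) {
        if (parts < 1) throw new IllegalArgumentException("parts must be >= 1");
        List<Integer> lastIndexPerPart = new ArrayList<>();
        int n = rangeSizes.length;
        if (n == 0) return lastIndexPerPart;

        int effectiveParts = Math.min(parts, n);
        long total = 0;
        for (long size : rangeSizes) total += size;
        long perPart = total / effectiveParts; // ideal amount per disk

        int next = 0; // index of the next unassigned range
        for (int part = 0; part < effectiveParts - 1; part++) {
            int partsAfterThis = effectiveParts - part - 1;
            // Every disk gets at least one whole range.
            long sum = rangeSizes[next++];
            // Guard from the comment: take another range only if more ranges
            // remain than parts still to fill, and (illustrative heuristic)
            // only if doing so moves this disk closer to the ideal size.
            while (n - next > partsAfterThis
                   && Math.abs(sum + rangeSizes[next] - perPart) < Math.abs(sum - perPart)) {
                sum += rangeSizes[next++];
            }
            lastIndexPerPart.add(next - 1);
        }
        // The last disk takes whatever is left.
        lastIndexPerPart.add(n - 1);
        return lastIndexPerPart;
    }

    public static void main(String[] args) {
        // Unbalanced ranges, low range-to-disk ratio: each disk still gets one.
        System.out.println(splitWholeRanges(new long[] {10, 10, 80}, 3)); // [0, 1, 2]
        // Balanced ranges: two per disk.
        System.out.println(splitWholeRanges(new long[] {10, 10, 10, 10, 10, 10}, 3)); // [1, 3, 5]
    }
}
```

In the first call the balancing heuristic alone would give the first disk two ranges and leave the third disk with none; the guard stops that, at the cost of an unbalanced last disk, which matches the trade-off described in the comment.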
> dtest failure in topology_test.TestTopology.size_estimates_multidc_test
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-13229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13229
> Project: Cassandra
> Issue Type: Bug
> Components: Testing
> Reporter: Sean McCarthy
> Assignee: Alex Petrov
> Labels: dtest, test-failure
> Fix For: 4.0
>
> Attachments: node1_debug.log, node1_gc.log, node1.log, node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test
> {code}
> Standard Output
> Unexpected error in node1 log, error:
> ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 - Exception in thread Thread[MemtablePostFlush:1,5,main]
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>     at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) ~[main/:na]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) ~[main/:na]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) [main/:na]
>     at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown Source) [main/:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Unexpected error in node1 log, error:
> ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown Source) ~[na:na]
>     at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45]
>     at org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) ~[main/:na]
>     at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na]
>     at org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) ~[main/:na]
>     at org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) ~[main/:na]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_45]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) [main/:na]
>     at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown Source) [main/:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_45]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_45]
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:397) ~[main/:na]
>     ... 16 common frames omitted
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>     at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) ~[main/:na]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) ~[main/:na]
>     ... 5 common frames omitted
> Unexpected error in node1 log, error:
> ERROR [main] 2017-02-15 16:07:33,857 CassandraDaemon.java:663 - Exception encountered during startup
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) ~[main/:na]
>     at org.apache.cassandra.schema.MigrationManager.announce(MigrationManager.java:384) ~[main/:na]
>     at org.apache.cassandra.schema.MigrationManager.announceNewKeyspace(MigrationManager.java:176) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.maybeAddKeyspace(StorageService.java:1066) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.maybeAddOrUpdateKeyspace(StorageService.java:1091) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.doAuthSetup(StorageService.java:1048) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.finishJoiningRing(StorageService.java:1043) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:966) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:649) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:581) ~[main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:364) [main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) [main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:646) [main/:na]
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_45]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_45]
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:397) ~[main/:na]
>     ... 12 common frames omitted
> Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown Source) ~[na:na]
>     at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45]
>     at org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) ~[main/:na]
>     at org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) ~[main/:na]
>     at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na]
>     at org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) ~[main/:na]
>     at org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) ~[main/:na]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_45]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_45]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) ~[main/:na]
>     at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown Source) ~[na:na]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]
> Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_45]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_45]
>     at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:397) ~[main/:na]
>     ... 16 common frames omitted
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>     at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>     at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) ~[main/:na]
>     at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) ~[main/:na]
>     at org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) ~[main/:na]
>     at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) ~[main/:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) ~[main/:na]
>     ... 5 common frames omitted
> Unexpected error in node1 log, error:
> ERROR [StorageServiceShutdownHook] 2017-02-15 16:07:35,972 AbstractCommitLogSegmentManager.java:311 - Failed to force-recycle all segments; at least one segment is still in use with dirty CFs.
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)