[ https://issues.apache.org/jira/browse/CASSANDRA-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ke Han updated CASSANDRA-18644: ------------------------------- Description: h2. Symptoms I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I encountered the following error message when I drained and shutdown the 3.11.15 node. {code:java} INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 - Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 - Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 - Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552 MigrationManager.java:362 - Drop Keyspace 'uuidff0deb8dbf894fb1b3f59c0eafb94467' INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223 IndexSummaryRedistribution.java:78 - Redistributing index summaries INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637 StorageService.java:1660 - DRAINING: starting drain process INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641 HintsService.java:210 - Paused hints dispatch INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Server.java:179 - Stop listening for CQL clients INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Gossiper.java:1720 - Announcing shutdown INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648 StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652 MessagingService.java:985 - Waiting for messaging service to quiesce INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655 MessagingService.java:1346 - MessagingService has terminated the accept() thread INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618 HintsService.java:210 - Paused hints dispatch INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835 StorageService.java:1660 - DRAINED ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 - Exception in thread Thread[MigrationStage:1,5,main] java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) at java.lang.Thread.run(Thread.java:750) ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839 QueryMessage.java:129 - Unexpected error during query java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) at org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) at org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) at org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) at org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) at java.lang.Thread.run(Thread.java:750) Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) ... 14 common frames omitted Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) ... 1 common frames omitted ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841 ErrorMessage.java:384 - Unexpected exception during request java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) at org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) at org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) at org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) at org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) at java.lang.Thread.run(Thread.java:750) Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) ... 14 common frames omitted Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) ... 1 common frames omitted INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326 CassandraDaemon.java:576 - Cassandra shutting down... {code} The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage thread. h2. Root Cause >From my preliminary analysis, this bug is related to thread interleaving >between # Thread trying to shutdown the Cassandra instance. # MigrationStage thread trying to recycle segments. if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in AbstractCommitLogSegmentManager will be empty. {code:java} // AbstractCommitLogSegmentManager.java /** Active segments, containing unflushed data. The tail of this queue is the one we allocate writes to */private final ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new ConcurrentLinkedQueue<>(); {code} At this time, if the migration thread tries to invoke the forceRecycleAll function, it throws _ArrayIndexOutOfBoundsException_ if there's no active segment. {code:java} // AbstractCommitLogSegmentManager.java void forceRecycleAll(Iterable<UUID> droppedCfs) { List<CommitLogSegment> segmentsToRecycle = new ArrayList<>(activeSegments); CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - 1); advanceAllocatingFrom(last); // wait for the commit log modifications last.waitForModifications(); {code} It seems that this ERROR won't impact the data integrity. But I am not sure whether it will cause some problems in the segment recycling process. I think this should be handled considering its ERROR level in system.log. It might confuse users whether their data is still in an integrity state and whether they can perform the migration from 3.x to 4.x. I have attached my full system.log and cassandra.yaml file. was: h2. Symptoms I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I encountered the following error message when I drained and shutdown the 3.11.15 node. {code:java} INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 - Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 - Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 - Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552 MigrationManager.java:362 - Drop Keyspace 'uuidff0deb8dbf894fb1b3f59c0eafb94467' INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223 IndexSummaryRedistribution.java:78 - Redistributing index summaries INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637 StorageService.java:1660 - DRAINING: starting drain process INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641 HintsService.java:210 - Paused hints dispatch INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Server.java:179 - Stop listening for CQL clients INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Gossiper.java:1720 - Announcing shutdown INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648 StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652 MessagingService.java:985 - Waiting for messaging service to quiesce INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655 MessagingService.java:1346 - MessagingService has terminated the accept() thread INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618 HintsService.java:210 - Paused hints dispatch INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835 StorageService.java:1660 - DRAINED ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 - Exception in thread Thread[MigrationStage:1,5,main] java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) at java.lang.Thread.run(Thread.java:750) ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839 QueryMessage.java:129 - Unexpected error during query java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) at org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) at org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) at org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) at org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) at java.lang.Thread.run(Thread.java:750) Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) ... 14 common frames omitted Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) ... 1 common frames omitted ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841 ErrorMessage.java:384 - Unexpected exception during request java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) at org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) at org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) at org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) at org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) at java.lang.Thread.run(Thread.java:750) Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) ... 14 common frames omitted Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.elementData(ArrayList.java:424) at java.util.ArrayList.get(ArrayList.java:437) at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) at org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) ... 1 common frames omitted INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326 CassandraDaemon.java:576 - Cassandra shutting down... {code} The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage thread. h2. Root Cause This bug is related to thread interleaving between # Thread trying to shutdown the Cassandra instance. # MigrationStage thread trying to recycle segments. if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in AbstractCommitLogSegmentManager will be empty. {code:java} // AbstractCommitLogSegmentManager.java /** Active segments, containing unflushed data. The tail of this queue is the one we allocate writes to */private final ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new ConcurrentLinkedQueue<>(); {code} At this time, if the migration thread tries to invoke the forceRecycleAll function, it throws ArrayIndexOutOfBoundsException if there's no active segment. {code:java} // AbstractCommitLogSegmentManager.java void forceRecycleAll(Iterable<UUID> droppedCfs) { List<CommitLogSegment> segmentsToRecycle = new ArrayList<>(activeSegments); CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - 1); advanceAllocatingFrom(last); // wait for the commit log modifications last.waitForModifications(); {code} It seems that this ERROR won't impact the data integrity. But I am not sure whether it will cause some problems in the segment recycling process. I think this should be handled considering its ERROR level. It might confuse users whether their data is still in an integrity state => whether they can perform the migration from 3.x to 4.x. I have attached my full system.log and cassandra.yaml file. > ERROR: ArrayIndexOutOfBoundsException: -1 in MigrationStage thread when > shutdown Cassandra-3.11.15 node > ------------------------------------------------------------------------------------------------------- > > Key: CASSANDRA-18644 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18644 > Project: Cassandra > Issue Type: Bug > Reporter: Ke Han > Priority: Normal > Attachments: cassandra.yaml, system.log > > > h2. Symptoms > I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I > encountered the following error message when I drained and shutdown the > 3.11.15 node. > {code:java} > INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 - > Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor > INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 - > Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi > INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 - > Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete > INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552 > MigrationManager.java:362 - Drop Keyspace > 'uuidff0deb8dbf894fb1b3f59c0eafb94467' > INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223 > IndexSummaryRedistribution.java:78 - Redistributing index summaries > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637 > StorageService.java:1660 - DRAINING: starting drain process > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641 > HintsService.java:210 - Paused hints dispatch > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 > Server.java:179 - Stop listening for CQL clients > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 > Gossiper.java:1720 - Announcing shutdown > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648 > StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652 > MessagingService.java:985 - Waiting for messaging service to quiesce > INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655 > MessagingService.java:1346 - MessagingService has terminated the accept() > thread > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618 > HintsService.java:210 - Paused hints dispatch > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835 > StorageService.java:1660 - DRAINED > ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > at java.lang.Thread.run(Thread.java:750) > ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839 > QueryMessage.java:129 - Unexpected error during query > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) > at > org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) > at > org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) > ... 14 common frames omitted > Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > ... 1 common frames omitted > ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841 > ErrorMessage.java:384 - Unexpected exception during request > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) > at > org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) > at > org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) > ... 14 common frames omitted > Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > ... 1 common frames omitted > INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326 > CassandraDaemon.java:576 - Cassandra shutting down... {code} > The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage > thread. > h2. Root Cause > From my preliminary analysis, this bug is related to thread interleaving > between > # Thread trying to shutdown the Cassandra instance. > # MigrationStage thread trying to recycle segments. > if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in > AbstractCommitLogSegmentManager will be empty. > {code:java} > // AbstractCommitLogSegmentManager.java > /** Active segments, containing unflushed data. The tail of this queue is the > one we allocate writes to */private final > ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new > ConcurrentLinkedQueue<>(); {code} > At this time, if the migration thread tries to invoke the forceRecycleAll > function, it throws _ArrayIndexOutOfBoundsException_ if there's no active > segment. > {code:java} > // AbstractCommitLogSegmentManager.java > void forceRecycleAll(Iterable<UUID> droppedCfs) > { > List<CommitLogSegment> segmentsToRecycle = new > ArrayList<>(activeSegments); > CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - > 1); > advanceAllocatingFrom(last); // wait for the commit log modifications > last.waitForModifications(); {code} > It seems that this ERROR won't impact the data integrity. But I am not sure > whether it will cause some problems in the segment recycling process. > I think this should be handled considering its ERROR level in system.log. It > might confuse users whether their data is still in an integrity state and > whether they can perform the migration from 3.x to 4.x. > I have attached my full system.log and cassandra.yaml file. > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org