[ https://issues.apache.org/jira/browse/CASSANDRA-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-18644: ----------------------------------------- Bug Category: Parent values: Correctness(12982)Level 1 values: Transient Incorrect Response(12987) Complexity: Normal Component/s: Cluster/Schema Discovered By: User Report Fix Version/s: 3.11.x Severity: Low Status: Open (was: Triage Needed) > ERROR: ArrayIndexOutOfBoundsException: -1 in MigrationStage thread when > shutdown Cassandra-3.11.15 node > ------------------------------------------------------------------------------------------------------- > > Key: CASSANDRA-18644 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18644 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Schema > Reporter: Ke Han > Priority: Normal > Fix For: 3.11.x > > Attachments: cassandra.yaml, system.log > > > h2. Symptoms > I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I > encountered the following error message when I drained and shutdown the > 3.11.15 node. > {code:java} > INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 - > Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor > INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 - > Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi > INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 - > Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete > INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552 > MigrationManager.java:362 - Drop Keyspace > 'uuidff0deb8dbf894fb1b3f59c0eafb94467' > INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223 > IndexSummaryRedistribution.java:78 - Redistributing index summaries > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637 > StorageService.java:1660 - DRAINING: starting drain process > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641 > HintsService.java:210 - Paused hints dispatch > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 > Server.java:179 - Stop listening for CQL clients > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 > Gossiper.java:1720 - Announcing shutdown > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648 > StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652 > MessagingService.java:985 - Waiting for messaging service to quiesce > INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655 > MessagingService.java:1346 - MessagingService has terminated the accept() > thread > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618 > HintsService.java:210 - Paused hints dispatch > INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835 > StorageService.java:1660 - DRAINED > ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > at java.lang.Thread.run(Thread.java:750) > ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839 > QueryMessage.java:129 - Unexpected error during query > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) > at > org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) > at > org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) > ... 14 common frames omitted > Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > ... 1 common frames omitted > ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841 > ErrorMessage.java:384 - Unexpected exception during request > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442) > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429) > at > org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363) > at > org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.util.concurrent.ExecutionException: > java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438) > ... 14 common frames omitted > Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 > at java.util.ArrayList.elementData(ArrayList.java:424) > at java.util.ArrayList.get(ArrayList.java:437) > at > org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289) > at > org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227) > at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413) > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390) > at > org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) > ... 1 common frames omitted > INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326 > CassandraDaemon.java:576 - Cassandra shutting down... {code} > The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage > thread. > h2. Root Cause > From my preliminary analysis, this bug is related to thread interleaving > between > # Thread trying to shutdown the Cassandra instance. > # MigrationStage thread trying to recycle segments. > if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in > AbstractCommitLogSegmentManager will be empty. > {code:java} > // AbstractCommitLogSegmentManager.java > /** Active segments, containing unflushed data. The tail of this queue is the > one we allocate writes to */private final > ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new > ConcurrentLinkedQueue<>(); {code} > At this time, if the migration thread tries to invoke the forceRecycleAll > function, it throws _ArrayIndexOutOfBoundsException_ if there's no active > segment. > {code:java} > // AbstractCommitLogSegmentManager.java > void forceRecycleAll(Iterable<UUID> droppedCfs) > { > List<CommitLogSegment> segmentsToRecycle = new > ArrayList<>(activeSegments); > CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - > 1); > advanceAllocatingFrom(last); // wait for the commit log modifications > last.waitForModifications(); {code} > It seems that this ERROR won't impact the data integrity. But I am not sure > whether it will cause some problems in the segment recycling process. > I think this should be handled considering its ERROR level in system.log. It > might confuse users whether their data is still in an integrity state and > whether they can perform the migration from 3.x to 4.x. > I have attached my full system.log and cassandra.yaml file. > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org