[ https://issues.apache.org/jira/browse/CASSANDRA-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17520027#comment-17520027 ]
Stefan Miklosovic commented on CASSANDRA-17524:
-----------------------------------------------

+1 [~jonmeredith], nice catch!

I am doing stuff related to draining and executors in CASSANDRA-17493 and I am looking for a reviewer. Would you have time to take a look at that, please?

> Schema mutations may not be completed on drain
> ----------------------------------------------
>
>                 Key: CASSANDRA-17524
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17524
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Startup and Shutdown
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.1, 3.0.x, 3.11.x, 4.0.x
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The drain logic (invoked explicitly with nodetool or from the JVM
> shutdown hook) closes down executor stages that can create mutations (counter,
> view, mutation) before closing down the commitlog. The gossip
> stage also commits schema mutations, and should be treated the same way.
> The messaging service is shut down as part of drain, so there should be
> no new Gossip messages received; however, any messages still queued
> in the executor could still run after the commitlog allocator is shut down as
> part of drain, causing the gossip stage thread to hang indefinitely waiting
> for a new segment that never arrives.
> Here is an example from an in-JVM dtest, showing an update to the peers table
> as it shuts down.
> {code:java}
> park:-1, Unsafe (jdk.internal.misc)
> park:323, LockSupport (java.util.concurrent.locks)
> await:289, WaitQueue$Standard$AbstractSignal (org.apache.cassandra.utils.concurrent)
> await:282, WaitQueue$Standard$AbstractSignal (org.apache.cassandra.utils.concurrent)
> awaitUninterruptibly:186, Awaitable$Defaults (org.apache.cassandra.utils.concurrent)
> awaitUninterruptibly:259, Awaitable$AbstractAwaitable (org.apache.cassandra.utils.concurrent)
> awaitAvailableSegment:283, AbstractCommitLogSegmentManager (org.apache.cassandra.db.commitlog)
> advanceAllocatingFrom:257, AbstractCommitLogSegmentManager (org.apache.cassandra.db.commitlog)
> allocate:55, CommitLogSegmentManagerStandard (org.apache.cassandra.db.commitlog)
> add:282, CommitLog (org.apache.cassandra.db.commitlog)
> beginWrite:50, CassandraKeyspaceWriteHandler (org.apache.cassandra.db)
> applyInternal:622, Keyspace (org.apache.cassandra.db)
> apply:506, Keyspace (org.apache.cassandra.db)
> apply:215, Mutation (org.apache.cassandra.db)
> apply:220, Mutation (org.apache.cassandra.db)
> apply:229, Mutation (org.apache.cassandra.db)
> executeInternalWithoutCondition:644, ModificationStatement (org.apache.cassandra.cql3.statements)
> executeLocally:635, ModificationStatement (org.apache.cassandra.cql3.statements)
> executeInternal:431, QueryProcessor (org.apache.cassandra.cql3)
> updateTokens:804, SystemKeyspace (org.apache.cassandra.db)
> updateTokenMetadata:2941, StorageService (org.apache.cassandra.service)
> handleStateNormal:3057, StorageService (org.apache.cassandra.service)
> onChange:2498, StorageService (org.apache.cassandra.service)
> markAsShutdown:607, Gossiper (org.apache.cassandra.gms)
> doVerb:39, GossipShutdownVerbHandler (org.apache.cassandra.gms)
> lambda$new$0:78, InboundSink (org.apache.cassandra.net)
> accept:-1, 581110313 (org.apache.cassandra.net.InboundSink$$Lambda$2638)
> accept:64, InboundSink$Filtered (org.apache.cassandra.net)
> accept:50, InboundSink$Filtered (org.apache.cassandra.net)
> accept:97, InboundSink (org.apache.cassandra.net)
> accept:45, InboundSink (org.apache.cassandra.net)
> run:433, InboundMessageHandler$ProcessMessage (org.apache.cassandra.net)
> run:124, ExecutionFailure$1 (org.apache.cassandra.concurrent)
> runWorker:1128, ThreadPoolExecutor (java.util.concurrent)
> run:628, ThreadPoolExecutor$Worker (java.util.concurrent)
> run:30, FastThreadLocalRunnable (io.netty.util.concurrent)
> run:829, Thread (java.lang)
> {code}
> This causes an exception during shutdown for the in-JVM dtest, as it is
> unable to shut down {{Stage.GOSSIP}}, but does not prevent regular
> shutdown for Cassandra, as the executors are not stopped. The schema update
> would be lost, despite requesting a graceful shutdown.
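
The ordering problem described in the issue can be illustrated with a small, self-contained sketch. This is not Cassandra's drain code: `DrainOrderSketch`, `ToyCommitLog`, and `drain` are hypothetical names invented for illustration, and the toy log only counts writes that arrive after it is closed, where the real allocator would block waiting for a segment. The sketch assumes a single-threaded stage holding queued tasks, and compares draining the stage before closing the log against closing the log first.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainOrderSketch {
    /** Hypothetical stand-in for a commit log: accepts writes only while open. */
    static final class ToyCommitLog {
        private final AtomicBoolean open = new AtomicBoolean(true);
        final AtomicInteger written = new AtomicInteger();
        final AtomicInteger rejected = new AtomicInteger();

        void add() {
            // A real allocator might block here forever; the toy just counts the miss.
            if (open.get()) written.incrementAndGet();
            else rejected.incrementAndGet();
        }

        void shutdown() { open.set(false); }
    }

    /**
     * Queues `queued` writes on a single-threaded stage, then shuts down in one of
     * two orders. Returns {written, rejected}.
     */
    public static int[] drain(boolean stageFirst, int queued) throws InterruptedException {
        ToyCommitLog log = new ToyCommitLog();
        ExecutorService stage = Executors.newSingleThreadExecutor();
        CountDownLatch gate = new CountDownLatch(1);

        // Block the stage's only thread so the writes below stay queued.
        stage.execute(() -> {
            try { gate.await(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        for (int i = 0; i < queued; i++) stage.execute(log::add);

        if (stageFirst) {
            gate.countDown();          // let queued writes run
            awaitStage(stage);         // stage fully drained ...
            log.shutdown();            // ... before the log closes
        } else {
            log.shutdown();            // log closes while writes are still queued
            gate.countDown();
            awaitStage(stage);
        }
        return new int[] { log.written.get(), log.rejected.get() };
    }

    private static void awaitStage(ExecutorService stage) throws InterruptedException {
        stage.shutdown();              // stop accepting tasks; queued tasks still run
        if (!stage.awaitTermination(10, TimeUnit.SECONDS))
            throw new IllegalStateException("stage did not terminate");
    }

    public static void main(String[] args) throws InterruptedException {
        int[] a = drain(true, 3);
        int[] b = drain(false, 3);
        System.out.println("stage first:      written=" + a[0] + " rejected=" + a[1]);
        System.out.println("commit log first: written=" + b[0] + " rejected=" + b[1]);
    }
}
```

With the stage drained first, all three queued writes reach the toy log; with the log closed first, all three arrive too late, which is the toy analogue of the gossip stage thread waiting for a segment that never arrives.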