[ https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770869#comment-16770869 ]
Stefan Podkowinski commented on CASSANDRA-15027:
------------------------------------------------

* [ [trunk|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-15027] ][ [circleci|https://circleci.com/workflow-run/2b027f87-cf45-48ee-8eae-45a563701bc6] ]

> Handle IR prepare phase failures less race prone by waiting for all results
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15027
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Compaction
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>             Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a
> {{PrepareConsistentRequest}} message to all participants, which may also
> include the coordinator itself. Participants run anti-compactions upon
> receiving such a message and report the result of the operation back to the
> coordinator.
> Once we receive a failure response from any of the participants, we fail fast
> in {{CoordinatorSession.handlePrepareResponse()}}, which in turn completes
> the {{prepareFuture}} that {{RepairRunnable}} is blocking on. The repair
> command then terminates with an error status, as expected.
> The issue is that when the node is both coordinator and participant, we may
> end up with a local session and submitted anti-compactions that are executed
> without any coordination with the coordinator session (on the same node).
> As a result, running repair commands right after one another may cause
> overlapping execution of anti-compactions, which makes the following
> (misleading) message show up in the logs and causes the repair to fail again:
> "Prepare phase for incremental repair session %s has failed because it
> encountered intersecting sstables belonging to another incremental repair
> session (%s). This is caused by starting an incremental repair session before
> a previous one has completed. Check nodetool repair_admin for hung sessions
> and fix them."
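
The "wait for all results" idea from the issue title can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from the linked branch: the class {{PrepareResultTracker}}, the use of plain {{String}} participant names, and the {{CompletableFuture<Boolean>}} result type are all hypothetical. Only the names {{handlePrepareResponse}} and {{prepareFuture}} are borrowed from the description above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, NOT Cassandra's CoordinatorSession.
// It completes the future only after every participant has answered,
// instead of failing fast on the first failure response.
final class PrepareResultTracker {
    private final Set<String> pending;      // participants that have not answered yet
    private boolean anyFailed = false;      // remembers whether any participant failed
    private final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();

    PrepareResultTracker(Set<String> participants) {
        this.pending = new HashSet<>(participants);
        if (pending.isEmpty()) {
            // Nothing to wait for: trivially successful.
            prepareFuture.complete(true);
        }
    }

    // Record one participant's result; duplicates and unknown senders are ignored.
    synchronized void handlePrepareResponse(String participant, boolean success) {
        if (!pending.remove(participant)) {
            return;
        }
        if (!success) {
            anyFailed = true;
        }
        // Complete only once all results are in, so no participant
        // (including a local one) is still running anti-compaction.
        if (pending.isEmpty()) {
            prepareFuture.complete(!anyFailed);
        }
    }

    // Completes with true if all participants succeeded, false otherwise.
    CompletableFuture<Boolean> prepareFuture() {
        return prepareFuture;
    }
}
```

With this shape, a caller blocking on {{prepareFuture()}} would only see the failure after the last participant has reported, so a follow-up repair started right after the error is less likely to overlap with anti-compactions that are still running. A real implementation would presumably also need a timeout for participants that never respond, which this sketch leaves out.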