[jira] [Commented] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

Stefan Podkowinski (JIRA) Mon, 18 Feb 2019 00:43:20 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770869#comment-16770869
 ]


Stefan Podkowinski commented on CASSANDRA-15027:
------------------------------------------------

* [ [trunk|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-15027] ][ 
[circleci|https://circleci.com/workflow-run/2b027f87-cf45-48ee-8eae-45a563701bc6]
 ]

> Handle IR prepare phase failures less race prone by waiting for all results
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15027
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Compaction
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>             Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a 
> {{PrepareConsistentRequest}} message to all participants, which may also 
> include the coordinator itself. Participants will run anti-compactions upon 
> receiving such a message and report the result of the operation back to the 
> coordinator.
> Once we receive a failure response from any of the participants, we fail fast 
> in {{CoordinatorSession.handlePrepareResponse()}}, which in turn 
> completes the {{prepareFuture}} that {{RepairRunnable}} is blocking on. The 
> repair command then terminates with an error status, as expected.
> The issue is that when the node is both coordinator and participant, 
> we may end up with a local session and submitted anti-compactions, which will 
> be executed without any coordination with the coordinator session (on the same 
> node). As a result, running repair commands one right after another may cause 
> overlapping execution of anti-compactions, which makes the following 
> (misleading) message show up in the logs and causes the repair to fail 
> again:
>  "Prepare phase for incremental repair session %s has failed because it 
> encountered intersecting sstables belonging to another incremental repair 
> session (%s). This is caused by starting an incremental repair session before a 
> previous one has completed. Check nodetool repair_admin for hung sessions and 
> fix them."
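
The "wait for all results" idea from the issue title could be sketched roughly as follows. This is a minimal illustration only, not the patch on the linked branch: {{PrepareTracker}} and its string participant ids are hypothetical names, and only {{handlePrepareResponse}} and {{prepareFuture}} echo identifiers from the description. The future completes once every participant has reported, so a single early failure no longer releases the caller while other anti-compactions may still be running.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the coordinator-side bookkeeping; NOT Cassandra's
// actual CoordinatorSession. Participants are plain strings for brevity.
final class PrepareTracker {
    private final Set<String> participants;
    private final Map<String, Boolean> responses = new HashMap<>();
    private final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();

    PrepareTracker(Set<String> participants) {
        this.participants = new HashSet<>(participants);
    }

    // Record one participant's result. The future is completed only after
    // all participants have answered, instead of on the first failure.
    synchronized void handlePrepareResponse(String participant, boolean success) {
        if (!participants.contains(participant) || prepareFuture.isDone()) {
            return; // unknown sender or late duplicate: ignore
        }
        responses.put(participant, success);
        if (responses.size() == participants.size()) {
            // Overall success requires that no participant reported failure.
            prepareFuture.complete(!responses.containsValue(false));
        }
    }

    CompletableFuture<Boolean> prepareFuture() {
        return prepareFuture;
    }
}
```

With three participants, a failure from the second one leaves the future pending until the third has also answered; only then does it complete with {{false}}. A real implementation would additionally need a timeout for participants that never respond, which this sketch leaves out.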



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
