[jira] [Comment Edited] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

Stefan Podkowinski (JIRA) Fri, 22 Feb 2019 08:04:26 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775178#comment-16775178
 ]


Stefan Podkowinski edited comment on CASSANDRA-15027 at 2/22/19 4:03 PM:
-------------------------------------------------------------------------

Your updates look like a valuable improvement over the initial patch. I'm +1 on 
the changes in general, but I also fixed some additional minor issues and 
added new tests:

* [CASSANDRA-15027|https://github.com/spodkowinski/cassandra/commits/CASSANDRA-15027]
* [https://circleci.com/workflow-run/2b444c33-a54c-46b5-9923-bcded8bcf465]

Please see the comments on each commit in the branch above for details.

Also happy to discuss any of the changes (most likely the last commit) in 
another jira, if you feel it's out of scope for this ticket.
 


was (Author: spo...@gmail.com):
Your updates look like a valuable improvement over the initial patch. I'm +1 on 
the changes in general, but I also fixed some additional minor issues and 
added new tests:

* [CASSANDRA-15027|https://github.com/spodkowinski/cassandra/commits/CASSANDRA-15027]
* [circleci|https://circleci.com/gh/spodkowinski/cassandra/653]

Please see the comments on each commit in the branch above for details.

Also happy to discuss any of the changes (most likely the last commit) in 
another jira, if you feel it's out of scope for this ticket.
 

> Handle IR prepare phase failures less race prone by waiting for all results
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15027
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Compaction
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>             Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a 
> {{PrepareConsistentRequest}} message to all participants, which may also 
> include the coordinator itself. Participants run anti-compactions upon 
> receiving such a message and report the result of the operation back to the 
> coordinator.
> Once we receive a failure response from any of the participants, we fail fast 
> in {{CoordinatorSession.handlePrepareResponse()}}, which in turn completes 
> the {{prepareFuture}} that {{RepairRunnable}} is blocking on. The repair 
> command then terminates with an error status, as expected.
> The issue is that when the node is both coordinator and participant, we may 
> end up with a local session and submitted anti-compactions that will be 
> executed without any coordination with the coordinator session (on the same 
> node). As a result, running repair commands right after one another may 
> cause overlapping execution of anti-compactions, which causes the following 
> (misleading) message to show up in the logs and the repair to fail again:
>  "Prepare phase for incremental repair session %s has failed because it 
> encountered intersecting sstables belonging to another incremental repair 
> session (%s). This is caused by starting an incremental repair session before 
> a previous one has completed. Check nodetool repair_admin for hung sessions 
> and fix them."
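
The idea named in the ticket title, waiting for all prepare results instead of 
failing fast on the first failure, can be sketched roughly as follows. This is 
a minimal illustration and not the actual patch: {{PrepareTracker}}, its 
constructor, and the use of plain strings as participant ids are hypothetical 
names chosen for the example, and only the method name 
{{handlePrepareResponse}} and the {{prepareFuture}} field echo the description 
above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only; this is NOT Cassandra's CoordinatorSession.
// It completes the prepare future once every participant has reported,
// rather than on the first failure response.
class PrepareTracker {
    private final Set<String> participants;
    private final Map<String, Boolean> results = new HashMap<>();
    private final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();

    PrepareTracker(Set<String> participants) {
        this.participants = new HashSet<>(participants);
    }

    // Record one participant's prepare result. A failure is remembered,
    // but the future stays pending until all results have arrived.
    synchronized void handlePrepareResponse(String participant, boolean success) {
        if (!participants.contains(participant) || prepareFuture.isDone()) {
            return; // unknown sender or late response: ignore
        }
        results.put(participant, success);
        if (results.size() == participants.size()) {
            // Succeeds only if no participant reported a failure.
            prepareFuture.complete(!results.containsValue(false));
        }
    }

    CompletableFuture<Boolean> prepareFuture() {
        return prepareFuture;
    }
}
```

With this shape, a caller blocking on {{prepareFuture()}} is only released 
after the local participant has also reported, so a follow-up repair command 
would not start while earlier anti-compactions may still be pending. A real 
implementation would additionally need to handle participants that never 
respond (for example via a timeout), which this sketch leaves out.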



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

