[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart

Stefan Miklosovic (Jira) Mon, 21 Aug 2023 02:10:04 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756770#comment-17756770
 ]


Stefan Miklosovic commented on CASSANDRA-16290:
-----------------------------------------------

Seems to be correct on face value but I think you also need to write a proper 
test for this, probably in-jvm dtest.

> Consistency can be violated when bootstrap or decommission is resumed after 
> node restart
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16290
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Paulo Motta
>            Assignee: Bartlomiej
>            Priority: Normal
>              Labels: lhf
>             Fix For: 3.0.x, 3.11.x, 4.0.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since CASSANDRA-12008, successfully transferred ranges during decommission 
> are saved on the {{system.transferred_ranges}} table. This allow skipping 
> ranges already transferred when a failed decommission is retried with 
> {{nodetool decommission}}.
> If instead of resuming the decommission, an operator restarts the node, waits 
> N minutes and then performs a new decommission, the previously transferred 
> ranges will be skipped during streaming, and any writes received by the 
> decommissioned node during these N minutes will not be replicated to the new 
> range owner, what violates consistency.
> This issue is analogous to the issue mentioned [on this 
> comment|https://issues.apache.org/jira/browse/CASSANDRA-8838?focusedCommentId=16900234&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16900234]
>  for resumable bootstrap (CASSANDRA-8838).
> In order to prevent consistency violations we should clear the 
> {{system.transferred_ranges}} state during node restart, and maybe a system 
> property to disable it. While we're at this, we should change the default of 
> {{-Dcassandra.reset_bootstrap_progress}} to {{true}} to clear the 
> {{system.available_ranges}} state by default when a bootstrapping node is 
> restarted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart

Reply via email to