[ https://issues.apache.org/jira/browse/CASSANDRA-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacek Lewandowski updated CASSANDRA-19364:
------------------------------------------
    Description: 
This possible issue was discovered while inspecting flaky tests from 
CASSANDRA-18824. Pending ranges calculation is executed asynchronously when a 
node is decommissioned. If data is inserted during decommissioning and the 
pending ranges calculation is delayed for some reason (which is possible, since 
it is not synchronous), we may end up with partial data loss. It could also 
simply be a flawed test, so I perceive this ticket more as a memo for further 
investigation or discussion.

Note that this has obviously been fixed by TCM.
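To make the suspected window concrete, here is a minimal, self-contained Java sketch (hypothetical class and node names; this is not Cassandra's real write path): a write computes its targets from the natural owners plus the pending owners as seen at write time, so if the asynchronous pending-ranges update has not run yet, the write never reaches the node that is taking over the range.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical sketch of the suspected race; names do not match Cassandra internals.
public class PendingRangeRaceSketch
{
    // Current owners of the written range (node1 is being decommissioned).
    static final Set<String> naturalReplicas = new CopyOnWriteArraySet<>();
    // Nodes gaining the range; filled in by an asynchronous calculation,
    // analogous to Cassandra's pending ranges calculation (PRC).
    static final Set<String> pendingReplicas = new CopyOnWriteArraySet<>();

    // A write targets natural + pending replicas, as seen *at write time*.
    static Set<String> writeTargets()
    {
        Set<String> targets = new HashSet<>(naturalReplicas);
        targets.addAll(pendingReplicas);
        return targets;
    }

    public static void main(String[] args)
    {
        naturalReplicas.add("node1");
        // A write arriving before the async PRC completes does not target node2,
        // so once node1 finishes decommissioning, that write is lost.
        System.out.println(writeTargets().contains("node2")); // false
        // PRC eventually runs and marks node2 as a pending owner:
        pendingReplicas.add("node2");
        // Writes from now on also reach node2 and survive the decommission.
        System.out.println(writeTargets().contains("node2")); // true
    }
}
```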

The test in question was:

{code:java}
        try (Cluster cluster = init(builder().withNodes(2)
                                             .withTokenSupplier(evenlyDistributedTokens(2))
                                             .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0"))
                                             .withConfig(config -> config.with(NETWORK, GOSSIP))
                                             .start(), 1))
        {
            IInvokableInstance nodeToDecommission = cluster.get(1);
            IInvokableInstance nodeToRemainInCluster = cluster.get(2);

            // Start decommission on nodeToDecommission
            cluster.forEach(statusToDecommission(nodeToDecommission));
            logger.info("Decommissioning node {}", nodeToDecommission.broadcastAddress());

            // Add data to the cluster while the node is decommissioning
            int numRows = 100;
            cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))");
            insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // <-- HERE: when PRC is delayed, only ~50% of the inserted rows make it

            // Check data before cleanup on nodeToRemainInCluster
            assertEquals(numRows, nodeToRemainInCluster.executeInternal("SELECT * FROM " + KEYSPACE + ".tbl").length);
        }
{code}
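One way to separate "PRC merely delayed" from genuine loss would be to block before `insertData(...)` until pending ranges have been calculated; if the assertion then passes reliably, the missing rows are writes that fell into the asynchronous window. A generic polling helper for that kind of wait could look like the sketch below (hypothetical; not part of the in-jvm dtest API, and the condition you would poll, e.g. a `pendingRangesCalculated()` hook, is an assumption):

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Generic polling helper (hypothetical; not part of the in-jvm dtest API).
public final class AwaitUtil
{
    // Poll the condition until it holds or the timeout expires; returns the final state.
    public static boolean await(BooleanSupplier condition, long timeoutMillis) throws InterruptedException
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline)
        {
            if (condition.getAsBoolean())
                return true;
            Thread.sleep(10); // brief back-off between polls
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException
    {
        long start = System.currentTimeMillis();
        // Example: a condition that becomes true after ~50 ms is awaited successfully.
        boolean ok = await(() -> System.currentTimeMillis() - start > 50, 1_000);
        System.out.println(ok); // true
    }
}
```

Note this would only make the test deterministic; it would not recover writes already lost to the window, which is the concern of this ticket.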


  was:
This possible issue was discovered while inspecting flaky tests from 
CASSANDRA-18824. Pending ranges calculation is executed asynchronously when a 
node is decommissioned. If data is inserted during decommissioning and the 
pending ranges calculation is delayed for some reason (which is possible, since 
it is not synchronous), we may end up with partial data loss. It could also 
simply be a flawed test, so I perceive this ticket more as a memo for further 
investigation or discussion.

Note that this has obviously been fixed by TCM.


> Data loss during decommission possible due to a delayed and unsynced pending 
> ranges calculation
> -----------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19364
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19364
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Jacek Lewandowski
>            Priority: Normal
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
