David Capwell created CASSANDRA-19769:
-----------------------------------------

             Summary: CEP-15: (Accord) sequence EpochReady.coordinating to allow syncComplete to be learned from newer epochs
                 Key: CASSANDRA-19769
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19769
             Project: Cassandra
          Issue Type: Bug
          Components: Accord
            Reporter: David Capwell


When a node is bootstrapping or performing a host replacement, it sees several 
epochs before it actually joins the ring. In Accord, however, we only 
synchronize epoch knowledge to nodes that have already joined, so the epochs 
seen on the new nodes are never synchronized. This is a problem because it 
forces these nodes to include far more epochs than required (they don’t know 
whether their peers know the epoch), and they may include stale epochs for 
which quorum can no longer be reached (for example, two host replacements for 
the same range would leave that historic range unable to reach quorum).

By sequencing EpochReady.coordinating, we get the property that sync is marked 
complete for epoch=N only if it has also been marked complete for epoch=N-1. 
With this, peers can recover the sync state of past epochs when they see a 
newer epoch.
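
A minimal sketch of the sequencing idea, not the Accord implementation: it 
assumes each epoch's coordinating step can be modelled as a 
CompletableFuture<Void> and that epochs are registered in increasing order. 
The names EpochSyncSequencer, register, markSyncComplete and 
syncCompleteOrder are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper: chains each epoch's "coordinating" future behind the
// previous epoch's, so sync complete for epoch N is only recorded after it
// has been recorded for epoch N-1.
public class EpochSyncSequencer {
    // Sequenced future of the most recently registered epoch. It starts out
    // completed so the first epoch has nothing to wait for.
    private CompletableFuture<Void> previous = CompletableFuture.completedFuture(null);

    // Order in which epochs were marked sync complete (for inspection only).
    private final List<Long> syncCompleteOrder = new ArrayList<>();

    // Assumption: callers register epochs in increasing order.
    public synchronized CompletableFuture<Void> register(long epoch, CompletableFuture<Void> coordinating) {
        CompletableFuture<Void> sequenced = previous
            // Wait for the previous epoch first, then for this epoch's own future.
            .thenCompose(ignored -> coordinating)
            // Only now is it safe to mark this epoch as sync complete.
            .thenRun(() -> markSyncComplete(epoch));
        previous = sequenced;
        return sequenced;
    }

    private synchronized void markSyncComplete(long epoch) {
        syncCompleteOrder.add(epoch);
    }

    public synchronized List<Long> syncCompleteOrder() {
        return new ArrayList<>(syncCompleteOrder);
    }
}
```

In this sketch, if epoch 2's own future completes before epoch 1's, epoch 2 is 
still not marked sync complete until epoch 1 has been. A failure in an earlier 
epoch also propagates to every later epoch in the chain, which may or may not 
be the desired behaviour in the real system.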



