[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592743#comment-13592743
 ] 

Cristian Opris commented on CASSANDRA-5062:
-------------------------------------------

Say you have this: 

The proposer has committed R-1 and starts round R with proposal timestamp Tn.

The acceptor recovers with committed round R-n < R-1, and its paxos state log
holds a value A accepted at round R-n+1 < R-1 with timestamp Tm.

When the acceptor receives the proposal and doesn't check R, then if Tm > Tn
(clock mismatch) Paxos requires it to send back its old accepted value, and the
proposer has to use that value to commit. It ends up committing an old value.
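
As a rough illustration of the scenario above, here is a small Python sketch.
It is not Cassandra code: the Accepted and Acceptor classes, on_prepare, the
check_round flag and the concrete numbers (rounds 5, 6, 10 and timestamps 100,
150) are all made up for the example, and the prepare handling is simplified
to follow the description in this comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Accepted:
    round_no: int   # round the value was accepted in
    timestamp: int  # proposal timestamp recorded in the paxos state log
    value: str


@dataclass
class Acceptor:
    committed_round: int
    accepted: Optional[Accepted]

    def on_prepare(self, round_no: int, timestamp: int,
                   check_round: bool) -> Optional[Accepted]:
        """Return an older accepted value the proposer must adopt, if any."""
        if self.accepted is None:
            return None
        if check_round and self.accepted.round_no != round_no:
            # The logged value belongs to a different round, so it must not
            # leak into this one.
            return None
        if self.accepted.timestamp > timestamp:
            # Timestamps alone say the logged value wins (the Tm > Tn case).
            return self.accepted
        return None


def value_to_commit(own_value: str, reported: Optional[Accepted]) -> str:
    # A proposer that is handed a previously accepted value has to use it.
    return reported.value if reported is not None else own_value


# Proposer committed round 9 and starts round 10 with Tn = 100.
# Acceptor recovered with committed round 5 and value "A" accepted in
# round 6 at Tm = 150 (its clock ran ahead).
stale = Acceptor(committed_round=5,
                 accepted=Accepted(round_no=6, timestamp=150, value="A"))

without_check = value_to_commit("B", stale.on_prepare(10, 100, check_round=False))
with_check = value_to_commit("B", stale.on_prepare(10, 100, check_round=True))
```

Without the round check the proposer commits the stale "A". With the check the
stale entry is ignored and the proposer commits its own "B".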

It's an edge case but not impossible. The Paxos guarantees hold within a single
round, but not across rounds.

This makes sense because a Paxos round just means agreeing on a value which,
once accepted by a quorum, can never change.

That is why you can't have an out-of-date replica participate in a round.

The idea is to move from a quorum that committed (learned) R, to a quorum that
accepts R+1, to a quorum that commits R+1, and so on. Note that the quorums
don't need to be made up of the same replicas.
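
One reason the quorums can differ, assuming "quorum" means a simple majority
(the comment does not spell this out), is that any two majorities share at
least one replica. A quick check in Python, with made-up replica names:

```python
from itertools import combinations

replicas = {"r1", "r2", "r3", "r4", "r5"}  # hypothetical replica names
majority = len(replicas) // 2 + 1          # 3 out of 5

# Every possible majority quorum.
quorums = [set(q) for q in combinations(sorted(replicas), majority)]

# True if every pair of quorums has at least one replica in common.
all_overlap = all(a & b for a in quorums for b in quorums)
```

So the quorum that accepts R+1 always contains at least one replica from the
quorum that committed R, even if the rest of its members are different.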

To ensure this you maintain the invariant that *you can't propose or accept R+1
locally if you haven't committed R*.

So a replica can die and recover, but to recover and participate in Paxos it
needs to learn the latest value.
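
A minimal Python sketch of the invariant and the recovery rule, under the
assumption that rounds are plain integers. The Replica class and its
may_participate, accept and learn methods are hypothetical names for this
example, not anything from the Cassandra code base.

```python
from typing import Optional, Tuple


class Replica:
    """Hypothetical replica tracking only what the invariant needs."""

    def __init__(self) -> None:
        self.committed_round = 0  # highest round committed (learned) locally
        self.committed_value: Optional[str] = None
        self.accepted: Optional[Tuple[int, str]] = None

    def may_participate(self, round_no: int) -> bool:
        # Invariant: R+1 may be proposed or accepted only once R is committed.
        return round_no == self.committed_round + 1

    def accept(self, round_no: int, value: str) -> bool:
        if not self.may_participate(round_no):
            return False
        self.accepted = (round_no, value)
        return True

    def learn(self, round_no: int, value: str) -> None:
        # Catch up to a committed round reported by an up-to-date peer.
        if round_no > self.committed_round:
            self.committed_round = round_no
            self.committed_value = value
            self.accepted = None  # anything logged for older rounds is obsolete


recovered = Replica()
recovered.committed_round = 5         # state found on disk after restart
refused = recovered.accept(10, "B")   # cluster is on round 10: refused
recovered.learn(9, "latest")          # learn the latest committed value
accepted = recovered.accept(10, "B")  # now allowed
```

The recovered replica is turned away from round 10 until it has learned round
9, after which it can accept again.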

This also gives you consistent reads (at the possible cost of an extra read
Paxos proposal to ensure that the last Paxos round is committed, if that was
left ambiguous).


                
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>         Attachments: half-baked commit 1.jpg, half-baked commit 2.jpg, 
> half-baked commit 3.jpg
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.
