[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588303#comment-13588303
 ] 

Sergio Bossa commented on CASSANDRA-5062:
-----------------------------------------

[~jbellis]

{quote}This is not correct for Paxos. (Not sufficiently familiar with ZAB to 
comment there){quote}

Right, I was talking about Zab, which does exactly that to improve liveness 
and performance.

{quote}What does this 2PC-that-avoids-lost-acks look like?{quote}

Well, given my lack of familiarity with Cassandra internals, I may be missing 
something here, so let's be clear about the lost-ack problem: my understanding 
of lost-ack is about what happens when the coordinator node sends a QUORUM 
request and fails before getting the ack back, causing uncertainty about the 
request status. So please correct me if I'm wrong here.
But stated this way, this problem can be overcome with Zab-like 2PC: once the 
coordinator gets the acks from the prepare phase, it can commit without having 
to wait for all acks, because only committed values with the highest "commit 
id" will be (QUORUM) read. Then:
1) If the coordinator fails during the prepare phase (lost ack), nothing will 
be committed, hence the previously committed value will be read, and if the 
uncommitted value gets hinted/repaired, it will remain just a tentative value.
2) If the coordinator fails after sending commits, the coordinator with the 
highest commit id will take over and "realign" followers.
3) If a partition happens, the coordinator with the minority of followers will 
refuse to operate CAS (Paxos would behave exactly the same here).
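
As a rough illustration of cases 1 and 3 above, here is a minimal in-memory Python sketch of the prepare/commit flow. Every name in it (Replica, write, quorum_read, crash_before_commit) is hypothetical and made up for this example; none of it is a Cassandra or ZooKeeper API, and the takeover in case 2 is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    """Hypothetical replica: one committed value plus one tentative (prepared) value."""
    committed: tuple = (0, None)       # (commit_id, value)
    tentative: Optional[tuple] = None  # prepared but not yet committed
    reachable: bool = True

    def prepare(self, commit_id, value):
        # Ack only proposals newer than what is already committed.
        if not self.reachable or commit_id <= self.committed[0]:
            return False
        self.tentative = (commit_id, value)
        return True

    def commit(self, commit_id):
        # Promote the tentative value only if it matches the commit id.
        if self.reachable and self.tentative and self.tentative[0] == commit_id:
            self.committed, self.tentative = self.tentative, None


def quorum(replicas):
    return len(replicas) // 2 + 1


def write(replicas, commit_id, value, crash_before_commit=False):
    """Assumed coordinator logic: prepare on a quorum, then fire commits."""
    acks = [r for r in replicas if r.prepare(commit_id, value)]
    if len(acks) < quorum(replicas):
        return False  # case 3: the minority side refuses to operate
    if crash_before_commit:
        return False  # case 1: the value stays tentative on every replica
    for r in acks:
        r.commit(commit_id)  # commit acks are not awaited in this sketch
    return True


def quorum_read(replicas):
    """Return the committed value with the highest commit id among a quorum."""
    live = [r for r in replicas if r.reachable]
    if len(live) < quorum(replicas):
        raise RuntimeError("no quorum reachable")
    return max(live, key=lambda r: r.committed[0]).committed[1]
```

For example, with three replicas, a successful write of "alice" under commit id 1 followed by write(replicas, 2, "bob", crash_before_commit=True) leaves quorum_read returning "alice": the second value was prepared but never committed, so readers never observe it.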

Does it make sense to you?

Obviously I may be missing some corner case, and above all, I'm not sure about 
how comfortably this could be implemented in Cassandra (lack of knowledge 
again), so take my comments just as food for thought.
                
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
