[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626665#comment-13626665
 ] 

Jonathan Ellis commented on CASSANDRA-5062:
-------------------------------------------

bq. in other words, the current algorithm does not do Paxos at the row level, 
but rather paxos at the level of all rows whose key's hashcode modulo 1024 is 
equal, and 2 proposers on 2 different row keys may compete with each other.

Agreed, but what are you proposing as an alternative?  Allocating a lock per 
row would be madness, and attempting to write it without locks looks very 
difficult.

(I will note that the contention is probably not that bad, since the two rows 
would have to hash to the same lock across a majority of replicas, which in a 
vnode world is unlikely.)
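
To make the collision concrete, here is a minimal sketch of lock striping, assuming 1024 stripes chosen by the key's hashCode as described in the quote. The class and method names (StripedLocks, stripeFor, lockFor) are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of lock striping; not the code from the patch.
class StripedLocks {
    // 1024 stripes, matching the modulus mentioned in the quoted comment.
    static final int STRIPES = 1024;
    private final Lock[] locks = new Lock[STRIPES];

    StripedLocks() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    // Map a row key to a stripe; floorMod keeps the index non-negative
    // even when hashCode() is negative.
    static int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), STRIPES);
    }

    Lock lockFor(Object key) {
        return locks[stripeFor(key)];
    }

    public static void main(String[] args) {
        StripedLocks striped = new StripedLocks();
        // Integer.hashCode() is the value itself, so 1 and 1025 share a stripe
        // and two proposers on these distinct keys would contend.
        System.out.println(striped.lockFor(1) == striped.lockFor(1025)); // true
        System.out.println(striped.lockFor(1) == striped.lockFor(2));    // false
    }
}
```

With a fixed array of locks, memory stays bounded no matter how many rows exist, at the cost of occasional false contention between unrelated keys.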

bq. this in turn means that we'll time out as soon as any node that we thought 
was alive happens to be dead, even if we have a QUORUM of responses

Not so -- we will wait for it, but when it does not reply we will be able to 
continue as long as the nodes that did reply constitute a quorum.  (I.e., we 
check responseCount vs requiredParticipants, not vs endpoints.size().)
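
A rough sketch of that behavior, assuming a latch sized to the number of endpoints and a separate response counter. Only the names responseCount and requiredParticipants come from the comment above; QuorumWait, quorumFor, awaitQuorum and the majority formula are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the code from the patch.
class QuorumWait {
    // Assumed definition of a quorum: a strict majority of the replicas.
    static int quorumFor(int replicas) {
        return replicas / 2 + 1;
    }

    // Wait up to timeoutMillis for every endpoint to answer, then decide
    // based on the replies that actually arrived, not on endpoints.size().
    static boolean awaitQuorum(CountDownLatch latch,
                               AtomicInteger responseCount,
                               int requiredParticipants,
                               long timeoutMillis) throws InterruptedException {
        // The return value is ignored on purpose: a timeout caused by a dead
        // node is acceptable as long as enough nodes replied.
        latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return responseCount.get() >= requiredParticipants;
    }

    public static void main(String[] args) throws InterruptedException {
        int endpoints = 3;
        CountDownLatch latch = new CountDownLatch(endpoints);
        AtomicInteger responseCount = new AtomicInteger();

        // Two of the three endpoints reply; the third is dead and never does.
        for (int i = 0; i < 2; i++) {
            responseCount.incrementAndGet();
            latch.countDown();
        }

        boolean ok = awaitQuorum(latch, responseCount, quorumFor(endpoints), 50);
        System.out.println(ok); // true: 2 replies satisfy a quorum of 2
    }
}
```

The cost of a dead node in this sketch is the extra wait until the timeout, not a failed request.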

bq. Can't we just say in PrepareCallback that if promised != true for a 
response, then we do a 'while (latch.getCount() > 0) latch.countDown()'?

Sure, but then we have to add special cases back up to avoid throwing UAE, or 
move the promise check into PrepareCallback and have it throw a control-flow 
exception, neither of which appeals to me.
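
For reference, the suggestion in the quote could look roughly like the sketch below. Only the latch loop and the promised flag come from the quote; the class name PrepareCallbackSketch and the onResponse method are hypothetical, and UAE is presumably UnavailableException.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration of the quoted suggestion; not the code from the patch.
class PrepareCallbackSketch {
    final CountDownLatch latch;
    // Flips to false as soon as any replica refuses the promise.
    volatile boolean promised = true;

    PrepareCallbackSketch(int targets) {
        latch = new CountDownLatch(targets);
    }

    void onResponse(boolean responsePromised) {
        if (!responsePromised) {
            promised = false;
            // Release the waiter immediately: one rejection already means
            // this proposal has lost, so there is nothing left to wait for.
            while (latch.getCount() > 0) {
                latch.countDown();
            }
            return;
        }
        latch.countDown();
    }

    public static void main(String[] args) {
        PrepareCallbackSketch callback = new PrepareCallbackSketch(3);
        callback.onResponse(false);
        System.out.println(callback.latch.getCount()); // 0
        System.out.println(callback.promised);         // false
    }
}
```

The objection above shows up in the caller: once the latch has been drained early, a zero count no longer implies that enough replicas replied, so the caller has to check promised before deciding whether to treat the result as unavailable.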

bq. I think we want to guarantee that the column timestamp resolution won't 
break that order

You're right, this could get ugly with hinted handoff (HH), read repair (RR), etc.
                
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>         Attachments: half-baked commit 1.jpg, half-baked commit 2.jpg, 
> half-baked commit 3.jpg
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.
