[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097625#comment-13097625 ]

Zhu Han commented on CASSANDRA-2434:
------------------------------------

bq. Also if only one node is down you should still be able to read/write at 
quorum and achieve consistency

I suppose quorum read plus quorum write should provide monotonic read 
consistency. [1] Suppose a quorum write on key1 hits node A and node B, but 
not node C, due to a temporary network partition. After that, node B goes 
down and is replaced by node D, and node D streams its data from node C. If 
the following quorum read on key1 hits only node C and node D, monotonic read 
consistency is violated. This is rare but not unrealistic, especially when 
hinted handoff is disabled.
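
To make the scenario concrete, here is a toy sketch (plain Python, not 
Cassandra code; node names, timestamps and values are made up) of how the 
acknowledged quorum write can become invisible to a later quorum read:

{code:python}
# Toy model of the scenario above: RF=3, QUORUM=2, each replica stores
# key -> (timestamp, value). Everything here is illustrative only.
RF = 3
QUORUM = RF // 2 + 1  # 2

replicas = {"A": {}, "B": {}, "C": {}}

def quorum_read(targets, key):
    """Return the newest (timestamp, value) among the contacted replicas."""
    assert len(targets) >= QUORUM
    return max(replicas[name][key] for name in targets)

# 0. All three replicas start with an old value for key1.
for name in ("A", "B", "C"):
    replicas[name]["key1"] = (1, "old")

# 1. A quorum write lands on A and B only; C misses it because of the
#    temporary partition (and hinted handoff is disabled).
replicas["A"]["key1"] = (2, "new")
replicas["B"]["key1"] = (2, "new")

# 2. B dies and is replaced by D, which bootstraps by streaming from the
#    stale replica C. No repair is run afterwards.
replicas["D"] = dict(replicas["C"])
del replicas["B"]

# 3. A later quorum read happens to contact only C and D.
print(quorum_read(["C", "D"], "key1"))  # -> (1, 'old'): the acknowledged
                                        #    write at timestamp 2 is not seen.
{code}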

Maybe it is more reasonable to give the admin an option to specify that the 
bootstrapped node should not accept any read requests until the admin turns 
them on manually. The admin can then run a manual repair first if he wants to 
make sure everything is consistent.
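
For illustration only, a minimal sketch of that gate (hypothetical toy code, 
not Cassandra internals): the freshly bootstrapped node rejects reads until 
the operator flips a flag, e.g. after a manual repair has finished.

{code:python}
# Hypothetical gate on a freshly bootstrapped node: reads are rejected
# until the operator explicitly enables them (e.g. after `nodetool repair`).
class BootstrappedNode:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reads_enabled = False  # stays off until the admin turns it on

    def read(self, key):
        if not self.reads_enabled:
            raise RuntimeError(
                f"{self.name}: reads disabled until the operator enables them")
        return self.data.get(key)

    def enable_reads(self):
        """Called manually by the operator once he trusts the node's data."""
        self.reads_enabled = True

node = BootstrappedNode("D")
try:
    node.read("key1")
except RuntimeError as err:
    print(err)                # read rejected right after bootstrap
node.enable_reads()           # operator decides the node is consistent enough
print(node.read("key1"))      # -> None in this toy model
{code}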

[1] http://www.allthingsdistributed.com/2007/12/eventually_consistent.html

> node bootstrapping can violate consistency
> ------------------------------------------
>
>                 Key: CASSANDRA-2434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Peter Schuller
>            Assignee: paul cannon
>             Fix For: 1.1
>
>         Attachments: 2434.patch.txt
>
>
> My reading (a while ago) of the code indicates that there is no logic 
> involved during bootstrapping that avoids consistency level violations. If I 
> recall correctly, it just grabs neighbors that are currently up.
>
> There are at least two issues I have with this behavior:
> * If I have a cluster where I have applications relying on QUORUM with RF=3, 
> and bootstrapping completes based on only one node, I have just violated the 
> supposedly guaranteed consistency semantics of the cluster.
> * Nodes can flap up and down at any time, so even if a human takes care to 
> look at which nodes are up and thinks about it carefully before 
> bootstrapping, there's no guarantee.
> A complication is that whether this is an issue depends on the use case (if 
> everything you ever do is at CL.ONE, it's fine); even in a cluster which is 
> otherwise used for QUORUM operations, you may wish to accept bootstrapping 
> from fewer than a quorum of nodes in various emergency situations.
>
> A potential easy fix is to have bootstrap take an argument which is the 
> number of hosts to bootstrap from, or to assume QUORUM if none is given.
> (A related concern is bootstrapping across data centers. You may *want* to 
> bootstrap from a local node and then do a repair to avoid sending loads of 
> data across DCs while still achieving consistency. Or even if you don't care 
> about the consistency issues, I don't think there is currently a way to 
> bootstrap from local nodes only.)
>
> Thoughts?
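
Regarding the "potential easy fix" quoted above, a minimal sketch of the 
proposed check (hypothetical names, not actual Cassandra code): a range is 
only bootstrapped when data can be streamed from at least a quorum of its 
replicas, unless the operator explicitly asks for fewer.

{code:python}
# Hypothetical check for the proposal quoted above: refuse to bootstrap a
# range unless at least `required_sources` replicas are up to stream from,
# defaulting to a quorum of the replication factor.
def required_sources(rf, requested=None):
    """Number of live replicas bootstrap must stream from; QUORUM by default."""
    return requested if requested is not None else rf // 2 + 1

def can_bootstrap_range(live_replicas, rf, requested=None):
    """True only if enough replicas are up to stream the range consistently."""
    return len(live_replicas) >= required_sources(rf, requested)

# RF=3: streaming from a single live neighbor is rejected by default,
# but an operator can explicitly pass requested=1 in an emergency.
print(can_bootstrap_range(["A"], rf=3))               # False
print(can_bootstrap_range(["A"], rf=3, requested=1))  # True
print(can_bootstrap_range(["A", "C"], rf=3))          # True
{code}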
