Hi Rob,

regarding 1-3:
Thank you for the step-by-step explanation :-) My mistake was to use
join_ring=false during the initial start already. It now works for me as
it's supposed to. Nevertheless, it does not do what I want, as the node
does not take writes while repair/rebuild is running: an 8-hour repair
will leave 8 hours of data missing.
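
For reference, a minimal sketch of the startup order from your steps 1-3
(assuming a setup where JVM system properties can be passed to the
cassandra launcher; service scripts may differ):

# 1) First start: join the ring but skip streaming of existing data
#    (set auto_bootstrap: false in cassandra.yaml before starting)
cassandra

# 2) Stop the node immediately (via your usual service scripts), then
# 3) restart it outside the ring; it keeps the tokens it picked in 1)
cassandra -Dcassandra.join_ring=false

# 4) The node now owns tokens, so repair/rebuild have ranges to work on
nodetool rebuild    # optionally with a source data-center name
nodetool repair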


regarding 1-6:
This is what we did, and of course it works. Our issue was just that we had
some global QUORUM reads hidden somewhere, which the operator was not aware
of. Therefore it would have been nice if the ops guy could prevent these
reads by himself.




Another issue I think the current bootstrapping process has: doesn't it
practically reduce the RF for old data by one? (By old data I mean any
data that was written before the bootstrap.)

Let me give an example:

Let's assume I have a cluster of nodes 1, 2 and 3 with RF=3, and let's
assume a single write on node 2 got lost. So this particular write is only
available on nodes 1 and 3.

Now I add node 4, which takes over the range in such a way that node 1 no
longer owns that previously written key. Also assume that the new node
streams its data from node 2.

This means we have a cluster where the previously mentioned write exists
only on node 3. (Node 1 is not responsible for the key any more, and node 4
streamed its data from the wrong node.)

Any quorum read that hits nodes 2 and 4 will not return the column: with
RF=3, a QUORUM read needs floor(3/2)+1 = 2 replicas, and nodes 2 and 4 form
a valid quorum. So we have effectively lowered the CL/RF.


Therefore, what I would like to be able to do is (see the sketch after this
list):
- Add new node 4, but leave it in a joining state. (This means it gets all
the writes but does not serve reads.)
- Run "nodetool rebuild".
- The new node should not serve reads yet, and node 1 should not yet give
up its ranges to node 4.
- Run "nodetool repair" to ensure consistency.
- Finish the bootstrap: only now should node 1 stop being responsible for
the range and node 4 become eligible for reads.
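
Operationally, I imagine it roughly like this (a sketch only; the commands
all exist today, but the write-only joining state in step 1 is exactly the
part that is missing, since join_ring=false currently blocks writes too):

# 1) Start node 4 in a joining state: takes writes, serves no reads
#    (hypothetical semantics; today join_ring=false also blocks writes)
cassandra -Dcassandra.join_ring=false

# 2) Stream node 4's ranges; node 1 keeps serving them in the meantime
nodetool rebuild

# 3) Repair, in case rebuild streamed from a stale replica (node 2 above)
nodetool repair

# 4) Finish the bootstrap: node 1 gives up the range, node 4 serves reads
nodetool join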


regards,
Christian




On Tue, Sep 8, 2015 at 11:51 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Sep 8, 2015 at 2:39 PM, horschi <hors...@gmail.com> wrote:
>
>> I tried to set up a new node with join_ring=false once. In my test that
>> node did not pick a token in the ring. I assume running repair or rebuild
>> would not do anything in that case: No tokens = no data. But I must admit:
>> I have not tried running rebuild.
>>
>
> I admit I haven't been following this thread closely, perhaps I have
> missed what exactly it is you're trying to do.
>
> It's possible you'd need to:
>
> 1) join the node with auto_bootstrap=false
> 2) immediately stop it
> 3) re-start it with join_ring=false
>
> to actually use repair or rebuild in this way.
>
> However, if your goal is to create a new data-center and rebuild a node
> there without any risk of reading from that node while creating the new
> data center, you can just :
>
> 1) create nodes in new data-center, with RF=0 for that DC
> 2) change RF in that DC
> 3) run rebuild on new data-center nodes
> 4) while doing so, don't talk to new data-center coordinators from your
> client
> 5) and also use LOCAL_ONE/LOCAL_QUORUM to avoid cross-data-center reads
> from your client
> 6) modulo the handful of current bugs which make 5) imperfect
>
> What problem are you encountering with this procedure? If it's this ...
>
> I've learned from experience that the node immediately joins the cluster,
>> and starts accepting reads (from other DCs) for the range it owns.
>
>
> This seems to be the incorrect assumption at the heart of the confusion.
> You "should" be able to prevent this behavior entirely via correct use of
> ConsistencyLevel and client configuration.
>
> In an ideal world, I'd write a detailed blog post explaining this... :/ in
> my copious spare time...
>
> =Rob
>
>
>
