[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225475#comment-15225475
 ] 

Brandon Williams commented on CASSANDRA-8523:
---------------------------------------------

bq. I would consider a new, non dead, state for replacing a node rather than 
using hibernate. This should allow us to use the Failure Detector right?

It looks like this would work now, because the gossiper is the only thing that 
registers with the FD.  In the past this wasn't always true, though.

bq. Alternatively, would a new Gossip property that indicates a "shadow" of an 
existing node be easier to implement?

Hmm, can you can explain further?

> Writes should be sent to a replacement node while it is streaming in data
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8523
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Richard Wagner
>            Assignee: Brandon Williams
>             Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to