On Mon, Jul 23, 2012 at 6:26 PM, Eran Chinthaka Withana
<eran.chinth...@gmail.com> wrote:

> Method 1: I copied the data from all the nodes in that data center,
> into the repaired node, and brought it back up. But because of the
> rate of updates happening, the read misses started going up.

That's not really a good method once you scale up and the amount of
data in the cluster no longer fits on a single machine.

> Method 2: I issued a removetoken command for that node's token and
> let the cluster stream the data into the relevant nodes. At the end
> of this process, the dead node was no longer showing up in the ring
> output. Then I brought the node back up. I was expecting Cassandra
> to first stream data into the new node (which happens to be the dead
> node that was in the cluster earlier) and make it serve reads only
> once that was done. But in the server log I can see that as soon as
> the node comes up, it starts serving reads, creating a large number
> of read misses.

Removetoken is for dead nodes, so the node has no way of knowing
locally that it should no longer be a cluster member when it starts
back up. If you had decommissioned it instead, it would have saved a
flag indicating that it should bootstrap at the next startup.
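For reference, the two removal paths look something like this (the
host names and token below are placeholders):

    # Dead node: run from any live node, passing the dead node's
    # token. The survivors stream replicas among themselves, but the
    # dead node itself learns nothing and will serve reads as soon as
    # it is brought back up.
    nodetool -h live-node.example.com removetoken <dead-node-token>

    # Node that is still up: run on the node itself. It streams its
    # data off and saves a flag so that it bootstraps at its next
    # startup instead of serving reads immediately.
    nodetool -h leaving-node.example.com decommission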
> So the question is, what is the best way to bring back a dead node
> (once its hardware issues are fixed) without impacting read misses?

Increase your consistency level. Run a repair on the node once it's
back up, unless the repair takes longer than gc_grace, in which case
you need to removetoken it, delete all of its data, and bootstrap it
back in if you don't want deleted data to resurrect.

-Brandon
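P.S. Concretely, that recovery path looks roughly like this (host
names, the token, and the data directories are placeholders for your
own setup):

    # Repair finishes within gc_grace: repair the node in place.
    nodetool -h repaired-node.example.com repair

    # Repair can't finish within gc_grace: remove the node, wipe it,
    # and bootstrap it back in so deleted data can't resurrect.
    nodetool -h live-node.example.com removetoken <dead-node-token>
    # Then, on the repaired node, before restarting Cassandra:
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* \
           /var/lib/cassandra/saved_caches/*
    # Make sure auto_bootstrap is not set to false in cassandra.yaml,
    # then start the node; it will stream its range before serving
    # reads.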