RE: node stuck "leaving" on 1.0.5

2011-12-13 Thread Bryce Godfrey
:02 PM To: user@cassandra.apache.org Subject: node stuck "leaving" on 1.0.5 I have a dead node I need to remove from the cluster so that I can rebalance among the existing servers (can't replace it for a while). I used nodetool removetoken and it's been stuck in the "leaving"

node stuck "leaving" on 1.0.5

2011-12-11 Thread Bryce Godfrey
I have a dead node I need to remove from the cluster so that I can rebalance among the existing servers (can't replace it for a while). I used nodetool removetoken and it's been stuck in the "leaving" state for over a day now. I've tried a rolling restart, which kicks off some streaming for a w

Re: node stuck "leaving"

2011-07-12 Thread Casey Deccio
On Tue, Jul 12, 2011 at 10:10 AM, Brandon Williams wrote: > On Mon, Jul 11, 2011 at 11:51 PM, Casey Deccio wrote: > > java.lang.RuntimeException: Cannot recover SSTable with version f > (current > > version g). > > You need to scrub before any streaming is performed. > > Okay, turns out that my

Re: node stuck "leaving"

2011-07-12 Thread Brandon Williams
On Mon, Jul 11, 2011 at 11:51 PM, Casey Deccio wrote: > java.lang.RuntimeException: Cannot recover SSTable with version f (current > version g). You need to scrub before any streaming is performed. -Brandon

Re: node stuck "leaving"

2011-07-11 Thread Casey Deccio
On Sat, Jul 9, 2011 at 4:47 PM, aaron morton wrote: > Check the log on all the machines for ERROR messages. An error on any of > the nodes could have caused the streaming to hang. nodetool netstats will > let you know if there is a failed stream. > > Here's what I see in the logs on the node I'm s

Re: node stuck "leaving"

2011-07-10 Thread Héctor Izquierdo Seliva
In the end I had to restart the whole cluster. This is the second time I've had to do this. Would it be possible to add a command that forces all nodes to remove all the ring data and start it fresh? I'd rather have a few seconds of errors in the clients than the two to five minutes that takes a fu

Re: node stuck "leaving"

2011-07-10 Thread aaron morton
That's the correct way to use remove token; it's there for when the node you are removing from the ring cannot be started: http://wiki.apache.org/cassandra/Operations#Removing_nodes_entirely Dead nodes popping up and an inconsistent view of the ring is a bit nasty. You can *try* restarting the node w

Re: node stuck "leaving"

2011-07-10 Thread Héctor Izquierdo Seliva
I'm also having problems with removetoken. Maybe I'm doing it wrong, but I was under the impression that I just had to call removetoken once. When I take a look at the nodes' ring, the dead node keeps popping up. What's even more incredible is that in some of them it says UP

Re: node stuck "leaving"

2011-07-09 Thread aaron morton
Check the log on all the machines for ERROR messages. An error on any of the nodes could have caused the streaming to hang. nodetool netstats will let you know if there is a failed stream. AFAIK if you restart the cass service on 1 it will forget it was leaving and rejoin in a normal state. c
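
For the log check suggested in this message, a minimal sketch in Python may help when there are several nodes to go through. It assumes each log entry starts with its level (for example "ERROR [Thread-3] 2011-07-09 ...") and that stack-trace lines follow without a level; the layout, the function name find_error_entries, and the sample lines are illustrative assumptions, not taken from the thread.

```python
import re

# Assumed log layout (not confirmed by the thread): every entry begins with
# its level, e.g. "ERROR [Thread-3] 2011-07-09 ...". Continuation lines such
# as stack traces do not begin with a level.
LEVEL_RE = re.compile(r"^\s*(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b")


def find_error_entries(lines):
    """Return ERROR entries, each joined with its continuation lines."""
    entries = []
    current = None  # lines of the ERROR entry being collected, if any
    for raw in lines:
        line = raw.rstrip("\n")
        match = LEVEL_RE.match(line)
        if match:
            # A new entry starts here, so close the previous ERROR entry.
            if current is not None:
                entries.append("\n".join(current))
            current = [line] if match.group(1) == "ERROR" else None
        elif current is not None:
            # Continuation line (e.g. a stack trace) of an ERROR entry.
            current.append(line)
    if current is not None:
        entries.append("\n".join(current))
    return entries


if __name__ == "__main__":
    import sys

    # Usage: python find_errors.py <path to a node's system.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for entry in find_error_entries(handle):
            print(entry)
            print("-" * 40)
```

Run it against the system log on each node (the log location depends on your install), then compare what it finds with the output of nodetool netstats on the same node.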

node stuck "leaving"

2011-07-08 Thread Casey Deccio
I've got a node that is stuck "Leaving" the ring. Running "nodetool decommission" never terminates. It's been in this state for about a week, and the load has not decreased: $ nodetool -h localhost ring Address DC Rack Status State Load Owns Token Token(bytes[de4075