Re: Node stuck in joining the ring

2015-03-02 Thread Phil Yang
I encountered a similar situation where streaming could not finish, not only
when joining but also when removing a node. My workaround is to restart every
node in the cluster before starting the new node. In my experience, stuck
streaming only shows up on nodes that have been running for many days,
although I have no idea why.

2015-03-03 2:42 GMT+08:00 Nate McCall n...@thelastpickle.com:

 Can you verify that cassandra-rackdc.properties and
 cassandra-topology.properties are the same on the cluster?

 On Thu, Feb 26, 2015 at 7:52 AM, Batranut Bogdan batra...@yahoo.com
 wrote:

 No errors in the system.log file
 [root@cassa09 cassandra]# grep ERROR system.log
 [root@cassa09 cassandra]#

 Nothing.


   On Thursday, February 26, 2015 1:55 PM, mck m...@apache.org wrote:


 Any errors in your log file?

 We saw something similar when bootstrap crashed when rebuilding
 secondary indexes.

 See CASSANDRA-8798

 ~mck

 --
 -
 Nate McCall
 Austin, TX
 @zznate

 Co-Founder  Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com




-- 
Thanks,
Phil Yang


Re: Node stuck in joining the ring

2015-03-02 Thread Nate McCall
Can you verify that cassandra-rackdc.properties and
cassandra-topology.properties are the same on the cluster?
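
One way to run this check is to hash each node's copy of the file and group
nodes by digest. The following is only a minimal sketch: how the file contents
are collected from each node is left out, the node names other than cassa09
are hypothetical, and the sample dc/rack values are made up for illustration.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Drop blank lines and '#' comment lines so cosmetic edits are ignored."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(l for l in lines if l and not l.startswith("#"))


def group_by_content(files_by_node):
    """Group node names whose (normalized) file contents are identical."""
    groups = defaultdict(list)
    for node, content in files_by_node.items():
        digest = hashlib.sha256(normalize(content).encode("utf-8")).hexdigest()
        groups[digest].append(node)
    return sorted(sorted(nodes) for nodes in groups.values())


# Illustrative input: file contents keyed by (hypothetical) node name.
rackdc = {
    "cassa01": "dc=DC1\nrack=RAC1\n",
    "cassa02": "# managed by hand\ndc=DC1\nrack=RAC1\n",
    "cassa09": "dc=DC2\nrack=RAC1\n",
}
groups = group_by_content(rackdc)
print(groups)
```

More than one group in the result means at least one node has a different
file and is worth a closer look.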

On Thu, Feb 26, 2015 at 7:52 AM, Batranut Bogdan batra...@yahoo.com wrote:

 No errors in the system.log file
 [root@cassa09 cassandra]# grep ERROR system.log
 [root@cassa09 cassandra]#

 Nothing.


   On Thursday, February 26, 2015 1:55 PM, mck m...@apache.org wrote:


 Any errors in your log file?

 We saw something similar when bootstrap crashed when rebuilding
 secondary indexes.

 See CASSANDRA-8798

 ~mck

-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder  Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Node stuck in joining the ring

2015-02-26 Thread Batranut Bogdan
No errors in the system.log file
[root@cassa09 cassandra]# grep ERROR system.log
[root@cassa09 cassandra]#

Nothing.

 On Thursday, February 26, 2015 1:55 PM, mck m...@apache.org wrote:
   

 Any errors in your log file?

We saw something similar when bootstrap crashed when rebuilding
secondary indexes.

See CASSANDRA-8798

~mck


   

Re: Node stuck in joining the ring

2015-02-26 Thread Jonathan Haddad
I've seen this before, when I tried to be clever and add nodes of a different 
major version into a cluster.  Any chance that's what's happening here?


 On Feb 25, 2015, at 4:52 PM, Robert Coli rc...@eventbrite.com wrote:
 
 On Wed, Feb 25, 2015 at 3:38 PM, Batranut Bogdan batra...@yahoo.com wrote:
 I have a new node that I want to add to the ring. The problem is that
 nodetool says UJ. I have left it for several days and the status has not
 changed. In OpsCenter it is shown as being in an unknown cluster.
 
 If I were you, I would do the following [1]  :
 
 1) stop the joining node
 2) make sure that the other nodes no longer see it joining
 3) wipe the joining node's data directory
 4) verify cluster name is correct in cassandra.yaml, and matches the other 
 nodes
 5) re-join the node
 
 What version of Cassandra?
 
 =Rob
 [1] Which, jeesh, I should put into a dealing with failed bootstrap blog 
 post one of these days...


Re: Node stuck in joining the ring

2015-02-26 Thread Batranut Bogdan
All the nodes have the same version. 2.0.12

Re: Node stuck in joining the ring

2015-02-26 Thread Batranut Bogdan
Hello Jan,
Yes, I do have NTP running and the clocks are in sync.
 

 On Thursday, February 26, 2015 11:49 AM, Jan Kesten j.kes...@enercast.de 
wrote:
   

 Hi Batranut,

apart from the other suggestions - do you have ntp running on all your 
cluster nodes and are times in sync?

Jan




Re: Node stuck in joining the ring

2015-02-26 Thread Jan Kesten

Hi Batranut,

apart from the other suggestions - do you have ntp running on all your 
cluster nodes and are times in sync?


Jan


Re: Node stuck in joining the ring

2015-02-26 Thread Batranut Bogdan
C* version 2.0.12
How do I resolve item 2)? I just want to mention that when the node is stopped,
nodetool status does not show it as Down; it is missing from the list...
Thanks for the support.

 On Thursday, February 26, 2015 2:52 AM, Robert Coli rc...@eventbrite.com 
wrote:
   

 On Wed, Feb 25, 2015 at 3:38 PM, Batranut Bogdan batra...@yahoo.com wrote:

I have a new node that I want to add to the ring. The problem is that nodetool
says UJ. I have left it for several days and the status has not changed. In
OpsCenter it is shown as being in an unknown cluster.

If I were you, I would do the following [1]:

1) stop the joining node
2) make sure that the other nodes no longer see it joining
3) wipe the joining node's data directory
4) verify cluster name is correct in cassandra.yaml, and matches the other
nodes
5) re-join the node

What version of Cassandra?

=Rob
[1] Which, jeesh, I should put into a dealing with failed bootstrap blog
post one of these days...

   

Re: Node stuck in joining the ring

2015-02-26 Thread mck
Any errors in your log file?

We saw something similar when bootstrap crashed when rebuilding
secondary indexes.

See CASSANDRA-8798

~mck


Re: Node stuck in joining the ring

2015-02-25 Thread Robert Coli
On Wed, Feb 25, 2015 at 3:38 PM, Batranut Bogdan batra...@yahoo.com wrote:

 I have a new node that I want to add to the ring. The problem is that
 nodetool says UJ. I have left it for several days and the status has not
 changed. In OpsCenter it is shown as being in an unknown cluster.


If I were you, I would do the following [1]  :

1) stop the joining node
2) make sure that the other nodes no longer see it joining
3) wipe the joining node's data directory
4) verify cluster name is correct in cassandra.yaml, and matches the other
nodes
5) re-join the node

What version of Cassandra?

=Rob
[1] Which, jeesh, I should put into a dealing with failed bootstrap blog
post one of these days...
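
Step 4 above can be partly automated. The sketch below is an illustration
rather than a complete tool: it assumes the text of each node's cassandra.yaml
has already been collected, it reads only a simple top-level "cluster_name:"
line (no trailing comments, no full YAML parsing), and the node name cassa01
and both cluster names are invented for the example.

```python
import re


def cluster_name(yaml_text):
    """Return the cluster_name value from cassandra.yaml text, or None."""
    for line in yaml_text.splitlines():
        # re.match anchors at the start of the line, so commented-out
        # entries such as "# cluster_name: ..." are skipped.
        match = re.match(r"cluster_name:\s*(.*\S)", line)
        if match:
            return match.group(1).strip("'\"")
    return None


def mismatched_nodes(yaml_by_node, expected):
    """List nodes whose cluster_name differs from the expected value."""
    return sorted(node for node, text in yaml_by_node.items()
                  if cluster_name(text) != expected)


# Illustrative input only; real files contain many more settings.
configs = {
    "cassa01": "cluster_name: 'Prod Cluster'\nnum_tokens: 256\n",
    "cassa09": "cluster_name: 'Test Cluster'\nnum_tokens: 256\n",
}
print(mismatched_nodes(configs, "Prod Cluster"))
```

An empty list means every collected file names the expected cluster.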


Node stuck in joining the ring

2015-02-25 Thread Batranut Bogdan
Hello all,
I have a new node that I want to add to the ring. The problem is that nodetool
says UJ. I have left it for several days and the status has not changed. In
OpsCenter it is shown as being in an unknown cluster.
Since I started it, it has been streaming data, and the data size is now
5.9 TB. This is very strange, since all other nodes in the cluster have about
3.3 TB of data. Also, tonight I saw that it stopped receiving streams while the
status in nodetool was still UJ. So I decided to decommission the node, delete
the data and start again. Nodetool throws "unsupported operation: local node is
not a member of the token ring yet". So I just restarted the node. Now
streaming has begun again. At this rate, I'll run out of disk space on that
node. One idea that comes to mind is to stop the node, clear the data and
restart. But I am not sure about the implications of that. Also, I have tried
nodetool join. I got: This node has already joined the ring.
So nodetool status says UJ but nodetool join says otherwise, or am I
misunderstanding something here?
Any ideas?
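
For anyone who wants to watch for this state from a script, here is a minimal
sketch that scans captured `nodetool status` output for nodes that are not
Up/Normal (UN). The sample output is illustrative: the addresses, host IDs and
ownership figures are made up, and only the two-letter status column and the
address column are assumed to follow the usual layout.

```python
import re

# A data row starts with Up/Down (U or D) plus the state letter
# (Normal/Leaving/Joining/Moving), followed by the node address.
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def non_normal_nodes(status_output):
    """Return (status, address) pairs for every node that is not UN."""
    found = []
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(1) != "UN":
            found.append((match.group(1), match.group(2)))
    return found


# Illustrative sample, not output from the cluster discussed here.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns   Host ID   Rack
UN  10.0.0.1   3.3 TB   256     12.5%  aaaa      rack1
UJ  10.0.0.9   5.9 TB   256     ?      bbbb      rack1
"""
print(non_normal_nodes(SAMPLE))
```

Running this periodically and recording how long a node stays in UJ would
make a stalled bootstrap easier to spot than checking by hand.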