We had a cluster of 4 nodes in AWS, with an average load of approximately 750 GB per node. We added 4 new nodes. It has now been more than 30 hours and the node I am analyzing, 10.3.1.29, is still in JOINING mode. There is no compaction, streaming, or index building happening.

$ ./nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address      DC           Rack   Status  State    Load       Owns     Token
                                                                      148873535527910577765226390751398592512
10.3.1.179   datacenter1  rack1  Up      Normal   740.41 GB  25.00%   0
10.3.1.29    datacenter1  rack1  Up      Joining  562.49 GB  0.00%    21267647932558653966460912964485513215
10.3.1.175   datacenter1  rack1  Up      Normal   755.7 GB   25.00%   42535295865117307932921825928971026431
10.3.1.30    datacenter1  rack1  Up      Joining  565.68 GB  0.00%    63802943797675961899382738893456539648
10.3.1.177   datacenter1  rack1  Up      Normal   754.18 GB  25.00%   85070591730234615865843651857942052863
10.3.1.135   datacenter1  rack1  Up      Normal   95.97 GB   20.87%   120580289963820081458352857409882669785
10.3.1.178   datacenter1  rack1  Up      Normal   747.53 GB  4.13%    127605887595351923798765477786913079295
10.3.1.24    datacenter1  rack1  Up      Joining  522.09 GB  0.00%    148873535527910577765226390751398592512

$ ./nodetool netstats
Mode: JOINING
Not sending any streams.
 Nothing streaming from /10.3.1.177
 Nothing streaming from /10.3.1.179
Pool Name                    Active   Pending      Completed
Commands                        n/a         0             82
Responses                       n/a         0       40135123

$ ./nodetool compactionStats
pending tasks: 0
Active compaction remaining time :        n/a

$ ./nodetool info
Token            : 21267647932558653966460912964485513215
Gossip active    : true
Thrift active    : false
Load             : 562.49 GB
Generation No    : 1382981644
Uptime (seconds) : 90340
Heap Memory (MB) : 9298.59 / 13272.00
Data Center      : datacenter1
Rack             : rack1
Exceptions       : 2
Key Cache        : size 104857584 (bytes), capacity 104857584 (bytes), 187373 hits, 94709046 requests, 0.002 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

The 2 exceptions in the info output are the ones that were logged when I stopped the index build to let the bootstrap complete faster.

Any clue what's wrong, and where I should look to analyze the issue further?

I haven't restarted the Cassandra process.
I am afraid the node will start the bootstrap again if I restart it.

Thanks,
Naren

--
Narendra Sharma
Software Engineer
*http://www.aeris.com*
*http://narendrasharma.blogspot.com/*