nodetool move caused the moved node to drop itself from 'nodetool ring' output; 
others think it's 'joining'
-----------------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-1840
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1840
             Project: Cassandra
          Issue Type: Bug
            Reporter: Peter Schuller


I have a test cluster with three nodes on a very recent 0.7 (last few days 
branch). It has very little data in it (so maybe timing can be an issue given 
how fast operations complete). It was otherwise healthy; nodetool ring was 
consistent on all nodes and I had just run some compactions and 'repair' 
commands on all nodes repeatedly.

I had a single client doing some reads/writes of single columns; nothing 
extreme (low load).

When I did a 'nodetool move' the node exited the ring, stopped responding to 
thrift RPC, entered the ring again, and started accepting RPC requests via 
thrift. It reports in the log that it is joined.

However, at this point 'nodetool ring' on the node I moved does *not* show its 
own location in the ring, and other nodes show it as 'joining' (with the new 
token, not the old token). I will include nodetool ring output and log output 
below.

The situation was un-wedged by restarting the node that I had moved. After it 
started and a few seconds passed, nodetool ring looked correct on the node in 
question and other nodes now reported it as 'up' rather than 'joining'.

Moved node said post-move (.61 in the below pastes is the node that I moved):

Address         Status State   Load            Owns    Token                    
                   
                                                       
110288156320304836825416347816186393502     
78.31.15.204    Up     Normal  224.34 KB       61.44%  
44678687293344048155696022135861768368      
193.182.3.229   Up     Normal  251.84 KB       38.56%  
110288156320304836825416347816186393502     

And the other two:


Address         Status State   Load            Owns    Token                    
                   
                                                       
164957594472845753490452447750528540018     
78.31.15.204    Up     Normal  224.34 KB       29.31%  
44678687293344048155696022135861768368      
193.182.3.229   Up     Normal  251.84 KB       38.56%  
110288156320304836825416347816186393502     
193.182.3.61    Up     Joining 194.76 KB       32.13%  
164957594472845753490452447750528540018     

Address         Status State   Load            Owns    Token                    
                   
                                                       
164957594472845753490452447750528540018     

78.31.15.204    Up     Normal  
224.34 KB       29.31%  44678687293344048155696022135861768368      
193.182.3.229   Up     Normal  251.84 KB       38.56%  
110288156320304836825416347816186393502     
193.182.3.61    Up     Joining 194.76 KB       32.13%  
164957594472845753490452447750528540018     

I'll try reproducing a few times, and also merge latest 0.7.

Here is some system log output from the node that got moved; it looks good to 
me:

 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:17,560 
StorageService.java (line 455) Leaving: sleeping 30000 ms for pending range 
setup
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:47,564 
StorageService.java (line 455) Leaving: streaming data to other nodes
 INFO [StreamStage:1] 2010-12-10 09:38:47,566 StreamOut.java (line 75) 
Beginning transfer to /78.31.15.204
 INFO [StreamStage:1] 2010-12-10 09:38:47,566 StreamOut.java (line 98) Flushing 
memtables for KeyspaceSlask...
 INFO [StreamStage:1] 2010-12-10 09:38:47,567 ColumnFamilyStore.java (line 639) 
switching in a fresh Memtable for KeyValue at 
CommitLogContext(file='/var/lib/spotify-cassandra/slask/commitlog/CommitLog-1291973418600.log',
 position=2573610)
 INFO [StreamStage:1] 2010-12-10 09:38:47,567 ColumnFamilyStore.java (line 943) 
Enqueuing flush of memtable-keyva...@1131602880(370711 bytes, 5533 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:47,567 Memtable.java (line 155) Writing 
memtable-keyva...@1131602880(370711 bytes, 5533 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:47,599 Memtable.java (line 162) 
Completed flushing 
/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-68-Data.db 
(15042 bytes)
 INFO [StreamStage:1] 2010-12-10 09:38:47,601 StreamOut.java (line 171) Stream 
context metadata 
[/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-67-Data.db/(0,10094)
         progress=0/10094 - 0%, 
/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-68-Data.db/(0,10094)
         progress=0/10094 - 0%], 2 sstables.
 INFO [StreamStage:1] 2010-12-10 09:38:47,601 StreamOutSession.java (line 175) 
Streaming to /78.31.15.204
 INFO [StreamStage:1] 2010-12-10 09:38:47,601 StreamOut.java (line 75) 
Beginning transfer to /193.182.3.229
 INFO [StreamStage:1] 2010-12-10 09:38:47,602 StreamOut.java (line 98) Flushing 
memtables for KeyspaceSlask...
 INFO [StreamStage:1] 2010-12-10 09:38:47,602 ColumnFamilyStore.java (line 639) 
switching in a fresh Memtable for KeyValue at 
CommitLogContext(file='/var/lib/spotify-cassandra/slask/commitlog/CommitLog-1291973418600.log',
 position=2573755)
 INFO [StreamStage:1] 2010-12-10 09:38:47,602 ColumnFamilyStore.java (line 943) 
Enqueuing flush of memtable-keyva...@1894688899(67 bytes, 1 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:47,604 Memtable.java (line 155) Writing 
memtable-keyva...@1894688899(67 bytes, 1 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:47,635 Memtable.java (line 162) 
Completed flushing 
/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-69-Data.db (198 
bytes)
 INFO [StreamStage:1] 2010-12-10 09:38:47,637 StreamOut.java (line 171) Stream 
context metadata 
[/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-67-Data.db/(10094,15042)
         progress=0/4948 - 0%, 
/var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-68-Data.db/(10094,15042)
         progress=0/4948 - 0%], 3 sstables.
 INFO [StreamStage:1] 2010-12-10 09:38:47,637 StreamOutSession.java (line 175) 
Streaming to /193.182.3.229
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:49,734 
StorageService.java (line 1682) re-bootstrapping to new token 
164957594472845753490452447750528540018
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:49,735 
ColumnFamilyStore.java (line 639) switching in a fresh Memtable for 
LocationInfo at 
CommitLogContext(file='/var/lib/spotify-cassandra/slask/commitlog/CommitLog-1291973418600.log',
 position=2576080)
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:49,736 
ColumnFamilyStore.java (line 943) Enqueuing flush of 
memtable-locationi...@578162504(53 bytes, 2 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:49,737 Memtable.java (line 155) Writing 
memtable-locationi...@578162504(53 bytes, 2 operations)
 INFO [FlushWriter:1] 2010-12-10 09:38:49,865 Memtable.java (line 162) 
Completed flushing 
/var/lib/spotify-cassandra/slask/data/system/LocationInfo-e-17-Data.db (301 
bytes)
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:38:49,866 
StorageService.java (line 455) Joining: sleeping 30000 ms for pending range 
setup
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:39:19,866 
StorageService.java (line 455) Bootstrapping
 INFO [CompactionExecutor:1] 2010-12-10 09:39:19,960 SSTableReader.java (line 
170) Opening /var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-70
 INFO [CompactionExecutor:1] 2010-12-10 09:39:19,981 SSTableReader.java (line 
170) Opening /var/lib/spotify-cassandra/slask/data/KeyspaceSlask/KeyValue-e-71
 INFO [Thread-71] 2010-12-10 09:39:19,983 StreamInSession.java (line 160) 
Finished streaming session 258244662990017 from /193.182.3.229
 INFO [RMI TCP Connection(32)-193.182.3.61] 2010-12-10 09:39:19,983 
StorageService.java (line 249) Bootstrap/move completed! Now serving reads.

So based on the output the node certainly claims to have finished the entire 
bootstrapping procedure and started serving reads (which it is; reads are 
working against it).


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to