Hi, I've seen a strange occurrence: a deleted node suddenly reappeared in the ring. That leads to my question: where is the ring structure maintained (in memory, with local copies?) and what prompts it to change? I'd appreciate any thoughts on the events below.
I'm running 0.7.4 on 4 EC2 large machines with a replication factor of 3. On Sunday I dropped a node that was misbehaving (drained, then decommissioned). All was well until a few minutes ago.

On 1.2.3.47 (never mind the temporary key imbalance):

ubuntu@YYY:~$ nodetool -h localhost ring
1.2.3.47   Up    Normal   17.89 GB  12.48%  0
1.2.3.36   Up    Normal   27.72 GB  25.00%  42535295865117307932921825928971026432
1.2.3.193  Up    Normal   42.14 GB  50.00%  127605887595351923798765477786913079296
1.2.3.252  Up    Normal   36.71 GB  12.52%  148904621249875869977532879268261763219

Then all of a sudden the node that used to sit in the middle shows up (as "Down"). The machine itself was decommissioned over the weekend. It's confirmed that it is not in play.

ubuntu@YYY:~$ nodetool -h localhost ring
1.2.3.47   Up    Normal   17.93 GB  12.48%  0
1.2.3.36   Up    Normal   27.76 GB  25.00%  42535295865117307932921825928971026432
2.3.4.193  Down  Normal   12.35 GB  25.00%  85070591730234615865843651857942052864
1.2.3.193  Up    Normal   42.24 GB  25.00%  127605887595351923798765477786913079296
1.2.3.252  Up    Normal   36.66 GB  12.52%  148904621249875869977532879268261763219

From logs on each node:

2011-03-22T21:30:17.040407+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:16.956335+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:18.887269+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:18.978861+00:00 Node /2.3.4.193 is now part of the cluster

(a node coming back from the dead)

On 1.2.3.193, trying to remove the ghost token...

ubuntu@XXX:~$ nodetool -h localhost ring
                                            148904621249875869977532879268261763219
1.2.3.47   Up    Normal   17.93 GB  12.48%  0
1.2.3.36   Up    Normal   27.76 GB  25.00%  42535295865117307932921825928971026432
2.3.4.193  Down  Leaving  12.35 GB  25.00%  85070591730234615865843651857942052864
1.2.3.193  Up    Normal   52.06 GB  25.00%  127605887595351923798765477786913079296
1.2.3.252  Up    Normal   43.11 GB  12.52%  148904621249875869977532879268261763219

ubuntu@XXX:~$ nodetool -h localhost removetoken status
RemovalStatus: Removing token (85070591730234615865843651857942052864). Waiting for replication confirmation from [/1.2.3.193].

(wait wait wait)

ubuntu@XXX:~$ nodetool -h localhost removetoken force
RemovalStatus: Removing token (85070591730234615865843651857942052864). Waiting for replication confirmation from [/1.2.3.193].

(fixed)

ubuntu@XXX:~$ nodetool -h localhost ring
1.2.3.47   Up    Normal   17.93 GB  12.48%  0
1.2.3.36   Up    Normal   27.76 GB  25.00%  42535295865117307932921825928971026432
1.2.3.193  Up    Normal   53.73 GB  50.00%  127605887595351923798765477786913079296
1.2.3.252  Up    Normal   43.11 GB  12.52%  148904621249875869977532879268261763219

--
Alexis Lê-Quôc
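
A note on the ownership percentages in the ring listings above: they appear consistent with each node owning the token range from its predecessor's token up to its own, wrapping around a token space of size 2^127. That token space is an assumption on my part (it is what RandomPartitioner uses; the partitioner is not stated above). The function name and constants below are hypothetical, used only for illustration. A minimal sketch that reproduces the percentages with and without the ghost token:

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose tokens live in [0, 2**127).
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Map each token to the percentage of the ring it owns.

    Each token is assumed to own the range (previous token, token],
    with the lowest token wrapping around from the highest one.
    """
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this is the highest token (wrap-around)
        span = (token - prev) % RING_SIZE or RING_SIZE  # a lone token owns everything
        owned[token] = round(float(Fraction(span * 100, RING_SIZE)), 2)
    return owned


# Tokens copied from the ring listings above.
T47 = 0
T36 = 42535295865117307932921825928971026432
GHOST = 85070591730234615865843651857942052864
T193 = 127605887595351923798765477786913079296
T252 = 148904621249875869977532879268261763219

without_ghost = ownership([T47, T36, T193, T252])
with_ghost = ownership([T47, T36, GHOST, T193, T252])

print(without_ghost[T193])                  # 50.0
print(with_ghost[T193], with_ghost[GHOST])  # 25.0 25.0
```

Under that assumption, the ghost token at 85070591730234615865843651857942052864 splits the range otherwise owned by 1.2.3.193, which would explain why that node's share drops from 50.00% to 25.00% while the ghost is listed and returns to 50.00% once the token is removed.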