On 13 June 2012 at 10:30, aaron morton wrote:

> Here is what I *think* is going on; if Brandon is around he may be able to
> help out.
>
> The old nodes are being included in the gossip rounds because
> Gossiper.doGossipToUnreachableMember() only looks at the nodes that are
> unreachable. It does not check whether they have been removed from the
> cluster.
>
> Information about the removed nodes is kept by gossip so that if a node is
> removed while it is down, it will shut down when restarted. This
> information *should* stay in gossip for 3 days.
>
> In your gossip info, the last long on the STATUS lines is the expiry time
> for this info:
>
> /10.10.0.24
> STATUS:removed,127605887595351923798765477786913079296,1336530323263
> REMOVAL_COORDINATOR:REMOVER,0
> /10.10.0.22
> STATUS:removed,42535295865117307932921825928971026432,1336529659203
> REMOVAL_COORDINATOR:REMOVER,113427455640312814857969558651062452224
>
> For the first line it's:
>
> In [48]: datetime.datetime.fromtimestamp(1336530323263/1000)
> Out[48]: datetime.datetime(2012, 5, 9, 14, 25, 23)
>
> So that's good.
>
> The gossip round will remove the 0.24 and 0.22 nodes from the local state
> if the expiry time has passed, the node is marked as dead, and it's not in
> the token ring.
>
> You can see whether the node thinks 0.24 and 0.22 are up by looking at
> getSimpleStates() on the FailureDetectorMBean. (I use jmxterm to do this
> sort of thing.)
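Aaron's timestamp check can be applied to both removed nodes at once. A minimal Python sketch (the millisecond values are copied from the gossipinfo output quoted above; this only decodes the timestamps, it does not talk to the cluster):

```python
import datetime

# Expiry times (epoch milliseconds) taken from the STATUS lines above.
expiry_ms = {
    "/10.10.0.24": 1336530323263,
    "/10.10.0.22": 1336529659203,
}

now = datetime.datetime.now()
for node, ms in sorted(expiry_ms.items()):
    # Gossip keeps the "removed" state until this moment (about 3 days).
    expiry = datetime.datetime.fromtimestamp(ms / 1000)
    verdict = "should already be purged" if expiry < now else "still within TTL"
    print(node, "expires", expiry, "->", verdict)
```

If both expiry times are in the past and the nodes are still being gossiped about, something is keeping the state alive, which is exactly the situation discussed in this thread.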
The two old nodes are still seen as down:

SimpleStates:[/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP, /10.10.0.25:UP, /10.10.0.27:UP]

> The other thing that can confuse things is the gossip generation. If your
> old nodes were started with a datetime in the future, that can muck things
> up.

I have just checked: my old node machines are nicely synchronized. My new
nodes lag by a few seconds, some in the future, some in the past. I
definitely need to fix that.

> The simple thing to try is starting the server with the
> -Dcassandra.join_ring=false JVM option. This will force the node to get
> the ring info from other nodes. Check things with "nodetool gossipinfo" to
> see if the other nodes tell it about the old ones again.

You meant -Dcassandra.load_ring_state=false, right? I tried that; nothing
changed.

> Sorry, gossip can be tricky to diagnose over email.

No worries, I really appreciate you taking the time to look into my issues.

Maybe I could open a JIRA about my issue? There may have been a
configuration mess on my part at some point, i.e. the unsynchronized dates
on my machines, but I think it would be nice if Cassandra could resolve
itself out of that inconsistent state.

Nicolas

> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/06/2012, at 10:33 PM, Nicolas Lalevée wrote:
>
>> I have one dirty solution to try: bring data-2 and data-4 back up and
>> down again. Is there any way I can tell cassandra to not get any data, so
>> that when I bring an old node up, no streaming would start?
>>
>> cheers,
>> Nicolas
>>
>> On 12 June 2012 at 12:25, Nicolas Lalevée wrote:
>>
>>> On 12 June 2012 at 11:03, aaron morton wrote:
>>>
>>>> Try purging the hints for 10.10.0.24 using the HintedHandOffManager
>>>> MBean.
>>>
>>> As far as I could tell, there were no hinted handoffs to be delivered.
>>> Nevertheless, I called "deleteHintsForEndpoint" on every node for the
>>> two nodes expected to be out.
>>> Nothing changed; I still see packets being sent to these old nodes.
>>>
>>> I looked closer at ResponsePendingTasks of MessagingService. Actually,
>>> the numbers change, between 0 and about 4. So tasks are ending, but new
>>> ones come just after.
>>>
>>> Nicolas
>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 12/06/2012, at 3:33 AM, Nicolas Lalevée wrote:
>>>>
>>>>> Finally, thanks to the Groovy JMX builder, it was not that hard.
>>>>>
>>>>> On 11 June 2012 at 12:12, Samuel CARRIERE wrote:
>>>>>
>>>>>> If I were you, I would connect (through JMX, with jconsole) to one of
>>>>>> the nodes that is sending messages to an old node, and would have a
>>>>>> look at these MBeans:
>>>>>> - org.apache.cassandra.net.FailureDetector: does SimpleStates look
>>>>>> good? (Or do you see an IP of an old node?)
>>>>>
>>>>> SimpleStates:[/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP,
>>>>> /10.10.0.25:UP, /10.10.0.27:UP]
>>>>>
>>>>>> - org.apache.cassandra.net.MessagingService: do you see one of the
>>>>>> old IPs in one of the attributes?
>>>>>
>>>>> data-5:
>>>>> CommandCompletedTasks:
>>>>> [10.10.0.22:2, 10.10.0.26:6147307, 10.10.0.27:6084684, 10.10.0.24:2]
>>>>> CommandPendingTasks:
>>>>> [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0]
>>>>> ResponseCompletedTasks:
>>>>> [10.10.0.22:1487, 10.10.0.26:6187204, 10.10.0.27:6062890, 10.10.0.24:1495]
>>>>> ResponsePendingTasks:
>>>>> [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0]
>>>>>
>>>>> data-6:
>>>>> CommandCompletedTasks:
>>>>> [10.10.0.22:2, 10.10.0.27:6064992, 10.10.0.24:2, 10.10.0.25:6308102]
>>>>> CommandPendingTasks:
>>>>> [10.10.0.22:0, 10.10.0.27:0, 10.10.0.24:0, 10.10.0.25:0]
>>>>> ResponseCompletedTasks:
>>>>> [10.10.0.22:1463, 10.10.0.27:6067943, 10.10.0.24:1474, 10.10.0.25:6367692]
>>>>> ResponsePendingTasks:
>>>>> [10.10.0.22:0, 10.10.0.27:0, 10.10.0.24:2, 10.10.0.25:0]
>>>>>
>>>>> data-7:
>>>>> CommandCompletedTasks:
>>>>> [10.10.0.22:2, 10.10.0.26:6043653, 10.10.0.24:2, 10.10.0.25:5964168]
>>>>> CommandPendingTasks:
>>>>> [10.10.0.22:0, 10.10.0.26:0, 10.10.0.24:0, 10.10.0.25:0]
>>>>> ResponseCompletedTasks:
>>>>> [10.10.0.22:1424, 10.10.0.26:6090251, 10.10.0.24:1431, 10.10.0.25:6094954]
>>>>> ResponsePendingTasks:
>>>>> [10.10.0.22:4, 10.10.0.26:0, 10.10.0.24:1, 10.10.0.25:0]
>>>>>
>>>>>> - org.apache.cassandra.net.StreamingService: do you see an old IP in
>>>>>> StreamSources or StreamDestinations?
>>>>>
>>>>> Nothing streaming on the 3 nodes.
>>>>> nodetool netstats confirmed that.
>>>>>
>>>>>> - org.apache.cassandra.internal.HintedHandoff: are there non-zero
>>>>>> ActiveCount, CurrentlyBlockedTasks, PendingTasks, TotalBlockedTasks?
>>>>>
>>>>> On the 3 nodes, all at 0.
>>>>>
>>>>> I don't know much about what I'm looking at, but it seems that some
>>>>> ResponsePendingTasks need to end.
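To make the stragglers in those figures easier to spot, the per-endpoint ResponsePendingTasks maps can be scanned for non-zero counts toward the removed endpoints. A small Python sketch, with the values transcribed from the JMX dumps above:

```python
# The two endpoints that were removed from the ring.
REMOVED = {"10.10.0.22", "10.10.0.24"}

# ResponsePendingTasks per node, transcribed from the MessagingService
# MBean dumps quoted above.
response_pending = {
    "data-5": {"10.10.0.22": 0, "10.10.0.26": 0, "10.10.0.27": 0, "10.10.0.24": 0},
    "data-6": {"10.10.0.22": 0, "10.10.0.27": 0, "10.10.0.24": 2, "10.10.0.25": 0},
    "data-7": {"10.10.0.22": 4, "10.10.0.26": 0, "10.10.0.24": 1, "10.10.0.25": 0},
}

for node in sorted(response_pending):
    # Flag any pending responses still addressed to a removed endpoint.
    stale = {ip: n for ip, n in response_pending[node].items()
             if ip in REMOVED and n > 0}
    if stale:
        print(node, "still has responses pending to removed nodes:", stale)
```

With these numbers, data-6 and data-7 are flagged, matching the observation later in the thread that ResponsePendingTasks keeps oscillating between 0 and about 4.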
>>>>> Nicolas
>>>>>
>>>>>> Samuel
>>>>>>
>>>>>> Nicolas Lalevée <nicolas.lale...@hibnet.org>
>>>>>> 08/06/2012 21:03
>>>>>> Please reply to: user@cassandra.apache.org
>>>>>> To: user@cassandra.apache.org
>>>>>> cc:
>>>>>> Subject: Re: Dead node still being pinged
>>>>>>
>>>>>> On 8 June 2012 at 20:02, Samuel CARRIERE wrote:
>>>>>>
>>>>>>> I'm on the train, but just a guess: maybe it's hinted handoff. A
>>>>>>> look at the logs of the new nodes could confirm that: look for the
>>>>>>> IP of an old node and maybe you'll find hinted-handoff-related
>>>>>>> messages.
>>>>>>
>>>>>> I grepped every node's logs for every old node; I got nothing since
>>>>>> the "crash".
>>>>>>
>>>>>> If it can be of some help, here is some grepped log output from the
>>>>>> crash:
>>>>>>
>>>>>> system.log.1: WARN [RMI TCP Connection(1037)-10.10.0.26] 2012-05-06 00:39:30,241 StorageService.java (line 2417) Endpoint /10.10.0.24 is down and will not receive data for re-replication of /10.10.0.22
>>>>>> system.log.1: WARN [RMI TCP Connection(1037)-10.10.0.26] 2012-05-06 00:39:30,242 StorageService.java (line 2417) Endpoint /10.10.0.24 is down and will not receive data for re-replication of /10.10.0.22
>>>>>> system.log.1: WARN [RMI TCP Connection(1037)-10.10.0.26] 2012-05-06 00:39:30,242 StorageService.java (line 2417) Endpoint /10.10.0.24 is down and will not receive data for re-replication of /10.10.0.22
>>>>>> system.log.1: WARN [RMI TCP Connection(1037)-10.10.0.26] 2012-05-06 00:39:30,243 StorageService.java (line 2417) Endpoint /10.10.0.24 is down and will not receive data for re-replication of /10.10.0.22
>>>>>> system.log.1: WARN [RMI TCP Connection(1037)-10.10.0.26] 2012-05-06 00:39:30,243 StorageService.java (line 2417) Endpoint /10.10.0.24 is down and will not receive data for re-replication of /10.10.0.22
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-06 00:44:33,822 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-06 04:25:23,894 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>> system.log.1: INFO [OptionalTasks:1] 2012-05-06 04:25:23,895 HintedHandOffManager.java (line 179) Deleting any stored hints for /10.10.0.24
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-06 04:25:23,895 StorageService.java (line 1157) Removing token 127605887595351923798765477786913079296 for /10.10.0.24
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-09 04:26:25,015 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>>
>>>>>> Maybe it's the way I removed the nodes? As far as I remember, I
>>>>>> didn't use the decommission command. For each node, I brought the
>>>>>> node down and then issued a "remove token" command.
>>>>>> Here is what I can find in the log from when I removed one of them:
>>>>>>
>>>>>> system.log.1: INFO [GossipTasks:1] 2012-05-02 17:21:10,281 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 17:21:21,496 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-02 17:21:59,307 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 17:31:20,336 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 17:41:06,177 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 17:51:18,148 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:00:31,709 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:11:02,521 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:20:38,282 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:31:09,513 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:40:31,565 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 18:51:10,566 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 19:00:32,197 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 19:11:17,018 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [HintedHandoff:1] 2012-05-02 19:21:21,759 HintedHandOffManager.java (line 292) Endpoint /10.10.0.24 died before hint delivery, aborting
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-02 20:05:57,281 Gossiper.java (line 818) InetAddress /10.10.0.24 is now dead.
>>>>>> system.log.1: INFO [OptionalTasks:1] 2012-05-02 20:05:57,281 HintedHandOffManager.java (line 179) Deleting any stored hints for /10.10.0.24
>>>>>> system.log.1: INFO [GossipStage:1] 2012-05-02 20:05:57,281 StorageService.java (line 1157) Removing token 145835300108973619103103718265651724288 for /10.10.0.24
>>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>>> ----- Original message -----
>>>>>>> From: Nicolas Lalevée [nicolas.lale...@hibnet.org]
>>>>>>> Sent: 08/06/2012 19:26 ZE2
>>>>>>> To: user@cassandra.apache.org
>>>>>>> Subject: Re: Dead node still being pinged
>>>>>>>
>>>>>>> On 8 June 2012 at 15:17, Samuel CARRIERE wrote:
>>>>>>>
>>>>>>>> What does nodetool ring say? (Ask every node.)
>>>>>>>
>>>>>>> Currently, each of the new nodes sees only the tokens of the new
>>>>>>> nodes.
>>>>>>>
>>>>>>>> Have you checked that the list of seeds in every yaml is correct?
>>>>>>>
>>>>>>> Yes, it is correct; every one of my new nodes points to the first of
>>>>>>> my new nodes.
>>>>>>>
>>>>>>>> What version of cassandra are you using?
>>>>>>>
>>>>>>> Sorry, I should have written this in my first mail.
>>>>>>> I use 1.0.9.
>>>>>>>
>>>>>>> Nicolas
>>>>>>>
>>>>>>>> Samuel
>>>>>>>>
>>>>>>>> Nicolas Lalevée <nicolas.lale...@hibnet.org>
>>>>>>>> 08/06/2012 14:10
>>>>>>>> Please reply to: user@cassandra.apache.org
>>>>>>>> To: user@cassandra.apache.org
>>>>>>>> cc:
>>>>>>>> Subject: Dead node still being pinged
>>>>>>>>
>>>>>>>> I had a configuration with 4 nodes, data-1 to data-4. We then
>>>>>>>> bought 3 bigger machines, data-5 to data-7.
>>>>>>>> And we moved all the data from data-1 to data-4 over to data-5 to
>>>>>>>> data-7.
>>>>>>>> To move all the data without interruption of service, I added one
>>>>>>>> new node at a time. Then I removed the old machines one by one via
>>>>>>>> a "remove token".
>>>>>>>>
>>>>>>>> Everything was working fine, until there was an unexpected load on
>>>>>>>> our cluster; the machines started to swap and became unresponsive.
>>>>>>>> We fixed the unexpected load and the three new machines were
>>>>>>>> restarted. After that, the new cassandra machines were stating that
>>>>>>>> some old tokens were not assigned, namely from data-2 and data-4.
>>>>>>>> To fix this I issued some "remove token" commands again.
>>>>>>>>
>>>>>>>> Everything seems to be back to normal, but on the network I still
>>>>>>>> see some packets from the new cluster to the old machines, on port
>>>>>>>> 7000.
>>>>>>>> How can I tell cassandra to completely forget about the old
>>>>>>>> machines?
>>>>>>>>
>>>>>>>> Nicolas
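For anyone diagnosing the same problem: the SimpleStates string quoted twice in this thread comes back from JMX as one flat string. A small parsing sketch (the bracketed format is assumed from the output shown in the thread) that lists which endpoints the failure detector still tracks as DOWN:

```python
def parse_simple_states(raw):
    # "SimpleStates:[/ip:STATE, /ip:STATE, ...]" -> {"ip": "STATE", ...}
    body = raw.split(":", 1)[1].strip("[]")
    states = {}
    for entry in body.split(","):
        ip, state = entry.strip().lstrip("/").rsplit(":", 1)
        states[ip] = state
    return states

# Value as reported by the FailureDetector MBean earlier in the thread.
states = parse_simple_states(
    "SimpleStates:[/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP, "
    "/10.10.0.25:UP, /10.10.0.27:UP]"
)
down = sorted(ip for ip, st in states.items() if st == "DOWN")
print("still tracked as DOWN:", down)
# -> still tracked as DOWN: ['10.10.0.22', '10.10.0.24']
```

If an endpoint that was removed with "remove token" still appears here as DOWN, the failure detector has not forgotten it, which is consistent with the gossip behaviour Aaron describes at the top of the thread.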