[ 
https://issues.apache.org/jira/browse/CASSANDRA-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868685#comment-13868685
 ] 

Vijay edited comment on CASSANDRA-6571 at 1/11/14 6:46 AM:
-----------------------------------------------------------

Not sure if this will fix it, because the remote machine has not responded 
(echo message response). 
1) I think we need to always mark the nodes as dead and mark them up only after 
we receive the echo response.
2) I think we need to check or reset the socket on the receiving side; we may 
need to markDead (or retry the message after x seconds?).

Maybe this issue shows up because we removed the hibernate during restarts? 
(we are not resetting the states) <== [~brandon.williams]
I think the hang is on the echo response (socket.write()).
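
To make points 1) and 2) concrete, here is a minimal Java sketch of the idea, not a patch. It assumes one reading of the proposal: a peer starts out dead, is marked up only when an echo response arrives, and the echo is re-sent if no response has arrived after a retry interval. The class EchoGatedLiveness, its methods, and the retryAfterMillis parameter are all hypothetical names for illustration; none of them exist in Cassandra.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of proposals 1) and 2); not Cassandra's Gossiper. */
class EchoGatedLiveness {
    /** Per-peer bookkeeping (hypothetical stand-in for endpoint state). */
    static class Peer {
        boolean alive = false;          // proposal 1: dead until an echo response arrives
        long lastEchoSentMillis = -1L;  // -1 means no echo has been sent yet
    }

    private final Map<String, Peer> peers = new HashMap<>();
    private final long retryAfterMillis; // the "x seconds" from proposal 2 (assumed value)

    EchoGatedLiveness(long retryAfterMillis) {
        this.retryAfterMillis = retryAfterMillis;
    }

    /** Called on each gossip round; returns true if an echo should be (re)sent now. */
    boolean onGossip(String peer, long nowMillis) {
        Peer p = peers.computeIfAbsent(peer, k -> new Peer());
        if (p.alive) {
            return false; // already confirmed by an echo response
        }
        boolean due = p.lastEchoSentMillis < 0
                || nowMillis - p.lastEchoSentMillis >= retryAfterMillis;
        if (due) {
            p.lastEchoSentMillis = nowMillis; // record the attempt so the retry timer restarts
        }
        return due;
    }

    /** Called when the echo response is received; only this marks the peer up. */
    void onEchoResponse(String peer) {
        Peer p = peers.get(peer);
        if (p != null) {
            p.alive = true;
        }
    }

    boolean isAlive(String peer) {
        Peer p = peers.get(peer);
        return p != null && p.alive;
    }

    public static void main(String[] args) {
        EchoGatedLiveness x = new EchoGatedLiveness(5000);
        System.out.println("send echo at t=0: " + x.onGossip("Y", 0));
        System.out.println("send echo at t=1000: " + x.onGossip("Y", 1000));
        System.out.println("send echo at t=5000: " + x.onGossip("Y", 5000));
        x.onEchoResponse("Y");
        System.out.println("Y alive: " + x.isAlive("Y"));
    }
}
```

The time is passed in explicitly so the retry logic can be exercised without sleeping. Whether a lost echo should be retried by the sender or handled by resetting the socket on the receiving side is exactly the open question above; this sketch only covers the sender-side retry.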


was (Author: vijay2...@yahoo.com):
Not sure if this will fix it, because the remote machine has not responded 
(echo message response). 
1) I think we need to always mark the nodes as dead and mark them up only after 
we receive the echo response.
2) I think we need to check or reset the socket on the receiving side; we may 
need to markDead (or retry the message after x seconds?).

Maybe this issue shows up because we removed the hibernate during restarts? 
(we are not restarting the states) <== [~brandon.williams]
I think the hang is on the echo response (socket.write()).

> Quickly restarted nodes can list others as down indefinitely
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-6571
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6571
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Richard Low
>            Assignee: sankalp kohli
>              Labels: gossip
>             Fix For: 2.0.5
>
>
> In a healthy cluster, if a node is restarted quickly, it may list other nodes 
> as down when it comes back up and never list them as up.  I reproduced it on 
> a small cluster running in Docker containers.
> 1. Have a healthy 5 node cluster:
> {quote}
> $ nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
> UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
> UN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
> UN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
> UN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
> {quote}
> 2. Kill a node and restart it quickly:
> bq. kill -9 <pid> && start-cassandra
> 3. Wait for the node to come back. More often than not, it lists one or 
> more other nodes as down indefinitely:
> {quote}
> $ nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
> UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
> DN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
> DN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
> DN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
> {quote}
> From trace logging, here's what I think is going on:
> 1. The nodes are all happy gossiping
> 2. Restart node X. When it comes back up it starts gossiping with the other 
> nodes.
> 3. Before node X marks node Y as alive, X sends an echo message (introduced 
> in CASSANDRA-3533)
> 4. The echo message is received by Y. To reply, Y attempts to reuse a 
> connection to X. The connection is dead, but Y tries to send the reply on it 
> anyway, and the send fails.
> 5. X never receives the echo back, so Y isn't marked as alive.
> 6. X gossips to Y again, but because isAlive() already returns true for the 
> endpoint, X never calls markAlive() to properly set Y as alive.
> I tried to fix this by defaulting isAlive=false in the constructor of 
> EndpointState. This made it less likely to mark a node as down but it still 
> happens.
> The workaround is to leave a node down for a while so the connections die on 
> the remaining nodes.
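
The sequence in steps 3 to 6 of the description can be reproduced with a small toy model. This is a sketch of the reporter's hypothesis only, not Cassandra's Gossiper code: the class StuckDownSketch, its fields, and the echoReplyArrives flag are hypothetical names, and the model assumes that the internal liveness flag defaults to true while the "up" status is set only when an echo reply arrives.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of steps 3-6 in the description; all names here are hypothetical. */
class StuckDownSketch {
    /** Stand-in for per-endpoint state as the description characterizes it. */
    static class PeerState {
        boolean alive = true;      // what an isAlive()-style check reports; defaults to true
        boolean shownAsUp = false; // whether the peer would be listed as UN rather than DN
    }

    private final Map<String, PeerState> peers = new HashMap<>();

    /**
     * One gossip exchange with a peer. echoReplyArrives models whether the
     * peer's echo reply actually reaches us (false = sent on a dead connection).
     */
    void onGossip(String peer, boolean echoReplyArrives) {
        PeerState s = peers.get(peer);
        if (s == null) {
            // Steps 3-5: first contact after restart, send an echo before marking up.
            s = new PeerState();
            peers.put(peer, s);
            markAlive(s, echoReplyArrives);
        } else if (!s.alive) {
            // Step 6: the echo is only retried when the peer is considered dead,
            // which never happens here because alive defaults to true.
            markAlive(s, echoReplyArrives);
        }
    }

    /** Sends the echo; the peer is listed as up only if the reply comes back. */
    private void markAlive(PeerState s, boolean echoReplyArrives) {
        if (echoReplyArrives) {
            s.alive = true;
            s.shownAsUp = true;
        }
    }

    boolean isShownAsUp(String peer) {
        PeerState s = peers.get(peer);
        return s != null && s.shownAsUp;
    }

    public static void main(String[] args) {
        StuckDownSketch x = new StuckDownSketch();
        x.onGossip("Y", false); // echo reply lost on the dead connection
        x.onGossip("Y", true);  // connection is healthy now, but no echo is sent
        System.out.println("Y shown as up: " + x.isShownAsUp("Y"));
    }
}
```

In this model a single lost echo reply leaves Y listed as down for good, because later gossip rounds see the liveness flag as true and skip the echo. The model deliberately does not cover the attempted fix of defaulting the flag to false, since the description reports that the problem still occurs with that change.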



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
