[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209120#comment-17209120
 ] 

Paulo Motta edited comment on CASSANDRA-16182 at 10/6/20, 8:10 PM:
-------------------------------------------------------------------

I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about C via gossip. Likewise, if node C learns about C' 
via gossip, it should probably halt execution to prevent potential consistency 
violations.
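
A very rough sketch of what such a guard could look like on the replacement 
node; the ReplacementGuard class and its onAlive/finishBootstrap hooks are 
hypothetical and not the actual Gossiper listener API:

{code:java}
import java.net.InetAddress;

// Illustrative only: the class and hooks below are hypothetical, not existing Cassandra code.
public class ReplacementGuard
{
    private final InetAddress replacedEndpoint;   // the node C being replaced
    private volatile boolean bootstrapping = true;

    public ReplacementGuard(InetAddress replacedEndpoint)
    {
        this.replacedEndpoint = replacedEndpoint;
    }

    // Called whenever gossip reports an endpoint as alive.
    public void onAlive(InetAddress endpoint)
    {
        if (bootstrapping && endpoint.equals(replacedEndpoint))
        {
            // The node we are replacing is still gossiping as alive: abort the
            // replacement rather than risk two owners for the same tokens.
            throw new RuntimeException("Cannot replace " + endpoint + ": node is still alive according to gossip");
        }
    }

    public void finishBootstrap()
    {
        bootstrapping = false;
    }
}
{code}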


was (Author: pauloricardomg):
I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about the C via gossip. Likewise if node C learns about 
C' via gossip it should probably halt execution to prevent potential 
consistency violations.

> A replacement node, although completed bootstrap and joined ring according to 
> itself, stuck in Joining state as per the peers
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16182
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Sumanth Pasupuleti
>            Assignee: Sumanth Pasupuleti
>            Priority: Normal
>             Fix For: 3.0.x
>
>
> This issue occurred in a production 3.0.21 cluster.
> Here is what happened:
> # We had, say, a three-node Cassandra cluster with nodes A, B and C
> # C got "terminated by cloud provider" due to health check failure and a 
> replacement node C' got launched.
> # C' started bootstrapping data from its neighbors
> # Network flaw: nodes A and B were still able to communicate with the terminated 
> node C and consequently still considered C to be alive.
> # The replacement node C' learnt about C through gossip but was unable to 
> communicate with C and marked C as DOWN.
> # C' completed bootstrapping successfully, and both it and its peers logged 
> the statement "Node C' will complete replacement of C for tokens 
> [-7686143363672898397]"
> # C' logged the statement "Nodes C' and C have the same token 
> -7686143363672898397. C' is the new owner"
> # C' started listening for thrift and cql clients
> # Peer nodes A and B logged "Node C' cannot complete replacement of alive 
> node C "
> # A few seconds later, A and B marked C as DOWN
> C' then continued to log the following lines endlessly:
> {code:java}
> Node C is now part of the cluster
> Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs 
> a log statement fix)
> FatClient C has been silent for 30000ms, removing from gossip
> {code}
> My reasoning about what happened: 
> By the time the replacement node (C') finished bootstrapping and announced its 
> state as Normal, A and B were still able to communicate with the replaced 
> node C (while C' was not able to reach C), and hence rejected C' replacing C. 
> C' does not know this and does not attempt to re-announce its "Normal" 
> state to the rest of the cluster. (Worth noting that A and B marked C as down 
> soon after.)
> Gossip keeps telling C' to add C to its metadata, and C' keeps kicking C out 
> again based on the FailureDetector. 
> Proposed fix:
> When C' is notified through gossip about C, given that both own the same token 
> and that C' has finished bootstrapping, C' can emit its Normal state again, 
> which should fix this in my opinion (so long as A and B have marked C as 
> DOWN, which they eventually did).
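> A rough sketch of what re-emitting the Normal state could look like, assuming 
> the existing Gossiper/VersionedValueFactory plumbing that setGossipTokens() 
> already relies on (the wrapper class and method below are illustrative, not 
> existing code):
> {code:java}
> import java.util.Collection;
>
> import org.apache.cassandra.dht.Token;
> import org.apache.cassandra.gms.ApplicationState;
> import org.apache.cassandra.gms.Gossiper;
> import org.apache.cassandra.gms.VersionedValue;
>
> public class ReannounceNormal
> {
>     // Re-emit the local TOKENS and STATUS=NORMAL application states,
>     // the same thing StorageService.setGossipTokens() does on startup.
>     public static void reannounce(Collection<Token> tokens,
>                                   VersionedValue.VersionedValueFactory valueFactory)
>     {
>         Gossiper.instance.addLocalApplicationState(ApplicationState.TOKENS, valueFactory.tokens(tokens));
>         Gossiper.instance.addLocalApplicationState(ApplicationState.STATUS, valueFactory.normal(tokens));
>     }
> }
> {code}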
> I ended up manually fixing this by restarting Cassandra on C', which forced 
> it to announce its "Normal" state via
> StorageService.initServer --> joinTokenRing() --> finishJoiningRing() --> 
> setTokens() --> setGossipTokens()
> Alternatively, I could probably have achieved the same behavior by disabling 
> and re-enabling gossip via JMX/nodetool, as sketched below.
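> One way to do that programmatically over JMX (assuming the default JMX port 
> 7199 and the standard StorageService MBean; nodetool disablegossip/enablegossip 
> invoke the same operations):
> {code:java}
> import javax.management.JMX;
> import javax.management.MBeanServerConnection;
> import javax.management.ObjectName;
> import javax.management.remote.JMXConnector;
> import javax.management.remote.JMXConnectorFactory;
> import javax.management.remote.JMXServiceURL;
>
> import org.apache.cassandra.service.StorageServiceMBean;
>
> public class BounceGossip
> {
>     public static void main(String[] args) throws Exception
>     {
>         // Assumes JMX is reachable on localhost:7199 (the default).
>         JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
>         try (JMXConnector connector = JMXConnectorFactory.connect(url))
>         {
>             MBeanServerConnection mbs = connector.getMBeanServerConnection();
>             ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
>             StorageServiceMBean ss = JMX.newMBeanProxy(mbs, name, StorageServiceMBean.class);
>
>             ss.stopGossiping();  // equivalent of `nodetool disablegossip`
>             ss.startGossiping(); // equivalent of `nodetool enablegossip`
>         }
>     }
> }
> {code}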


