[ 
https://issues.apache.org/jira/browse/CASSANDRA-19983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890875#comment-17890875
 ] 

Michael Semb Wever commented on CASSANDRA-19983:
------------------------------------------------

I don't think large clusters are a hard requirement for this to manifest; 
any network or partitioning issue combined with the use of fat clients can 
trigger it.
Suggest re-titling to '{{Gossip issue with coordinator-only nodes (fat clients) 
leads to missing DC/Rack/Host ID endpoint state}}' (or something similar…)

> Cassandra gossip issue for large cluster
> ----------------------------------------
>
>                 Key: CASSANDRA-19983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19983
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Runtian Liu
>            Assignee: Runtian Liu
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x
>
>
> When adding a new node to a cluster, we see many nodes reporting the error 
> below:
> {code:java}
> java.lang.NullPointerException: null at 
> o.a.cassandra.gms.Gossiper.getHostId(Gossiper.java:1378) at 
> o.a.cassandra.gms.Gossiper.getHostId(Gossiper.java:1373) at 
> o.a.c.service.StorageService.handleStateBootstrap(StorageService.java:3088) 
> at o.a.c.service.StorageService.onChange(StorageService.java:2783) at 
> o.a.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1851) at 
> o.a.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1816) at 
> o.a.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1749) at 
> o.a.c.g.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:81) 
> at o.a.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:79) at 
> o.a.cassandra.net.InboundSink.accept(InboundSink.java:98) at 
> o.a.cassandra.net.InboundSink.accept(InboundSink.java:46) at 
> o.a.c.n.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
>  at o.a.c.c.ExecutionFailure$1.run(ExecutionFailure.java:133) at 
> j.u.c.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at 
> j.u.c.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at 
> i.n.u.c.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at 
> java.lang.Thread.run(Thread.java:829){code}
> After some investigation, we found that the existing nodes of the cluster had 
> removed the new node as a fat client. The reason is that the new node was 
> busy with gossip and had a large backlog of tasks in its gossip queue. The 
> gossip state for the new node on an existing host is:
> {code:java}
> /1.1.1.1
>   generation:1727479926
>   heartbeat:25
>   LOAD:20:31174.0
>   SCHEMA:16:59adb24e-f3cd-3e02-97f0-5b395827453f
>   DC:12:dc1
>   RACK:14:0
>   RELEASE_VERSION:5:4.1.3
>   NET_VERSION:1:12
>   HOST_ID:2:b9cc4587-68f5-4bb6-a933-fd0c77a064dc
>   INTERNAL_ADDRESS_AND_PORT:8:1.1.1.1:7000
>   NATIVE_ADDRESS_AND_PORT:3:1.1.1.1:9042
>   SSTABLE_VERSIONS:6:big-nb
>   TOKENS: not present {code}
> Later, this endpoint is removed from the gossip endpoint-state map because it 
> is treated as a fat client:
> {code:java}
> FatClient /1.1.1.1:7000 has been silent for 30000ms, removing from gossip 
> {code}
> But before the endpoint is removed from gossip, the existing node may have 
> sent a gossip SYN message to the new node, asking for the new node's gossip 
> info with heartbeat version greater than 20 in this example.
>  
> The new node's gossip queue has too many tasks, so it cannot process this 
> request immediately. By the time it sends the gossip ACK back, the existing 
> node has already removed the gossip info about the new node. So on some 
> existing nodes the gossip state will look like this:
> {code:java}
> /1.1.1.1 
>   generation:1727479926 
>   heartbeat:229 
>   LOAD:200:3.0 
>   SCHEMA:203:59adb24e-f3cd-3e02-97f0-5b395827453f {code}
> All the information related to DC/Rack/Host ID is gone.
> Later, when the new node's gossip settles, it sets its local state to BOOT 
> and decides its tokens. The existing node then receives the STATUS and 
> TOKENS info, and the gossip state becomes:
> {code:java}
> /1.1.1.1 
>   generation:1727479926 
>   heartbeat:329 
>   LOAD:300:3.0 
>   SCHEMA:303:59adb24e-f3cd-3e02-97f0-5b395827453f
>   STATUS_WITH_PORT:308:BOOT,-142070360466566106
>   TOKENS:309:<hidden>{code}
> When the existing node processes this bootstrap event, we see the NPE above 
> because HOST_ID is missing.
> This issue can also create consistency problems: if the DC info is missing, 
> many nodes will treat the joining node as a remote-DC node.
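The state loss quoted above can be sketched as a toy delta-version filter. This is a simplified illustration, not Cassandra's actual code: the class and method names (`GossipDeltaSketch`, `deltaAbove`) are invented, and the versions mirror the example state in the description (HOST_ID:2, DC:12, RACK:14, LOAD:200, SCHEMA:203). The idea is that a gossip reply only carries application states newer than the version the requester already reported, so once the requester has purged the endpoint, the low-version DC/RACK/HOST_ID states are never resent:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of gossip delta exchange (names are illustrative,
// not Cassandra's actual API). A reply carries only application states
// whose version exceeds the version the requester claimed to have seen.
public class GossipDeltaSketch {
    // Full state held by the (busy) new node: state name -> version,
    // mirroring the example in the issue description.
    static final Map<String, Integer> NEW_NODE_STATE = Map.of(
            "HOST_ID", 2,
            "DC", 12,
            "RACK", 14,
            "LOAD", 200,
            "SCHEMA", 203);

    /** Return only the states newer than the version the peer reported. */
    static Map<String, Integer> deltaAbove(int requestedMaxVersion) {
        Map<String, Integer> delta = new TreeMap<>();
        NEW_NODE_STATE.forEach((name, version) -> {
            if (version > requestedMaxVersion)
                delta.put(name, version);
        });
        return delta;
    }

    public static void main(String[] args) {
        // The existing node had already seen up to version 20 before it
        // purged the endpoint as a fat client, so it asks for > 20 ...
        Map<String, Integer> rebuilt = deltaAbove(20);
        // ... and rebuilds the endpoint state from the delta alone:
        System.out.println(rebuilt);  // {LOAD=200, SCHEMA=203}
        // HOST_ID (v2), DC (v12) and RACK (v14) never come back.
    }
}
```

Under this assumption, any state whose version is below the already-seen heartbeat version is permanently lost once the local copy is purged, which matches the truncated endpoint state shown above.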


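On the fix side, one defensive direction is to treat a missing HOST_ID as incomplete gossip state and defer, instead of dereferencing null and throwing the NPE from the stack trace above. The following is a hypothetical sketch, not the actual `StorageService.handleStateBootstrap` code; the class, method names, and the string-keyed state map are all invented for illustration:

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical defensive sketch (not Cassandra's actual code): the bootstrap
// handler could check for HOST_ID before acting on a BOOT state change.
public class HostIdGuardSketch {
    /** Look up HOST_ID in an endpoint's application-state map, if present. */
    static Optional<UUID> hostId(Map<String, String> endpointState) {
        return Optional.ofNullable(endpointState.get("HOST_ID"))
                       .map(UUID::fromString);
    }

    static String handleBootstrap(Map<String, String> endpointState) {
        return hostId(endpointState)
                .map(id -> "bootstrap handled for host " + id)
                // Incomplete gossip state: defer instead of NPE-ing.
                .orElse("HOST_ID missing; deferring bootstrap handling");
    }

    public static void main(String[] args) {
        // State rebuilt from deltas only, as in the issue: no HOST_ID.
        Map<String, String> incomplete = Map.of("LOAD", "3.0");
        System.out.println(handleBootstrap(incomplete));
        // prints: HOST_ID missing; deferring bootstrap handling
    }
}
```

A guard like this would only mask the symptom, though; the underlying fix still needs to address the endpoint state being rebuilt without DC/Rack/Host ID in the first place.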

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
