[ 
https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766192#comment-17766192
 ] 

Cameron Zemek commented on CASSANDRA-18845:
-------------------------------------------

CASSANDRA-18543 had 3 components:
 # Allow for overriding the values used in waitToSettle
 # Make waitToSettle also consider the liveEndpoint members as part of settling.
 # Changes to the handling of ECHO requests to remove duplicate in-flight ECHOs 
and duplicate 'is now UP' log messages for the same node going into the UP state
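
As a rough illustration of the third item only, suppressing duplicate in-flight requests can be done by tracking which endpoints already have an outstanding ECHO. This is a minimal sketch and not the code from the CASSANDRA-18543 patch; the class and method names (EchoDedupSketch, tryBeginEcho, finishEcho) are hypothetical, and endpoints are represented as plain strings for simplicity.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names come from the Cassandra code base.
final class EchoDedupSketch
{
    // Endpoints that currently have an ECHO request awaiting a response.
    private final Set<String> inflightEcho = ConcurrentHashMap.newKeySet();

    /** Returns true if the caller should send an ECHO, false if one is already in flight. */
    boolean tryBeginEcho(String endpoint)
    {
        // Set.add is atomic here, so only one caller wins for a given endpoint.
        return inflightEcho.add(endpoint);
    }

    /** Called when the ECHO response arrives or the request fails, so a later ECHO may be sent. */
    void finishEcho(String endpoint)
    {
        inflightEcho.remove(endpoint);
    }
}
```

A scheme like this has to clear the entry on every failure path as well; otherwise an endpoint whose ECHO was lost would never be retried.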

 

With the revert in CASSANDRA-18854, did the changes to waitToSettle need to 
be reverted? The problem seems to be the changes to ECHO.

 

> The next step for this ticket to move forward will be to create tests that 
> demonstrate the problem and guard against regressions.

This is going to be very difficult to do. dtests set up clusters on loopback 
addresses, and the waitToSettle code path has a guard against running when a 
loopback address is used. Also, the problems mostly become apparent with large 
clusters.
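
To make the loopback point concrete, a guard of this kind might look roughly like the following. This is a sketch under assumptions, not the actual Cassandra code: LoopbackGuardSketch and shouldWaitForGossip are hypothetical names, and only java.net.InetAddress.isLoopbackAddress() is a real API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of a loopback guard in front of a gossip-settle wait.
final class LoopbackGuardSketch
{
    /** Returns true if a node with this broadcast address would wait for gossip to settle. */
    static boolean shouldWaitForGossip(InetAddress broadcastAddress)
    {
        // Any 127.x.x.x address (and ::1) counts as loopback, so the wait is skipped.
        return !broadcastAddress.isLoopbackAddress();
    }

    /** Convenience overload for literal IP strings; no DNS lookup happens for literals. */
    static boolean shouldWaitForGossip(String literalIp)
    {
        try
        {
            return shouldWaitForGossip(InetAddress.getByName(literalIp));
        }
        catch (UnknownHostException e)
        {
            throw new IllegalArgumentException("Not a valid IP literal: " + literalIp, e);
        }
    }
}
```

With a guard like this, a test cluster on 127.0.0.1, 127.0.0.2, and so on never enters the settle loop, which is why such tests cannot exercise it.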

If I redo the patch without the changes to ECHO and show that those tests do 
not regress, would this allow the ticket to move forward?

I am also in the process of setting up a large test cluster.

[^example.log] shows an example of what happens without the patched 
waitToSettle. Gossip settles before nodes have finished being marked as UP.
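
The difference can be sketched with a small replay of poll results. This is a simplified model, not the real waitToSettle loop: it assumes the unpatched check compares only the endpoint count between polls, it assumes three consecutive stable polls are required, and the names GossipSettleSketch, settledAtPoll and REQUIRED_OKAY are hypothetical. The patched condition is the one quoted in the ticket description.

```java
// Hypothetical model of a gossip-settle check, replaying recorded poll results.
final class GossipSettleSketch
{
    /** Consecutive stable polls needed before gossip counts as settled (assumed value). */
    static final int REQUIRED_OKAY = 3;

    /**
     * Each poll is {endpointCount, liveEndpointCount}. Returns the index of the poll at
     * which gossip is first considered settled, or -1 if it never settles.
     */
    static int settledAtPoll(int[][] polls, boolean requireLivePeer)
    {
        int epSize = -1;
        int liveSize = -1;
        int numOkay = 0;
        for (int i = 0; i < polls.length; i++)
        {
            int currentSize = polls[i][0];
            int currentLive = polls[i][1];

            // Assumed unpatched behaviour: only the endpoint count has to be stable.
            boolean stable = currentSize == epSize;
            // Proposed behaviour: the live count must be stable too, and include a peer.
            if (requireLivePeer)
                stable = stable && currentLive == liveSize && liveSize > 1;

            numOkay = stable ? numOkay + 1 : 0;
            epSize = currentSize;
            liveSize = currentLive;

            if (numOkay == REQUIRED_OKAY)
                return i;
        }
        return -1;
    }

    public static void main(String[] args)
    {
        // All six endpoints are known immediately, but peers are marked UP only later.
        int[][] polls = { {6, 1}, {6, 1}, {6, 1}, {6, 1}, {6, 3}, {6, 6}, {6, 6}, {6, 6}, {6, 6} };
        System.out.println(settledAtPoll(polls, false));
        System.out.println(settledAtPoll(polls, true));
    }
}
```

In this made-up trace the endpoint-only check settles at poll 3, while only the local node is live, whereas the check that also requires a stable live count greater than one settles at poll 8.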

> Waiting for gossip to settle on live endpoints
> ----------------------------------------------
>
>                 Key: CASSANDRA-18845
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18845
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Cameron Zemek
>            Priority: Normal
>         Attachments: 18845-3.11.patch, 18845-4.0.patch, 18845-4.1.patch, 
> 18845-5.0.patch, example.log, image-2023-09-14-11-16-23-020.png
>
>
> This is a follow-up to CASSANDRA-18543.
> Although that ticket added the ability to set 
> cassandra.gossip_settle_min_wait_ms, this is tedious and error-prone. On a 
> node I just observed a 79-second gap between waiting for gossip and the first 
> echo response indicating that a node is UP.
> The problem is that we do not want to start Native Transport until gossip 
> settles; otherwise queries can fail consistency levels such as LOCAL_QUORUM 
> because the node thinks the replicas are still in the DOWN state.
> Instead of having to set gossip_settle_min_wait_ms, I am proposing that 
> (outside a single-node cluster) we wait for an UP message from another node 
> before considering gossip settled. E.g.:
> {code:java}
>             if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
>             {
>                 logger.debug("Gossip looks settled.");
>                 numOkay++;
>             } {code}



