[ 
https://issues.apache.org/jira/browse/CASSANDRA-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18543:
--------------------------------------
    Description: 
When a node starts, it gets endpoint states (via the shadow round) but has all 
nodes marked as down. The problem is that the wait for gossip to settle only 
checks that the size of endpoint states is stable before starting native 
transport. Once native transport starts, the node receives queries and fails 
consistency levels such as LOCAL_QUORUM because it still thinks nodes are down.

This is a problem for a number of large clusters for our customers. The cluster 
has quorum, but due to this issue a node restart causes a bunch of query 
errors.

My initial solution was to check the live endpoints size in addition to the 
size of endpoint states. This worked, but while testing the fix I noticed that 
there is also a lot of duplicated checking of the same node (via Echo messages) 
for liveness. So the patch also removes this duplicate check that a node is UP 
in markAlive.

The final problem I found while testing is that sometimes I could still not see 
a change in live endpoints due to polling only once per second, so the patch 
allows overriding the settle parameters. I could not reliably reproduce this, 
but I think it's worth providing a way to override these hardcoded values.

  was:
When a node starts, it gets endpoint states (via the shadow round) but has all 
nodes marked as down. The problem is that the wait for gossip to settle only 
checks that the size of endpoint states is stable before starting native 
transport. Once native transport starts, the node receives queries and fails 
consistency levels such as LOCAL_QUORUM because it still thinks nodes are down.

This is a problem for a number of large clusters for our customers. The cluster 
has quorum, but due to this a node restart causes a bunch of query errors.

My initial solution was to check the live endpoints size in addition to the 
size of endpoint states. This worked, but while testing the fix I noticed that 
there is also a lot of duplicated checking of the same node (via Echo messages) 
for liveness. So the patch also removes this duplicate check that a node is UP 
in markAlive.

The final problem I found while testing is that sometimes I could still not see 
a change in live endpoints due to polling only once per second, so the patch 
allows overriding the settle parameters. I could not reliably reproduce this, 
but I think it's worth providing a way to override these hardcoded values.


> Waiting for gossip to settle does not wait for live endpoints
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-18543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18543
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Cameron Zemek
>            Priority: Normal
>         Attachments: gossip.patch
>
>
> When a node starts, it gets endpoint states (via the shadow round) but has 
> all nodes marked as down. The problem is that the wait for gossip to settle 
> only checks that the size of endpoint states is stable before starting native 
> transport. Once native transport starts, the node receives queries and fails 
> consistency levels such as LOCAL_QUORUM because it still thinks nodes are 
> down.
>
> This is a problem for a number of large clusters for our customers. The 
> cluster has quorum, but due to this issue a node restart causes a bunch of 
> query errors.
>
> My initial solution was to check the live endpoints size in addition to the 
> size of endpoint states. This worked, but while testing the fix I noticed 
> that there is also a lot of duplicated checking of the same node (via Echo 
> messages) for liveness. So the patch also removes this duplicate check that a 
> node is UP in markAlive.
>
> The final problem I found while testing is that sometimes I could still not 
> see a change in live endpoints due to polling only once per second, so the 
> patch allows overriding the settle parameters. I could not reliably reproduce 
> this, but I think it's worth providing a way to override these hardcoded 
> values.
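
For illustration only, the settle check described above might look roughly like 
the sketch below. This is not the attached gossip.patch and not Cassandra's 
actual code: the class name, method names, parameters, and the system property 
name are all hypothetical. The only values taken from the description are the 
two counts being compared and the 1 second default polling interval. The loop 
treats gossip as settled once both the endpoint state count and the live 
endpoint count stay unchanged for a required number of consecutive polls.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch; names and structure are assumptions, not Cassandra's API.
final class GossipSettleSketch {

    // Hypothetical property name, standing in for an overridable settle parameter.
    // The 1000 ms default mirrors the "1 second polling" mentioned in the description.
    static long pollIntervalMs() {
        return Long.getLong("sketch.gossip_settle_poll_interval_ms", 1000L);
    }

    /**
     * Polls both counts until neither has changed for successesRequired
     * consecutive polls. Returns the number of polls performed, or -1 if
     * maxPolls was reached (or the thread was interrupted) before settling.
     */
    static int waitToSettle(IntSupplier endpointStateCount,
                            IntSupplier liveEndpointCount,
                            long pollIntervalMs,
                            int successesRequired,
                            int maxPolls) {
        int lastStates = endpointStateCount.getAsInt();
        int lastLive = liveEndpointCount.getAsInt();
        int stablePolls = 0;

        for (int poll = 1; poll <= maxPolls; poll++) {
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }

            int states = endpointStateCount.getAsInt();
            int live = liveEndpointCount.getAsInt();

            // Checking only the endpoint state count would miss nodes that are
            // known but still marked down; both counts must be stable.
            if (states == lastStates && live == lastLive) {
                stablePolls++;
                if (stablePolls >= successesRequired) {
                    return poll;
                }
            } else {
                stablePolls = 0;
            }

            lastStates = states;
            lastLive = live;
        }
        return -1;
    }
}
```

In this sketch a caller would pass suppliers backed by the gossiper's endpoint 
state map and live endpoint set, and start native transport only after the 
method returns a non-negative value. Passing the interval and success count as 
arguments stands in for making the hardcoded settle values overridable.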


