[ 
https://issues.apache.org/jira/browse/CASSANDRA-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376945#comment-16376945
 ] 

Jason Brown commented on CASSANDRA-13993:
-----------------------------------------

[~jolynch] I understand what you are saying. I think the difference between a 
"percent UP" and a "count of DOWN nodes" isn't that much, so either one is 
probably fine. Further, I intentionally wanted this feature to "just work out 
of the box", without requiring extra configuration (for local vs each dc, and 
so on).

bq. relying on the timeout for large clusters (although it would be awesome if 
this timeout re-used or defaulted to an existing timeout relevant to gossip 
convergence such as BROADCAST_INTERVAL or RING_DELAY).

I'm reluctant to tie this new behavior to one of those values because the use 
cases are different: if you change the value for one purpose, you also alter 
the other.

bq. especially with vnode=256 clusters where any 2 nodes down in different 
racks essentially guarantees an unavailable error for some intersecting token 
range.

This is a fair point, and I'd be open to bumping up the default threshold. 
However, remember that this behavior already exists in Cassandra (it's what you 
buy in to when using vnodes); this patch helps to alleviate the 
unavailables/timeouts, not to eliminate or accentuate them.



> Add optional startup delay to wait until peers are ready
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13993
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Lifecycle
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.x
>
>
> When bouncing a node in a large cluster, it can take a while to recognize the 
> rest of the cluster as available. This is especially true if using TLS on 
> internode messaging connections. The bouncing node (and any clients connected 
> to it) may see a series of Unavailable or Timeout exceptions until the node 
> is 'warmed up' as connecting to the rest of the cluster is asynchronous from 
> the rest of the startup process.
> There are two aspects that drive a node's ability to successfully communicate 
> with a peer after a bounce:
> - marking the peer as 'alive' (state that is held in gossip). This affects 
> the unavailable exceptions
> - having both outbound and inbound connections open and ready to each 
> peer. This affects timeouts.
> Details of each of these mechanisms are described in the comments below.
> This ticket proposes adding a mechanism, optional and configurable, to delay 
> opening the client native protocol port until some percentage of the peers in 
> the cluster is marked alive and connected to/from. Thus while we potentially 
> slow down startup (delay opening the client port), we reduce the chance 
> that queries made by clients hit transient unavailable/timeout 
> exceptions.
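
The mechanism proposed in the description can be sketched roughly as below. This is a minimal illustration only, not the code in the patch: the names StartupGate, PeerState, fractionReady and awaitPeers, as well as the threshold, timeout and poll-interval parameters, are all hypothetical. It assumes a peer counts as ready only when it is both marked alive in gossip and connected in both directions, and that startup proceeds anyway once the timeout expires.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of a startup gate; not Cassandra's actual implementation. */
final class StartupGate {

    /** Snapshot of one peer, modelling the two conditions named in the ticket. */
    static final class PeerState {
        final boolean alive;      // peer is marked alive in gossip
        final boolean connected;  // inbound and outbound connections are open

        PeerState(boolean alive, boolean connected) {
            this.alive = alive;
            this.connected = connected;
        }

        boolean ready() {
            return alive && connected;
        }
    }

    /** Fraction of peers that are both alive and connected; 1.0 when there are no peers. */
    static double fractionReady(Collection<PeerState> peers) {
        if (peers.isEmpty()) {
            return 1.0; // single-node cluster: nothing to wait for
        }
        long ready = peers.stream().filter(PeerState::ready).count();
        return (double) ready / peers.size();
    }

    /**
     * Polls the peer snapshot until the ready fraction reaches the threshold
     * or the timeout expires. Returns true if the threshold was reached.
     */
    static boolean awaitPeers(Supplier<Collection<PeerState>> snapshot,
                              double threshold, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (fractionReady(snapshot.get()) >= threshold) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up waiting; the caller opens the port anyway
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Collection<PeerState> peers = Arrays.asList(
                new PeerState(true, true),
                new PeerState(true, false),   // alive in gossip, but not yet connected
                new PeerState(false, false));
        boolean reached = awaitPeers(() -> peers, 0.5, 200, 20);
        System.out.println("threshold reached before timeout: " + reached);
        // Only after this point would the client native protocol port be opened.
    }
}
```

Bounding the wait with a timeout keeps a node from hanging at startup when many peers are genuinely down, which is why the sketch returns a boolean instead of blocking indefinitely.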



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
