[ 
https://issues.apache.org/jira/browse/CASSANDRA-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377336#comment-16377336
 ] 

Joseph Lynch commented on CASSANDRA-13993:
------------------------------------------

{quote}Further, I intentionally wanted this feature to "just work out of the 
box", without requiring extra configuration (for local vs each dc, and so on).
{quote}
[~jasobrown], I completely agree, and I believe "percent UP" and "count of 
DOWN" differ from a usability perspective. In particular, "percent UP" is 
harder (or impossible) for users of the database to set properly (so that it 
does what they want) or consistently (they either leave it at the default or, 
if they change it, use one setting everywhere), and the best default I can 
think of is 100%. Compare this to a "count DOWN", which is more likely to be a 
constant 1 or 2. Consider a user who has two multi-region clusters, one with 
12 nodes and one with 120 nodes. Seventy percent is an ok default for the 
first cluster but a very bad one for the second, and in either case you still 
have no guarantee that you will not see latency or errors, even if you put the 
timeout at 2 days. Reflecting on it, I think 
{{(percent_up, timeout) = (100%, 10-30s)}} would be the only default that gives 
users what they expect (restarting their database does not lead to errors). 
That aggressive setting would, however, have clients doing local CLs waiting 
on all remote replicas, which, other than preventing hint replay, is a bit 
wasteful. On the other hand, in both clusters a {{block_for_peers_local_dc=1}} 
default setting is quite reasonable. The way my patch implemented the three 
options, it works out of the box for all deployments (vnodes, no vnodes, 
large clusters, small clusters, etc.), whereas percent up only works well if 
the user _changes_ the default percentage to 100% or is not using vnodes.
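
To make the 12-node versus 120-node comparison concrete, here is a minimal 
sketch of the arithmetic. The function name is hypothetical and this is not 
Cassandra code; it assumes the percentage is applied to the total node count 
and rounded up.

```python
import math


def max_down_with_percent_up(cluster_size: int, percent_up: float) -> int:
    """Nodes that may still be down once a percent-UP threshold is met.

    Hypothetical helper for illustration only: assumes the threshold is
    applied to the whole cluster and rounded up to a whole node.
    """
    required_up = math.ceil(cluster_size * percent_up / 100)
    return cluster_size - required_up


if __name__ == "__main__":
    for size in (12, 120):
        down = max_down_with_percent_up(size, 70)
        print(f"{size} nodes at 70% UP: up to {down} nodes may still be down")
```

Under these assumptions the same 70% threshold tolerates 3 down nodes in the 
small cluster and 36 in the large one, whereas a "count DOWN" of 1 or 2 means 
the same thing at either size.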
{quote}I'm reticent to tie this new behavior to one of those values as the use 
cases are different; meaning, if you change the value for one semantic meaning, 
you alter the other.
{quote}
Ok, that makes sense.
{quote}This is a fair point, and I'd be open to bumping up the default 
threshold. However, remember that behavior exists already in cassandra (it's 
what you buy in to when using vnodes); this patch helps to alleviate the 
unavailables/timeouts, not eliminate nor accentuate them.
{quote}
I agree, this is a great step forward, but with a small change I think this 
strategy could practically eliminate the unavailables/timeouts. If I 
implemented the functionality with unit tests in a separate Jira, would you 
consider reviewing it, or do you think the slight additional complexity is not 
worth it? Even separating percentage up by local/remote datacenters would be a 
big step forward, I think. If we went with counts, I could reduce the number 
of settings from 3 to 2 or 1, giving advanced users less control, if you 
think that would be less confusing for newer users.
  

> Add optional startup delay to wait until peers are ready
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13993
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Lifecycle
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.0
>
>
> When bouncing a node in a large cluster, it can take a while to recognize the 
> rest of the cluster as available. This is especially true if using TLS on 
> internode messaging connections. The bouncing node (and any clients connected 
> to it) may see a series of Unavailable or Timeout exceptions until the node 
> is 'warmed up' as connecting to the rest of the cluster is asynchronous from 
> the rest of the startup process.
> There are two aspects that drive a node's ability to successfully communicate 
> with a peer after a bounce:
> - marking the peer as 'alive' (state that is held in gossip). This affects 
> the unavailable exceptions
> - having both outbound and inbound connections open and ready to each 
> peer. This affects timeouts.
> Details of each of these mechanisms are described in the comments below.
> This ticket proposes adding a mechanism, optional and configurable, to delay 
> opening the client native protocol port until some percentage of the peers in 
> the cluster is marked alive and connected to/from. Thus while we potentially 
> slow down startup (delay opening the client port), we reduce the chance 
> that queries made by clients hit transient unavailable/timeout 
> exceptions.
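
The gating idea in the description above can be sketched as a poll loop with 
a deadline. This is only an illustration under stated assumptions, not the 
patch itself: `wait_for_peers`, `fraction_ready`, and the parameter names are 
all hypothetical, and `fraction_ready` stands in for whatever reports the 
share of peers that are both marked alive and connected.

```python
import time
from typing import Callable


def wait_for_peers(
    fraction_ready: Callable[[], float],
    required_fraction: float,
    timeout_s: float,
    poll_interval_s: float = 0.1,
) -> bool:
    """Block until enough peers are ready or the timeout expires.

    fraction_ready is a hypothetical callback returning the fraction
    (0.0 to 1.0) of peers that are marked alive and connected.
    Returns True if the threshold was reached, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fraction_ready() >= required_fraction:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


if __name__ == "__main__":
    # Simulated readings: peers come up over three polls.
    readings = iter([0.2, 0.5, 0.9])
    ok = wait_for_peers(lambda: next(readings), 0.7, timeout_s=5.0)
    print("threshold reached" if ok else "timed out")
```

What the caller does on a `False` result (open the client port anyway, or 
keep waiting) is a policy choice that this sketch deliberately leaves out.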


