[ https://issues.apache.org/jira/browse/CASSANDRA-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591000#comment-16591000 ]
Joseph Lynch commented on CASSANDRA-14297:
------------------------------------------

I updated the patch to fix the merge conflicts, and reduced it to just two options to make life easier (the default is tuned to wait for all but 2 local DC nodes and not care about non-local DC nodes, i.e. local_quorum). This is ready for review.

> Optional startup delay for peers should wait for count rather than percentage
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14297
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14297
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Lifecycle
>            Reporter: Joseph Lynch
>            Assignee: Joseph Lynch
>            Priority: Minor
>              Labels: 4.0-feature-freeze-review-requested, PatchAvailable
>
> As I commented in CASSANDRA-13993, the current wait for functionality is a
> great step in the right direction, but I don't think that the current setting
> (70% of nodes in the cluster) is the right configuration option. First, I
> think this because 70% will not protect against errors: if you wait for 70%
> of the cluster you could still very easily have {{UnavailableException}} or
> {{ReadTimeoutException}} exceptions. This is because if you have even two
> nodes down in different racks in a Cassandra cluster these exceptions are
> possible (or with the default {{num_tokens}} setting of 256 it is basically
> guaranteed). Second, I think this option is not easy for operators to set; the
> only setting I could think of that would "just work" is 100%.
> I proposed in that ticket that instead of having `block_for_peers_percentage`
> defaulting to 70%, we have `block_for_peers` as a count of nodes that
> are allowed to be down before the starting node makes itself available as a
> coordinator. Of course, we would still have the timeout to limit startup time
> and deal with really extreme situations (whole datacenters down etc).
> I started working on a patch for this change [on
> github|https://github.com/jasobrown/cassandra/compare/13993...jolynch:13993],
> and am happy to finish it up with unit tests and such if someone can
> review/commit it (maybe [~aweisberg]?).
> I think the short version of my proposal is we replace:
> {noformat}
> block_for_peers_percentage: <percentage needed up, defaults to 70%>
> {noformat}
> with either
> {noformat}
> block_for_peers: <number that can be down, defaults to 1>
> {noformat}
> or, if we want to do even better imo and enable advanced operators to finely
> tune this behavior (while still having good defaults that work for almost
> everyone):
> {noformat}
> block_for_peers_local_dc: <number that can be down, defaults to 1>
> block_for_peers_each_dc: <number that can be down, defaults to sys.maxint>
> block_for_peers_all_dcs: <number that can be down, defaults to sys.maxint>
> {noformat}
> For example, if an operator knows that they must be available at
> {{LOCAL_QUORUM}} they would set {{block_for_peers_local_dc=1}}, if they use
> {{EACH_QUORUM}} they would set {{block_for_peers_local_dc=1}}, if they use
> {{QUORUM}} (RF=3, dcs=2) they would set {{block_for_peers_all_dcs=2}}.
> Naturally, everything would have a timeout to prevent startup taking
> too long.
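
To make the difference between the percentage check and the count-based check concrete, here is a rough Python sketch. It is not taken from the patch: the function names (percentage_ok, may_start, wait_for_peers) and the polling callback are hypothetical, and it assumes the each_dc limit applies to every datacenter individually. Only the three limit names and their defaults mirror the options proposed in the description.

```python
import sys
import time

# Stand-in for the "sys.maxint" default in the proposal, meaning "no limit".
UNLIMITED = sys.maxsize


def percentage_ok(down_total, cluster_size, pct=70):
    """Percentage-style check: at least pct% of the cluster is up."""
    return (cluster_size - down_total) * 100 >= pct * cluster_size


def may_start(down_by_dc, local_dc, local_dc_limit=1,
              each_dc_limit=UNLIMITED, all_dcs_limit=UNLIMITED):
    """Count-style check: the down peers stay within every configured limit.

    down_by_dc maps a datacenter name to the number of peers currently
    seen as down in that datacenter (hypothetical input shape).
    """
    down_local = down_by_dc.get(local_dc, 0)
    worst_dc = max(down_by_dc.values(), default=0)
    down_total = sum(down_by_dc.values())
    return (down_local <= local_dc_limit
            and worst_dc <= each_dc_limit
            and down_total <= all_dcs_limit)


def wait_for_peers(poll_down_by_dc, local_dc, timeout_s=10.0,
                   interval_s=0.1, **limits):
    """Poll until may_start() passes or the timeout expires.

    Returns True if the limits were satisfied, False on timeout, so the
    caller can still start up in extreme situations.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if may_start(poll_down_by_dc(), local_dc, **limits):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With these definitions, a 100-node cluster with 30 nodes down still passes percentage_ok(30, 100), while may_start({"dc1": 30}, "dc1") rejects it under the default local limit of 1. That is the gap the count-based options are meant to close.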