[jira] [Comment Edited] (CASSANDRA-17212) Migrate threshold for minimum keyspace replication factor to guardrails

Benedict Elliott Smith (Jira) Sat, 22 Jan 2022 01:21:34 -0800


    [ https://issues.apache.org/jira/browse/CASSANDRA-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480375#comment-17480375 ]


Benedict Elliott Smith edited comment on CASSANDRA-17212 at 1/22/22, 9:20 AM:
------------------------------------------------------------------------------

bq. To me grouping based off feature is the only way for a config to actually 
be "discoverable", pulling bits and pieces out to other places because they are 
"limits" would always break this in my mind.

In my experience this is a near-impossible task to do consistently with an 
intuitive structure, as nobody really agrees what features should be called, 
let alone their hierarchy.

For instance, once you have a {{query}} heading, should that then include 
coordinator level configurations? timeouts? concurrency? CQL configurations? 
caches? Logically it _could_ include half of the settings. 

Many things also don't neatly fall under any feature, so we will fabricate 
features for them: non-query timeouts cannot be grouped under {{query}}, so now 
we probably _won't_ group query timeouts under {{query}} either, else it will 
be inconsistent. But now both are inconsistent, and there are no rules 
underpinning it anymore.
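
To make the ambiguity concrete, here is a purely hypothetical sketch of a 
feature-based layout that runs into the timeout problem. The key names and 
values are invented for illustration and are not actual cassandra.yaml settings:

{code}
# Hypothetical layout, not real cassandra.yaml keys
query:
    read_timeout: 5s         # a query timeout, grouped by feature
    concurrent_reads: 32     # concurrency, arguably also "query"
    key_cache_size: 100MiB   # caches serve queries too
streaming:
    socket_timeout: 60s      # non-query timeout, cannot live under "query"

# Alternative: pull every timeout out of its feature so they stay together
timeouts:
    read: 5s
    streaming_socket: 60s
{code}

Either choice splits up settings that some reader would expect to find together.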

This kind of grouping can also be more volatile - once a new setting (and 
suitable grouping) is introduced, it more readily affects the decisions for the 
existing hierarchy. This means this kind of layout probably needs a lot of 
upfront work to produce a coherent grouping for the whole config file, to 
demonstrate that it can be done in a manner everyone agrees on, and to avoid 
lots of churn.

I think when I tried to produce groupings, your approach was what I tried 
initially before deciding this approach was cleaner. Plus it has the added 
benefit that a user who _doesn't_ know the config options (i.e. most users), 
when encountering instability, will have good discoverability of dials that can 
be modified to improve cluster health. I think this is a very underrated 
benefit, as most users cannot afford teams of developers that are intimately 
familiar with the codebase.

TL;DR: it's messy, arbitrary and inconsistent. Some upfront work needs to be 
done to work out what it would look like in totality.



> Migrate threshold for minimum keyspace replication factor to guardrails
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-17212
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17212
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Feature/Guardrails
>            Reporter: Andres de la Peña
>            Priority: Normal
>
> The config property 
> [{{minimum_keyspace_rf}}|https://github.com/apache/cassandra/blob/5fdadb25f95099b8945d9d9ee11d3e380d3867f4/conf/cassandra.yaml]
>  that was added by CASSANDRA-14557 can be migrated to guardrails, for example:
> {code}
> guardrails:
>     ...
>     replication_factor:
>         warn_threshold: 2
>         abort_threshold: 3
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

