Re: [DISCUSS] Nested YAML configs for new features

Bowen Song Fri, 19 Nov 2021 12:53:44 -0800

I'm with Stefan. I prefer the flat YAML file, which I can easily grep to check and confirm the settings on a large number of servers with parallel-ssh. This will be very hard to do with a nested config in a YAML file.

In addition to that, I also use grep in the Cassandra source code to locate the relevant files based on the config name. The flat config name is long and unique, and this helps me efficiently navigate within the source code. I can imagine this is not going to work very well (if it works at all) with the nested config name.

P.S.: I'm not a Java developer, so it will take me much longer to find the relevant code if grep doesn't work in the source code. It is also going to be harder for me to understand it if the nested config is turned into a Java object/class.


On 19/11/2021 19:07, Stefan Miklosovic wrote:

Hi David,

while I do not oppose nested structure, it is really handy to grep
cassandra.yaml on some config key and you know the value instantly.
This is not possible (easily and quickly) when it is nested, as it is on
two lines. Or maybe my grepping is just not advanced enough to cover
this case? If it is flat, I can just grep "track_warnings" and I have
them all.
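
One possible workaround for the grep concern, sketched here as an illustration only: flatten the nested mapping into one line per leaf, so that each line carries the full key path and grep works again. This is not part of Cassandra or of any patch discussed in this thread; the `flatten` helper is hypothetical, and the config is written as a plain Python dict (taken from the track_warnings example in the original mail) so that no YAML library is assumed.

```python
def flatten(config, prefix="", sep="."):
    """Yield (full_key, value) pairs for every leaf of a nested mapping."""
    for key, value in config.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the section, carrying the path built so far.
            yield from flatten(value, path, sep)
        else:
            yield path, value


# Nested track_warnings example from the original mail, as a plain dict.
nested = {
    "track_warnings": {
        "enabled": True,
        "coordinator_read_size": {"warn_threshold": "10kb", "abort_threshold": "1mb"},
        "local_read_size": {"warn_threshold": "10kb", "abort_threshold": "1mb"},
        "row_index_size": {"warn_threshold": "100mb", "abort_threshold": "1gb"},
    }
}

# One greppable line per setting, e.g. for piping into grep.
lines = [f"{key}: {value}" for key, value in flatten(nested)]
for line in lines:
    print(line)
```

Passing `sep="_"` instead yields key names in the same shape as the flat variant proposed in the original mail, which shows that the two layouts carry the same information.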

Can you elaborate on your last bullet point? Parsing layer ... What do
you mean specifically?

Thanks

On Fri, 19 Nov 2021 at 19:36, David Capwell <dcapw...@gmail.com> wrote:

This has been brought up in a few tickets, so pushing to the dev list.

CASSANDRA-15234 - Standardise config and JVM parameters
CASSANDRA-16896 - hard/soft limits for queries
CASSANDRA-17147 - Guardrails prototype

In short, do we as a project wish to move "new features" into nested
YAML when the feature has "enough" to justify the nesting?  I would
really like to focus this discussion on new features rather than
retroactively grouping (leaving that to CASSANDRA-15234), as there is
already a place to talk about that.

To get things started, let's start with the track-warning feature
(hard/soft limits for queries). Currently the configs look as follows
(assuming CASSANDRA-15234):

track_warnings:
     enabled: true
     coordinator_read_size:
         warn_threshold: 10kb
         abort_threshold: 1mb
     local_read_size:
         warn_threshold: 10kb
         abort_threshold: 1mb
     row_index_size:
         warn_threshold: 100mb
         abort_threshold: 1gb

or should this be "flat"

track_warnings_enabled: true
track_warnings_coordinator_read_size_warn_threshold: 10kb
track_warnings_coordinator_read_size_abort_threshold: 1mb
track_warnings_local_read_size_warn_threshold: 10kb
track_warnings_local_read_size_abort_threshold: 1mb
track_warnings_row_index_size_warn_threshold: 100mb
track_warnings_row_index_size_abort_threshold: 1gb

For me I prefer nested for a few reasons
* easier to enforce consistency as the configs can use shared types;
in the track warnings patch I had mismatches across configs (warn vs
warns, fail vs abort, etc.) before going nested, now everything reuses
the same types
* even though it is longer, it is clearer how the settings relate to each other
* parsing layer can add support for mixed or purely flat depending on
user preference (example:
track_warnings.row_index_size.abort_threshold, using the '.' notation
to represent nested structures)
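
To make the last bullet concrete, here is a minimal sketch of how a parsing layer might accept nested, flat, or mixed input by expanding '.'-separated keys into nested sections. It is an assumption about one possible approach, not the actual Cassandra parser; `expand_dotted` and `merge` are hypothetical names, and the input is a plain Python dict standing in for already-loaded YAML.

```python
def merge(target, source, path=""):
    """Merge nested dict `source` into `target`, rejecting conflicting settings."""
    for key, value in source.items():
        where = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            child = target.setdefault(key, {})
            if not isinstance(child, dict):
                raise ValueError(f"{where} is set both as a value and as a section")
            merge(child, value, where)
        elif key in target:
            raise ValueError(f"{where} is set more than once")
        else:
            target[key] = value


def expand_dotted(config):
    """Return a purely nested dict from nested, dotted, or mixed input."""
    result = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Sections may themselves contain dotted keys.
            value = expand_dotted(value)
        # Wrap the value in one dict per dotted segment, innermost first.
        for part in reversed(key.split(".")):
            value = {part: value}
        merge(result, value)
    return result


# Mixed input: one dotted key, one nested section, one more dotted key.
mixed = {
    "track_warnings.enabled": True,
    "track_warnings": {"row_index_size": {"warn_threshold": "100mb"}},
    "track_warnings.row_index_size.abort_threshold": "1gb",
}
expanded = expand_dotted(mixed)
print(expanded)
```

The sketch raises an error when the same setting is given twice (for example once nested and once dotted) rather than silently picking one; whether a real parser should reject or override in that case is a design choice left open here.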

Thoughts?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
