I'm with Stefan. I prefer the flat YAML file which I can easily use grep to check and confirm the settings on large number of servers with parallel-ssh. This will be very hard to do on nested config in a YAML file.

In addition to that, I also use grep in the Cassandra source code to locate the relevant files based on the config name. The flat config name is long and unique, and this helps me efficiently navigate within the source code. I can imagine this is not going to work very well (if it works at all) with the nested config name.

p.s.: I'm not a Java developer, it will take me much longer to find the relevant code if grep doesn't work in the source code. It is also going to be harder for me to understand it if the nested config is turned into a Java object/class.

On 19/11/2021 19:07, Stefan Miklosovic wrote:
Hi David,

while I do not oppose nested structure, it is really handy to grep
cassandra.yaml on some config key and you know the value instantly.
This is not possible when it is nested (easily & fastly) as it is on
two lines. Or maybe my grepping is just not advanced enough to cover
this case? If it is flat, I can just grep "track_warnings" and I have
them all.

Can you elaborate on your last bullet point? Parsing layer ... What do
you mean specifically?

Thanks

On Fri, 19 Nov 2021 at 19:36, David Capwell <dcapw...@gmail.com> wrote:
This has been brought up in a few tickets, so pushing to the dev list.

CASSANDRA-15234 - Standardise config and JVM parameters
CASSANDRA-16896 - hard/soft limits for queries
CASSANDRA-17147 - Guardrails prototype

In short, do we as a project wish to move "new features" into nested
YAML when the feature has "enough" to justify the nesting?  I would
really like to focus this discussion on new features rather than
retroactively grouping (leaving that to CASSANDRA-15234), as there is
already a place to talk about that.

To get things started, let's start with the track-warning feature
(hard/soft limits for queries), currently the configs look as follows
(assuming 15234)

track_warnings:
     enabled: true
     coordinator_read_size:
         warn_threshold: 10kb
         abort_threshold: 1mb
     local_read_size:
         warn_threshold: 10kb
         abort_threshold: 1mb
     row_index_size:
         warn_threshold: 100mb
         abort_threshold: 1gb

or should this be "flat"

track_warnings_enabled: true
track_warnings_coordinator_read_size_warn_threshold: 10kb
track_warnings_coordinator_read_size_abort_threshold: 1mb
track_warnings_local_read_size_warn_threshold: 10kb
track_warnings_local_read_size_abort_threshold: 1mb
track_warnings_row_index_size_warn_threshold: 100mb
track_warnings_row_index_size_abort_threshold: 1gb

For me I prefer nested for a few reasons
* easier to enforce consistency as the configs can use shared types;
in the track warnings patch I had mismatches cross configs (warn vs
warns, fail vs abort, etc.) before going nested, now everything reuses
the same types
* even though it is longer, things can be more clear how they are related
* parsing layer can add support for mixed or purely flat depending on
user preference (example:
track_warnings.row_index_size.abort_threshold, using the '.' notation
to represent nested structures)

Thoughts?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Reply via email to