Thanks Stefan.

With that, the 45 fields in the Deprecated, Paxos, TCM, and Accord groups
are now marked @HiddenInYaml.
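
For readers who have not followed the ticket, here is a minimal sketch of how
a marker annotation and a forward check could fit together. It is illustrative
only and is not the code from the branch: ToyConfig, YamlCoverageSketch and
the undocumented() helper are hypothetical names, the annotation definition is
an assumption, and the yaml key set is passed in by hand rather than parsed
from cassandra.yaml.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Assumed shape of the marker annotation; the real definition may differ.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface HiddenInYaml {}

// Toy stand-in for Config.java with four representative fields.
class ToyConfig
{
    public boolean auto_bootstrap;        // present in the toy yaml key set below

    @HiddenInYaml
    public String paxos_variant;          // deliberately hidden

    @Deprecated
    public boolean back_pressure_enabled; // skipped because it is deprecated

    public int repair_state_size;         // neither in yaml nor annotated
}

class YamlCoverageSketch
{
    // Forward check: public instance fields that are not in the yaml key set
    // and carry neither @HiddenInYaml nor @Deprecated.
    static List<String> undocumented(Class<?> config, Set<String> yamlKeys)
    {
        List<String> missing = new ArrayList<>();
        for (Field field : config.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue;
            if (field.isAnnotationPresent(HiddenInYaml.class))
                continue;
            if (field.isAnnotationPresent(Deprecated.class))
                continue;
            if (!yamlKeys.contains(field.getName()))
                missing.add(field.getName());
        }
        return missing;
    }

    public static void main(String[] args)
    {
        // Only repair_state_size is reported for this toy input.
        System.out.println(undocumented(ToyConfig.class, Set.of("auto_bootstrap")));
    }
}
```

The point of the annotation is that a field missing from the yaml is then
either an explicit decision or a test failure, never an open question.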

8 fields are still DEFERRED, and I'd like second opinions on them. My
current lean is in parentheses:

   - *native_transport_max_message_size*,
   *native_transport_max_request_data_in_flight*,
   *native_transport_max_request_data_in_flight_per_ip*, *max_mutation_size*
   (lean: hide all four) — interdependent quartet. When max_message_size is
   null it computes as min() of the other three; when set explicitly it's
   validated against the _in_flight ceilings at startup. Exposing any
   subset invites inconsistent configs.
   - *auto_bootstrap* (lean: yaml) — no runtime setter, widely documented
   in operational guides. Hiding it would surprise operators who already rely
   on it.
   - *compact_tables_enabled* (lean: hide) — COMPACT STORAGE is a 2.x
   legacy; the guardrail only gates new-table creation and has a JMX setter.
   - *min_free_space_per_drive* (lean: hide) — flagging that it's used in
   production paths beyond tests (compaction, memtable flush placement,
   streaming receive, sstable import), but hiding is still fine.
   - *enforce_native_deadline_for_hints* (lean: yaml) — deliberate
   correctness-vs-overload knob with JMX setter; the default should be visible
   to operators.
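
To make the quartet's coupling concrete, here is a rough sketch of the rule as
described in the first bullet. It is not the actual startup code:
MessageSizeSketch and resolve() are hypothetical names, all sizes are assumed
to be plain byte counts, and the exact comparison used by the real validation
(strict or not, and the error type) is an assumption.

```java
class MessageSizeSketch
{
    // maxMessageSize == null means "not set in yaml".
    // Unset: derive it as the minimum of the other three values.
    // Set:   reject it if it exceeds either in-flight ceiling (assumed rule).
    static long resolve(Long maxMessageSize,
                        long maxRequestDataInFlight,
                        long maxRequestDataInFlightPerIp,
                        long maxMutationSize)
    {
        if (maxMessageSize == null)
            return Math.min(maxMutationSize,
                            Math.min(maxRequestDataInFlight, maxRequestDataInFlightPerIp));

        if (maxMessageSize > maxRequestDataInFlight || maxMessageSize > maxRequestDataInFlightPerIp)
            throw new IllegalArgumentException("native_transport_max_message_size (" + maxMessageSize
                                               + ") exceeds an in-flight ceiling");
        return maxMessageSize;
    }

    public static void main(String[] args)
    {
        // Unset: the smallest of the three other values wins.
        System.out.println(resolve(null, 100, 50, 16));
        // Set explicitly and below both ceilings: accepted unchanged.
        System.out.println(resolve(40L, 100, 50, 16));
    }
}
```

With only some of the four documented, an operator can change one value
without seeing the others it is derived from or checked against, which is the
inconsistency risk mentioned above.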

Each row in the spreadsheet has the full reasoning and the original JIRA.
Feel free to leave a comment on the rows if you disagree with my suggestion
to hide them or put them in YAML, or reply here. Once decided I'll apply
annotations and yaml entries in one PR.

Spreadsheet: CASSANDRA-20249 audit
<https://docs.google.com/spreadsheets/d/1_gtX6hqZdzDZEYo3WsXpkT1Ux79rN7k0/edit?usp=sharing&ouid=109301672816987582477&rtpof=true&sd=true>

On Mon, 13 Apr 2026 at 12:45, Štefan Miklošovič <[email protected]>
wrote:

> I created
>
> https://issues.apache.org/jira/browse/CASSANDRA-21303
> https://issues.apache.org/jira/browse/CASSANDRA-21304
> https://issues.apache.org/jira/browse/CASSANDRA-21305
>
> For tracking the exposure of Paxos / TCM / Accord properties if we
> ever decide so. For now we can just annotate them as hidden, they can
> be disclosed later on.
>
> On Mon, Apr 13, 2026 at 1:32 PM Štefan Miklošovič
> <[email protected]> wrote:
> >
> > The quick feedback I gathered pinging TCM / Accord channels:
> >
> > Paxos - no particular opinion
> > Accord - people will probably want to configure some of them
> > TCM - we might just hide them for now and create a ticket to expose later
> >
> > Personally I do not know how to comment on these. Some are very
> > specific and require knowledge to explain what they are for which I do
> > not have.
> >
> > So I would say, let's hide them all, create the tickets and if
> > somebody finds a need for exposing them they can do so (might be
> > really anybody).
> >
> > On Sun, Apr 12, 2026 at 10:34 PM Pedro Gordo <[email protected]>
> wrote:
> > >
> > > > I've been working on this ticket. I have a JUnit test
> > > > (ConfigYamlCoverageTest) with forward and reverse checks, and an
> > > > @HiddenInYaml annotation ready to go. I've triaged all 131 Config.java
> > > > fields that are missing from the yaml files and have decision
> > > > suggestions for most of them.
> > > >
> > > > Before I share the full spreadsheet, I'd like to gauge whether we can
> > > > handle some groups at the group level rather than property-by-property:
> > > >
> > > > - Deprecated fields (13 fields): back_pressure_*, otc_coalescing_*,
> > > >   otc_backlog_*, windows_timer_interval, max_streaming_retries,
> > > >   repair_session_max_tree_depth,
> > > >   scripted_user_defined_functions_enabled,
> > > >   use_deterministic_table_id, cms_default_retry_backoff,
> > > >   cms_default_max_retry_backoff. These are all @Deprecated in
> > > >   Config.java. The test already skips them automatically, but should
> > > >   they also be annotated @HiddenInYaml for clarity?
> > > > - Paxos v2 (16 fields): All the paxos_* and skip_paxos_repair_* fields
> > > >   from CEP-14. Internal LWT protocol tuning.
> > > > - TCM/CMS (12 fields): CMS timeouts/retries (cms_*), metadata
> > > >   snapshotting, discovery timeout, unsafe_tcm_mode, progress barrier
> > > >   fields (progress_barrier_*), and short_rpc_timeout. All internal
> > > >   CEP-21 infrastructure.
> > > > - Accord (3 fields): accord_preaccept_timeout,
> > > >   concurrent_accord_operations, consensus_migration_cache_size. CEP-15
> > > >   internals. Still maturing.
> > > >
> > > > Does anyone see a reason any of these should be exposed in
> > > > cassandra.yaml rather than marked @HiddenInYaml? If we can agree on
> > > > these groups, it reduces the remaining discussion to about 13
> > > > individual fields which is much more manageable.
> > >
> > > On Tue, 4 Mar 2025 at 09:31, Dmitry Konstantinov <[email protected]>
> wrote:
> > >>
> > >> >>
> https://docs.google.com/spreadsheets/d/11MOxhNqwE1tWP4ex2gzKG2pmeAWFaHDKo-CRp25h9BU/edit?gid=0#gid=0
> > >> We still have a lot of rows empty. I have added many default values
> and a Cassandra version when a parameter was introduced (to differentiate
> some recent parameters from old ones) based on source code but it would be
> nice to get a description for parameters from the authors as well as
> classification exposed/hidden.
> > >> Maybe we should not wait for collecting info about all parameters and
> update what we have + use a threshold in the Ant validation task to fail
> when new missed parameters are added. The logic in the dev branch here
> https://issues.apache.org/jira/browse/CASSANDRA-20249 already supports a
> threshold.
> > >>
> > >>
> > >> On Tue, 28 Jan 2025 at 00:04, Josh McKenzie <[email protected]>
> wrote:
> > >>>
> > >>> Good point re: the implications of parsing and durability in the
> face of seeing unknown or missing parameters. I don't think widening the
> scope on that would be ideal, especially considering the entire impetus for
> this conversation is "we've misbehaved with our config and have a bunch of
> undocumented stuff we're not sure is still useful, or what it's for". =/
> > >>>
> > >>> On Mon, Jan 27, 2025, at 3:41 PM, Štefan Miklošovič wrote:
> > >>>
> > >>> "we take "unclaimed" items and move them to their own
> InternalConfig.java or something"
> > >>>
> > >>> This is interesting. If we are meant to be still able to put these
> properties into cassandra.yaml (even if they are "internal ones") and they
> would be just in InternalConfig.java for some basic separation of internal
> / user-facing configuration, then we would need to have two yaml loaders:
> > >>>
> > > >>> 1) the one as we have now which loads cassandra.yaml into
> Config.java
> > >>> 2) the second one which would load cassandra.yaml into
> InternalConfig.java
> > >>>
> > >>> For both cases, we could not fail when there are unrecognized
> properties in cassandra.yaml while parsing it (1), because every loader,
> for Config.java as well as InternalConfig.java, is parsing just some
> "subset" of yaml.
> > >>>
> > >>> (1)
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java#L443-L444
> > >>>
> > >>> If we just had "public InternalConfig internal = new InternalConfig"
> as a field in Config.java, then this would lead to properties being
> effectively renamed in cassandra.yaml like
> > >>>
> > >>> internal:
> > >>>     some_currently_internal_property: false
> > >>>
> > >>> instead of just
> > >>>
> > >>> some_currently_internal_property: false
> > >>>
> > >>> I do not think we want to have them renamed / under different
> configuration sections in yaml. I get that they are "internal" etc but we
> just don't know how / where it is used and deployed and just blindly
> renaming them is not a good idea imho.
> > >>>
> > >>> On Mon, Jan 27, 2025 at 8:46 PM Josh McKenzie <[email protected]>
> wrote:
> > >>>
> > >>>
> > >>> This may be an off-base comparison, but this reminds me of struggles
> we've had getting to 0 failing unit tests before and the debates on fencing
> off a snapshot of the current "failure set" so you can have a set point
> where no further degradation is allowed in a primary data set.
> > >>>
> > >>> All of which is to say - maybe at the end of the spreadsheet, we
> take "unclaimed" items and move them to their own InternalConfig.java or
> something and add an ant target that a) disallows further addition to
> InternalConfig.java w/out throwing an error / needing whitelist update, and
> b) disallows further regression in the Config.java <-> cassandra.yaml
> relationship for non-annotated fields.
> > >>>
> > >>> That way we can at least halt the progression of the disease even if
> we're stymied on cleaning up some of the existing symptoms.
> > >>>
> > >>> On Mon, Jan 27, 2025, at 1:38 PM, Štefan Miklošovič wrote:
> > >>>
> > >>> Indeed, we need to balance that and thoughtfully choose what is
> going to be added and what not. However, we should not hide something which
> is meant to be tweaked by a user. The config is intimidating mostly because
> everything is just in one file. I vaguely remember discussions a few years
> ago which were about splitting cassandra.yaml into multiple files which
> would be focused just on one subsystem / would cover some logically
> isolated domain.
> > >>>
> > >>> Anyway, I think the main goal of this effort for now would be to at
> least map where we are at. Some of them are genuinely missing. E.g.
> guardrails, how is a user meant to know about that if it is not even
> documented ...
> > >>>
> > >>> On Mon, Jan 27, 2025 at 6:16 PM Chris lohfink <[email protected]>
> wrote:
> > >>>
> > >>> Might be a bit of a balance between exposing what people actually
> are likely to need to modify vs having a super intimidating config file.
> It's already nearly 2000 lines. Personally I'd rather see some
> auto-documentation or something that's in the docs than an effort to
> manually add another 1000 lines.
> > >>>
> > >>> Chris
> > >>>
> > >>> On Fri, Jan 24, 2025 at 9:41 AM Dmitry Konstantinov <
> [email protected]> wrote:
> > >>>
> > >>> Maybe I missed some patterns but it looks like a pretty good
> estimation, I did like 10 random checks manually to verify :-)
> > >>> I will try to make an ant target with a similar logic (hopefully,
> during the weekend)
> > >>> I will create a ticket to track this activity (to share attachments
> there to not overload the thread with such outputs in future).
> > >>>
> > >>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <
> [email protected]> wrote:
> > >>>
> > >>> Oh my god, 112? :DD I was thinking it would be less than 10.
> > >>>
> > >>> Anyway, I think we need to integrate this to some ant target. If you
> expanded on this, that would be great.
> > >>>
> > >>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <
> [email protected]> wrote:
> > >>>
> > >>> A very primitive implementation of the 1st idea below:
> > >>>
> > >>> String configUrl =
> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
> > >>> Field[] allFields = Config.class.getFields();
> > >>> List<String> topLevelPropertyNames = new ArrayList<>();
> > >>> for(Field field : allFields)
> > >>> {
> > >>>     if (!Modifier.isStatic(field.getModifiers()))
> > >>>     {
> > >>>         topLevelPropertyNames.add(field.getName());
> > >>>     }
> > >>> }
> > >>>
> > >>> URL url = new URL(configUrl);
> > >>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
> > >>>
> > >>> int missedCount = 0;
> > >>> for (String propertyName : topLevelPropertyNames)
> > >>> {
> > >>>     boolean found = false;
> > >>>     for (String line : lines)
> > >>>     {
> > >>>         if (line.startsWith(propertyName + ":")
> > >>>             || line.startsWith("#" + propertyName + ":")
> > >>>             || line.startsWith("# " + propertyName + ":")) {
> > >>>             found = true;
> > >>>             break;
> > >>>         }
> > >>>     }
> > >>>     if (!found)
> > >>>     {
> > >>>         missedCount++;
> > >>>         System.out.println(propertyName);
> > >>>     }
> > >>> }
> > >>> System.out.println("Total missed:" + missedCount);
> > >>>
> > >>>
> > >>> It prints the following config property names which are defined in
> Config.java but not present as "property" or "# property " in a file:
> > >>>
> > >>> permissions_cache_max_entries
> > >>> roles_cache_max_entries
> > >>> credentials_cache_max_entries
> > >>> auto_bootstrap
> > >>> force_new_prepared_statement_behaviour
> > >>> use_deterministic_table_id
> > >>> repair_request_timeout
> > >>> stream_transfer_task_timeout
> > >>> cms_await_timeout
> > >>> cms_default_max_retries
> > >>> cms_default_retry_backoff
> > >>> epoch_aware_debounce_inflight_tracker_max_size
> > >>> metadata_snapshot_frequency
> > >>> available_processors
> > >>> repair_session_max_tree_depth
> > >>> use_offheap_merkle_trees
> > >>> internode_max_message_size
> > >>> native_transport_max_message_size
> > >>> native_transport_max_request_data_in_flight_per_ip
> > >>> native_transport_max_request_data_in_flight
> > >>> native_transport_receive_queue_capacity
> > >>> min_free_space_per_drive
> > >>> max_space_usable_for_compactions_in_percentage
> > >>> reject_repair_compaction_threshold
> > >>> concurrent_index_builders
> > >>> max_streaming_retries
> > >>> commitlog_max_compression_buffers_in_pool
> > >>> max_mutation_size
> > >>> dynamic_snitch
> > >>> failure_detector
> > >>> use_creation_time_for_hint_ttl
> > >>> key_cache_migrate_during_compaction
> > >>> key_cache_invalidate_after_sstable_deletion
> > >>> paxos_cache_size
> > >>> file_cache_round_up
> > >>> disk_optimization_estimate_percentile
> > >>> disk_optimization_page_cross_chance
> > >>> purgeable_tobmstones_metric_granularity
> > >>> windows_timer_interval
> > >>> otc_coalescing_strategy
> > >>> otc_coalescing_window_us
> > >>> otc_coalescing_enough_coalesced_messages
> > >>> otc_backlog_expiration_interval_ms
> > >>> scripted_user_defined_functions_enabled
> > >>> user_defined_functions_threads_enabled
> > >>> allow_insecure_udfs
> > >>> allow_extra_insecure_udfs
> > >>> user_defined_functions_warn_timeout
> > >>> user_defined_functions_fail_timeout
> > >>> user_function_timeout_policy
> > >>> back_pressure_enabled
> > >>> back_pressure_strategy
> > >>> repair_command_pool_full_strategy
> > >>> repair_command_pool_size
> > >>> block_for_peers_timeout_in_secs
> > >>> block_for_peers_in_remote_dcs
> > >>> skip_stream_disk_space_check
> > >>> snapshot_on_repaired_data_mismatch
> > >>> validation_preview_purge_head_start
> > >>> initial_range_tombstone_list_allocation_size
> > >>> range_tombstone_list_growth_factor
> > >>> snapshot_on_duplicate_row_detection
> > >>> check_for_duplicate_rows_during_reads
> > >>> check_for_duplicate_rows_during_compaction
> > >>> autocompaction_on_startup_enabled
> > >>> auto_optimise_inc_repair_streams
> > >>> auto_optimise_full_repair_streams
> > >>> auto_optimise_preview_repair_streams
> > >>> consecutive_message_errors_threshold
> > >>> internode_error_reporting_exclusions
> > >>> compact_tables_enabled
> > >>> vector_type_enabled
> > >>> intersect_filtering_query_warned
> > >>> intersect_filtering_query_enabled
> > >>> streaming_slow_events_log_timeout
> > >>> repair_state_expires
> > >>> repair_state_size
> > >>> paxos_variant
> > >>> skip_paxos_repair_on_topology_change
> > >>> paxos_purge_grace_period
> > >>> paxos_on_linearizability_violations
> > >>> paxos_state_purging
> > >>> paxos_repair_enabled
> > >>> paxos_topology_repair_no_dc_checks
> > >>> paxos_topology_repair_strict_each_quorum
> > >>> skip_paxos_repair_on_topology_change_keyspaces
> > >>> paxos_contention_wait_randomizer
> > >>> paxos_contention_min_wait
> > >>> paxos_contention_max_wait
> > >>> paxos_contention_min_delta
> > >>> paxos_repair_parallelism
> > >>> sstable_read_rate_persistence_enabled
> > >>> client_request_size_metrics_enabled
> > >>> max_top_size_partition_count
> > >>> max_top_tombstone_partition_count
> > >>> min_tracked_partition_size
> > >>> min_tracked_partition_tombstone_count
> > >>> top_partitions_enabled
> > >>> severity_during_decommission
> > >>> progress_barrier_min_consistency_level
> > >>> progress_barrier_default_consistency_level
> > >>> progress_barrier_timeout
> > >>> progress_barrier_backoff
> > >>> discovery_timeout
> > >>> unsafe_tcm_mode
> > >>> cql_start_time
> > >>> native_transport_throw_on_overload
> > >>> native_transport_queue_max_item_age_threshold
> > >>> native_transport_min_backoff_on_queue_overload
> > >>> native_transport_max_backoff_on_queue_overload
> > >>> native_transport_timeout
> > >>> enforce_native_deadline_for_hints
> > >>> Total missed:112
> > >>>
> > >>>
> > >>>
> > >>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <
> [email protected]> wrote:
> > >>>
> > >>> It should also work the other way around. If there is a property
> which is commented out in yaml and it is not in Config.java, that should
> fail as well. If it is not commented out and it is not in Config.java, that
> will fail in runtime as it fails on unrecognized property.
> > >>>
> > >>> This will be used in practice very rarely as we seldom remove the
> properties in Config but if we do and a property is commented out, we
> should not ship a dead property name, even commented out.
> > >>>
> > >>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <[email protected]>
> wrote:
> > >>>
> > >>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we
> might indeed add it, also commented out. I think it would be quite easy to
> check against yaml if there is a line starting on "# my_cool_property" or
> just on "my_cool_property". Both cases would satisfy the check.
> > >>>
> > >>> Makes sense, I think this would be good to have as a lint or test to
> easily catch overlooks during review.
> > >>>
> > >>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <
> [email protected]> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <[email protected]>
> wrote:
> > >>>
> > >>> > from time to time I see configuration properties in Config.java
> and they are clearly not in cassandra.yaml. Not every property in Config is
> in cassandra.yaml. I would like to know if there is some specific reason
> behind that.
> > >>>
> > >>> I think one of the original reasons was to "hide" advanced configs
> that are not meant to be updated, unless in very niche circumstances.
> However I think this has been extrapolated to non-advanced settings.
> > >>>
> > >>> > Question related to that is if we could not have a build-time
> check that all properties in Config have to be in cassandra.yaml and fail
> the build if a property in Config does not have its counterpart in yaml.
> > >>>
> > >>> Are you saying every configuration property should be commented-out,
> or do you think that every Config property should be specified in
> cassandra.yaml with their defaults uncommented? One issue with that is that
> you could cause user confusion if you "reveal" a niche/advanced config that
> is not meant to be updated. I think this would be addressed by the
> @HiddenInYaml flag you are proposing in a later post.
> > >>>
> > >>>
> > > >>> Yes, they can stay hidden, but we should annotate them with @Hidden or
> similar. As of now, if that property is not in yaml, we just don't know if
> it was forgotten to be added or if we have not added it on purpose.
> > >>>
> > >>> They can keep being commented out if they currently are. Imagine a
> property in Config.java
> > >>>
> > >>> public boolean my_cool_property = true;
> > >>>
> > >>> and then this in cassandra.yaml
> > >>>
> > >>> # my_cool_property: true
> > >>>
> > >>> It is completely ok.
> > >>>
> > >>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might
> indeed add it, also commented out. I think it would be quite easy to check
> against yaml if there is a line starting on "# my_cool_property" or just on
> "my_cool_property". Both cases would satisfy the check.
> > >>>
> > >>>
> > >>>
> > >>> > There are dozens of properties in Config and I have a strong
> suspicion that we missed to publish some to yaml so users do not even know
> such a property exists and as of now we do not even know which they are.
> > >>>
> > >>> I believe this is a problem. I think most properties should be in
> cassandra.yaml, unless they are very advanced or not meant to be updated.
> > >>>
> > >>> Another tangential issue is that there are features/settings that
> don't even have a Config entry, but are just controlled by JVM properties.
> > >>>
> > >>> I think that we should attempt to unify Config and jvm properties
> under a predictable structure. For example, if there is a YAML config
> enable_user_defined_functions, then there should be a respective JVM flag
> -Dcassandra.enable_user_defined_functions, and vice versa.
> > >>>
> > >>>
> > >>> Yeah, good idea.
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <
> [email protected]> wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> from time to time I see configuration properties in Config.java and
> they are clearly not in cassandra.yaml. Not every property in Config is in
> cassandra.yaml. I would like to know if there is some specific reason
> behind that.
> > >>>
> > >>> Question related to that is if we could not have a build-time check
> that all properties in Config have to be in cassandra.yaml and fail the
> build if a property in Config does not have its counterpart in yaml.
> > >>>
> > >>> There are dozens of properties in Config and I have a strong
> suspicion that we missed to publish some to yaml so users do not even know
> such a property exists and as of now we do not even know which they are.
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Dmitry Konstantinov
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Dmitry Konstantinov
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Dmitry Konstantinov
>
