I created

https://issues.apache.org/jira/browse/CASSANDRA-21303
https://issues.apache.org/jira/browse/CASSANDRA-21304
https://issues.apache.org/jira/browse/CASSANDRA-21305

to track exposing the Paxos / TCM / Accord properties, should we ever
decide to do so. For now we can just annotate them as hidden; they can
be exposed later on.
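
To illustrate what annotating a property as hidden could mean in practice, here is a minimal sketch. It assumes a runtime-retained marker annotation that a coverage check reads via reflection. HiddenInYaml as written here is a stand-in, DemoConfig is a made-up substitute for Config.java, and the field values are placeholders; the real annotation and test may look different.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class HiddenInYamlSketch
{
    // Hypothetical marker annotation; the real one may have a different shape.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface HiddenInYaml
    {
    }

    // Stand-in for Config.java; the values are placeholders, not real defaults.
    static class DemoConfig
    {
        public boolean my_cool_property = true;

        @HiddenInYaml
        public boolean unsafe_tcm_mode = false;

        @Deprecated
        public int max_streaming_retries = 0;
    }

    // Names of public instance fields that a coverage check would expect in yaml.
    // Fields marked @HiddenInYaml or @Deprecated are skipped.
    static List<String> fieldsRequiredInYaml(Class<?> configClass)
    {
        List<String> required = new ArrayList<>();
        for (Field field : configClass.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue;
            if (field.isAnnotationPresent(HiddenInYaml.class)
                || field.isAnnotationPresent(Deprecated.class))
                continue;
            required.add(field.getName());
        }
        return required;
    }

    public static void main(String[] args)
    {
        System.out.println(fieldsRequiredInYaml(DemoConfig.class));
    }
}
```

With this shape, exposing a property later would only mean removing the annotation and adding the entry to cassandra.yaml.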

On Mon, Apr 13, 2026 at 1:32 PM Štefan Miklošovič
<[email protected]> wrote:
>
> The quick feedback I gathered pinging TCM / Accord channels:
>
> Paxos - no particular opinion
> Accord - people will probably want to configure some of them
> TCM - we might just hide them for now and create a ticket to expose later
>
> Personally I do not know how to comment on these. Some are very
> specific, and explaining what they are for requires knowledge I do
> not have.
>
> So I would say, let's hide them all and create the tickets. If
> somebody finds a need to expose them, they can do so (it really
> could be anybody).
>
> On Sun, Apr 12, 2026 at 10:34 PM Pedro Gordo <[email protected]> 
> wrote:
> >
> > I've been working on this ticket. I have a JUnit test 
> > (ConfigYamlCoverageTest) with forward and reverse checks, and an 
> > @HiddenInYaml annotation ready to go. I've triaged all 131 Config.java 
> > fields that are missing from the yaml files and have decision suggestions 
> > for most of them.
> >
> > Before I share the full spreadsheet, I'd like to gauge whether we can 
> > handle some groups at the group level rather than property-by-property:
> >
> > Deprecated fields (13 fields): back_pressure_*, otc_coalescing_*,
> > otc_backlog_*, windows_timer_interval, max_streaming_retries,
> > repair_session_max_tree_depth, scripted_user_defined_functions_enabled,
> > use_deterministic_table_id, cms_default_retry_backoff,
> > cms_default_max_retry_backoff. These are all @Deprecated in Config.java.
> > The test already skips them automatically, but should they also be
> > annotated @HiddenInYaml for clarity?
> >
> > Paxos v2 (16 fields): All the paxos_* and skip_paxos_repair_* fields from
> > CEP-14. Internal LWT protocol tuning.
> >
> > TCM/CMS (12 fields): CMS timeouts/retries (cms_*), metadata snapshotting,
> > discovery timeout, unsafe_tcm_mode, progress barrier fields
> > (progress_barrier_*), and short_rpc_timeout. All internal CEP-21
> > infrastructure.
> >
> > Accord (3 fields): accord_preaccept_timeout, concurrent_accord_operations,
> > consensus_migration_cache_size. CEP-15 internals. Still maturing.
> >
> > Does anyone see a reason any of these should be exposed in cassandra.yaml 
> > rather than marked @HiddenInYaml? If we can agree on these groups, it 
> > reduces the remaining discussion to about 13 individual fields which is 
> > much more manageable.
> >
> > On Tue, 4 Mar 2025 at 09:31, Dmitry Konstantinov <[email protected]> wrote:
> >>
> >> >> https://docs.google.com/spreadsheets/d/11MOxhNqwE1tWP4ex2gzKG2pmeAWFaHDKo-CRp25h9BU/edit?gid=0#gid=0
> >> We still have a lot of rows empty. I have added many default values and a 
> >> Cassandra version when a parameter was introduced (to differentiate some 
> >> recent parameters from old ones) based on source code but it would be nice 
> >> to get a description for parameters from the authors as well as 
> >> classification exposed/hidden.
> >> Maybe we should not wait for collecting info about all parameters and 
> >> update what we have + use a threshold in the Ant validation task to fail 
> >> when new missed parameters are added. The logic in the dev branch here 
> >> https://issues.apache.org/jira/browse/CASSANDRA-20249 already supports a 
> >> threshold.
> >>
> >>
> >> On Tue, 28 Jan 2025 at 00:04, Josh McKenzie <[email protected]> wrote:
> >>>
> >>> Good point re: the implications of parsing and durability in the face of 
> >>> seeing unknown or missing parameters. I don't think widening the scope on 
> >>> that would be ideal, especially considering the entire impetus for this 
> >>> conversation is "we've misbehaved with our config and have a bunch of 
> >>> undocumented stuff we're not sure is still useful, or what it's for". =/
> >>>
> >>> On Mon, Jan 27, 2025, at 3:41 PM, Štefan Miklošovič wrote:
> >>>
> >>> "we take "unclaimed" items and move them to their own InternalConfig.java 
> >>> or something"
> >>>
> >>> This is interesting. If we are meant to be still able to put these 
> >>> properties into cassandra.yaml (even if they are "internal ones") and they 
> >>> would be just in InternalConfig.java for some basic separation of 
> >>> internal / user-facing configuration, then we would need to have two yaml 
> >>> loaders:
> >>>
> >>> 1) the one we have now, which loads cassandra.yaml into Config.java
> >>> 2) the second one which would load cassandra.yaml into InternalConfig.java
> >>>
> >>> For both cases, we could not fail when there are unrecognized properties 
> >>> in cassandra.yaml while parsing it (1), because every loader, for 
> >>> Config.java as well as InternalConfig.java, is parsing just some "subset" 
> >>> of yaml.
> >>>
> >>> (1) 
> >>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java#L443-L444
> >>>
> >>> If we just had "public InternalConfig internal = new InternalConfig" as a 
> >>> field in Config.java, then this would lead to properties being 
> >>> effectively renamed in cassandra.yaml like
> >>>
> >>> internal:
> >>>     some_currently_internal_property: false
> >>>
> >>> instead of just
> >>>
> >>> some_currently_internal_property: false
> >>>
> >>> I do not think we want to have them renamed / under different 
> >>> configuration sections in yaml. I get that they are "internal" etc but we 
> >>> just don't know how / where it is used and deployed and just blindly 
> >>> renaming them is not a good idea imho.
> >>>
> >>> On Mon, Jan 27, 2025 at 8:46 PM Josh McKenzie <[email protected]> 
> >>> wrote:
> >>>
> >>>
> >>> This may be an off-base comparison, but this reminds me of struggles 
> >>> we've had getting to 0 failing unit tests before and the debates on 
> >>> fencing off a snapshot of the current "failure set" so you can have a set 
> >>> point where no further degradation is allowed in a primary data set.
> >>>
> >>> All of which is to say - maybe at the end of the spreadsheet, we take 
> >>> "unclaimed" items and move them to their own InternalConfig.java or 
> >>> something and add an ant target that a) disallows further addition to 
> >>> InternalConfig.java w/out throwing an error / needing whitelist update, 
> >>> and b) disallows further regression in the Config.java <-> cassandra.yaml 
> >>> relationship for non-annotated fields.
> >>>
> >>> That way we can at least halt the progression of the disease even if 
> >>> we're stymied on cleaning up some of the existing symptoms.
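
A rough sketch of the fencing idea above, assuming the build has already computed the set of Config fields that are missing from cassandra.yaml. The class and method names and the property brand_new_property are hypothetical; this is not the actual ant target, only the set arithmetic it would need.

```java
import java.util.Set;
import java.util.TreeSet;

class ConfigRatchetSketch
{
    // Properties that are missing from yaml now but are not in the accepted
    // baseline. A non-empty result is a regression and should fail the build.
    static Set<String> newlyMissing(Set<String> baseline, Set<String> currentlyMissing)
    {
        Set<String> regressions = new TreeSet<>(currentlyMissing);
        regressions.removeAll(baseline);
        return regressions;
    }

    // Baseline entries that are no longer missing. They can be dropped from
    // the baseline so that it only ever shrinks.
    static Set<String> stale(Set<String> baseline, Set<String> currentlyMissing)
    {
        Set<String> fixed = new TreeSet<>(baseline);
        fixed.removeAll(currentlyMissing);
        return fixed;
    }

    public static void main(String[] args)
    {
        Set<String> baseline = Set.of("auto_bootstrap", "paxos_cache_size");
        Set<String> current = Set.of("auto_bootstrap", "brand_new_property");
        System.out.println("Regressions: " + newlyMissing(baseline, current));
        System.out.println("Stale baseline entries: " + stale(baseline, current));
    }
}
```

Compared to a plain count threshold, a named baseline also catches the case where one property gets documented while another undocumented one is added in the same change.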
> >>>
> >>> On Mon, Jan 27, 2025, at 1:38 PM, Štefan Miklošovič wrote:
> >>>
> >>> Indeed, we need to balance that and thoughtfully choose what is going to 
> >>> be added and what not. However, we should not hide something which is 
> >>> meant to be tweaked by a user. The config is intimidating mostly because 
> >>> everything is just in one file. I vaguely remember discussions a few years 
> >>> ago which were about splitting cassandra.yaml into multiple files which 
> >>> would be focused just on one subsystem / would cover some logically 
> >>> isolated domain.
> >>>
> >>> Anyway, I think the main goal of this effort for now would be to at least 
> >>> map where we are at. Some of them are genuinely missing. E.g. guardrails, 
> >>> how is a user meant to know about that if it is not even documented ...
> >>>
> >>> On Mon, Jan 27, 2025 at 6:16 PM Chris Lohfink <[email protected]> wrote:
> >>>
> >>> Might be a bit of a balance between exposing what people actually are 
> >>> likely to need to modify vs having a super intimidating config file. It's 
> >>> already nearly 2000 lines. Personally I'd rather see some 
> >>> auto-documentation or something that's in the docs than an effort to 
> >>> manually add another 1000 lines.
> >>>
> >>> Chris
> >>>
> >>> On Fri, Jan 24, 2025 at 9:41 AM Dmitry Konstantinov <[email protected]> 
> >>> wrote:
> >>>
> >>> Maybe I missed some patterns but it looks like a pretty good estimation, 
> >>> I did like 10 random checks manually to verify :-)
> >>> I will try to make an ant target with a similar logic (hopefully, during 
> >>> the weekend)
> >>> I will create a ticket to track this activity (to share attachments there 
> >>> to not overload the thread with such outputs in future).
> >>>
> >>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <[email protected]> 
> >>> wrote:
> >>>
> >>> Oh my god, 112? :DD I was thinking it would be less than 10.
> >>>
> >>> Anyway, I think we need to integrate this to some ant target. If you 
> >>> expanded on this, that would be great.
> >>>
> >>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <[email protected]> 
> >>> wrote:
> >>>
> >>> A very primitive implementation of the 1st idea below:
> >>>
> >>> // Imports the snippet relies on
> >>> import java.lang.reflect.Field;
> >>> import java.lang.reflect.Modifier;
> >>> import java.net.URL;
> >>> import java.nio.file.Files;
> >>> import java.nio.file.Paths;
> >>> import java.util.ArrayList;
> >>> import java.util.List;
> >>> import org.apache.cassandra.config.Config;
> >>>
> >>> String configUrl = 
> >>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
> >>> // Collect the names of all public, non-static fields of Config
> >>> Field[] allFields = Config.class.getFields();
> >>> List<String> topLevelPropertyNames = new ArrayList<>();
> >>> for (Field field : allFields)
> >>> {
> >>>     if (!Modifier.isStatic(field.getModifiers()))
> >>>     {
> >>>         topLevelPropertyNames.add(field.getName());
> >>>     }
> >>> }
> >>>
> >>> URL url = new URL(configUrl);
> >>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
> >>>
> >>> // A property counts as present if some line starts with "name:",
> >>> // "#name:" or "# name:"
> >>> int missedCount = 0;
> >>> for (String propertyName : topLevelPropertyNames)
> >>> {
> >>>     boolean found = false;
> >>>     for (String line : lines)
> >>>     {
> >>>         if (line.startsWith(propertyName + ":")
> >>>             || line.startsWith("#" + propertyName + ":")
> >>>             || line.startsWith("# " + propertyName + ":"))
> >>>         {
> >>>             found = true;
> >>>             break;
> >>>         }
> >>>     }
> >>>     if (!found)
> >>>     {
> >>>         missedCount++;
> >>>         System.out.println(propertyName);
> >>>     }
> >>> }
> >>> System.out.println("Total missed:" + missedCount);
> >>>
> >>>
> >>> It prints the following config property names which are defined in 
> >>> Config.java but not present as "property" or "# property " in a file:
> >>>
> >>> permissions_cache_max_entries
> >>> roles_cache_max_entries
> >>> credentials_cache_max_entries
> >>> auto_bootstrap
> >>> force_new_prepared_statement_behaviour
> >>> use_deterministic_table_id
> >>> repair_request_timeout
> >>> stream_transfer_task_timeout
> >>> cms_await_timeout
> >>> cms_default_max_retries
> >>> cms_default_retry_backoff
> >>> epoch_aware_debounce_inflight_tracker_max_size
> >>> metadata_snapshot_frequency
> >>> available_processors
> >>> repair_session_max_tree_depth
> >>> use_offheap_merkle_trees
> >>> internode_max_message_size
> >>> native_transport_max_message_size
> >>> native_transport_max_request_data_in_flight_per_ip
> >>> native_transport_max_request_data_in_flight
> >>> native_transport_receive_queue_capacity
> >>> min_free_space_per_drive
> >>> max_space_usable_for_compactions_in_percentage
> >>> reject_repair_compaction_threshold
> >>> concurrent_index_builders
> >>> max_streaming_retries
> >>> commitlog_max_compression_buffers_in_pool
> >>> max_mutation_size
> >>> dynamic_snitch
> >>> failure_detector
> >>> use_creation_time_for_hint_ttl
> >>> key_cache_migrate_during_compaction
> >>> key_cache_invalidate_after_sstable_deletion
> >>> paxos_cache_size
> >>> file_cache_round_up
> >>> disk_optimization_estimate_percentile
> >>> disk_optimization_page_cross_chance
> >>> purgeable_tobmstones_metric_granularity
> >>> windows_timer_interval
> >>> otc_coalescing_strategy
> >>> otc_coalescing_window_us
> >>> otc_coalescing_enough_coalesced_messages
> >>> otc_backlog_expiration_interval_ms
> >>> scripted_user_defined_functions_enabled
> >>> user_defined_functions_threads_enabled
> >>> allow_insecure_udfs
> >>> allow_extra_insecure_udfs
> >>> user_defined_functions_warn_timeout
> >>> user_defined_functions_fail_timeout
> >>> user_function_timeout_policy
> >>> back_pressure_enabled
> >>> back_pressure_strategy
> >>> repair_command_pool_full_strategy
> >>> repair_command_pool_size
> >>> block_for_peers_timeout_in_secs
> >>> block_for_peers_in_remote_dcs
> >>> skip_stream_disk_space_check
> >>> snapshot_on_repaired_data_mismatch
> >>> validation_preview_purge_head_start
> >>> initial_range_tombstone_list_allocation_size
> >>> range_tombstone_list_growth_factor
> >>> snapshot_on_duplicate_row_detection
> >>> check_for_duplicate_rows_during_reads
> >>> check_for_duplicate_rows_during_compaction
> >>> autocompaction_on_startup_enabled
> >>> auto_optimise_inc_repair_streams
> >>> auto_optimise_full_repair_streams
> >>> auto_optimise_preview_repair_streams
> >>> consecutive_message_errors_threshold
> >>> internode_error_reporting_exclusions
> >>> compact_tables_enabled
> >>> vector_type_enabled
> >>> intersect_filtering_query_warned
> >>> intersect_filtering_query_enabled
> >>> streaming_slow_events_log_timeout
> >>> repair_state_expires
> >>> repair_state_size
> >>> paxos_variant
> >>> skip_paxos_repair_on_topology_change
> >>> paxos_purge_grace_period
> >>> paxos_on_linearizability_violations
> >>> paxos_state_purging
> >>> paxos_repair_enabled
> >>> paxos_topology_repair_no_dc_checks
> >>> paxos_topology_repair_strict_each_quorum
> >>> skip_paxos_repair_on_topology_change_keyspaces
> >>> paxos_contention_wait_randomizer
> >>> paxos_contention_min_wait
> >>> paxos_contention_max_wait
> >>> paxos_contention_min_delta
> >>> paxos_repair_parallelism
> >>> sstable_read_rate_persistence_enabled
> >>> client_request_size_metrics_enabled
> >>> max_top_size_partition_count
> >>> max_top_tombstone_partition_count
> >>> min_tracked_partition_size
> >>> min_tracked_partition_tombstone_count
> >>> top_partitions_enabled
> >>> severity_during_decommission
> >>> progress_barrier_min_consistency_level
> >>> progress_barrier_default_consistency_level
> >>> progress_barrier_timeout
> >>> progress_barrier_backoff
> >>> discovery_timeout
> >>> unsafe_tcm_mode
> >>> cql_start_time
> >>> native_transport_throw_on_overload
> >>> native_transport_queue_max_item_age_threshold
> >>> native_transport_min_backoff_on_queue_overload
> >>> native_transport_max_backoff_on_queue_overload
> >>> native_transport_timeout
> >>> enforce_native_deadline_for_hints
> >>> Total missed:112
> >>>
> >>>
> >>>
> >>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <[email protected]> 
> >>> wrote:
> >>>
> >>> It should also work the other way around. If there is a property which is 
> >>> commented out in yaml and it is not in Config.java, that should fail as 
> >>> well. If it is not commented out and it is not in Config.java, that will 
> >>> fail at runtime, because the loader rejects unrecognized properties.
> >>>
> >>> This will be used in practice very rarely as we seldom remove the 
> >>> properties in Config but if we do and a property is commented out, we 
> >>> should not ship a dead property name, even commented out.
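
The reverse direction could be sketched along these lines, assuming the same line-based matching as the forward check: a top-level key, optionally commented out with "#" or "# ". The class and method names are hypothetical. A known limitation of this approach is that comment prose which happens to look like "# word:" would be reported as a dead property, so a real check would need an exclusion list or a stricter pattern.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReverseYamlCheckSketch
{
    // Matches "name:", "#name:" or "# name:" at the very start of a line.
    // Indented (nested) keys are deliberately ignored.
    private static final Pattern TOP_LEVEL_KEY =
        Pattern.compile("^(?:# ?)?([a-z][a-z0-9_]*):");

    // Returns yaml keys, live or commented out, that have no matching
    // field name in the config class.
    static Set<String> deadYamlProperties(List<String> yamlLines, Set<String> configFieldNames)
    {
        Set<String> dead = new TreeSet<>();
        for (String line : yamlLines)
        {
            Matcher matcher = TOP_LEVEL_KEY.matcher(line);
            if (matcher.find() && !configFieldNames.contains(matcher.group(1)))
                dead.add(matcher.group(1));
        }
        return dead;
    }

    public static void main(String[] args)
    {
        List<String> yaml = List.of("cluster_name: 'Test Cluster'",
                                    "# my_cool_property: true",
                                    "# removed_property: 5");
        Set<String> fields = Set.of("cluster_name", "my_cool_property");
        System.out.println("Dead yaml entries: " + deadYamlProperties(yaml, fields));
    }
}
```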
> >>>
> >>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <[email protected]> wrote:
> >>>
> >>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we might 
> >>> > indeed add it, also commented out. I think it would be quite easy to 
> >>> > check against yaml if there is a line starting with "# my_cool_property" 
> >>> > or just with "my_cool_property". Both cases would satisfy the check.
> >>>
> >>> Makes sense, I think this would be good to have as a lint or test to 
> >>> easily catch overlooks during review.
> >>>
> >>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič 
> >>> <[email protected]> wrote:
> >>>
> >>>
> >>>
> >>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <[email protected]> wrote:
> >>>
> >>> > from time to time I see configuration properties in Config.java and 
> >>> > they are clearly not in cassandra.yaml. Not every property in Config is 
> >>> > in cassandra.yaml. I would like to know if there is some specific 
> >>> > reason behind that.
> >>>
> >>> I think one of the original reasons was to "hide" advanced configs that 
> >>> are not meant to be updated, unless in very niche circumstances. However 
> >>> I think this has been extrapolated to non-advanced settings.
> >>>
> >>> > A related question is whether we could have a build-time check 
> >>> > that all properties in Config have to be in cassandra.yaml and fail the 
> >>> > build if a property in Config does not have its counterpart in yaml.
> >>>
> >>> Are you saying every configuration property should be commented-out, or 
> >>> do you think that every Config property should be specified in 
> >>> cassandra.yaml with its default value, uncommented? One issue with that is 
> >>> that you could cause user confusion if you "reveal" a niche/advanced 
> >>> config that is not meant to be updated. I think this would be addressed 
> >>> by the @HiddenInYaml flag you are proposing in a later post.
> >>>
> >>>
> >>> Yes, they can stay hidden, but we should annotate them with @Hidden or 
> >>> similar. As of now, if that property is not in yaml, we just don't know 
> >>> if it was forgotten to be added or if we have not added it on purpose.
> >>>
> >>> They can keep being commented out if they currently are. Imagine a 
> >>> property in Config.java
> >>>
> >>> public boolean my_cool_property = true;
> >>>
> >>> and then this in cassandra.yaml
> >>>
> >>> # my_cool_property: true
> >>>
> >>> It is completely ok.
> >>>
> >>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might indeed 
> >>> add it, also commented out. I think it would be quite easy to check 
> >>> against yaml if there is a line starting with "# my_cool_property" or just 
> >>> with "my_cool_property". Both cases would satisfy the check.
> >>>
> >>>
> >>>
> >>> > There are dozens of properties in Config and I have a strong suspicion 
> >>> > that we failed to publish some to yaml so users do not even know such a 
> >>> > property exists and as of now we do not even know which they are.
> >>>
> >>> I believe this is a problem. I think most properties should be in 
> >>> cassandra.yaml, unless they are very advanced or not meant to be updated.
> >>>
> >>> Another tangential issue is that there are features/settings that don't 
> >>> even have a Config entry, but are just controlled by JVM properties.
> >>>
> >>> I think that we should attempt to unify Config and jvm properties under a 
> >>> predictable structure. For example, if there is a YAML config 
> >>> enable_user_defined_functions, then there should be a respective JVM flag 
> >>> -Dcassandra.enable_user_defined_functions, and vice versa.
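
One possible reading of that convention, as a sketch only. The "cassandra." prefix comes from the example above; the helper names and the rule that a JVM flag wins over the yaml value are assumptions made for illustration, not existing Cassandra behavior.

```java
import java.util.Properties;

class JvmPropertyMappingSketch
{
    static final String PREFIX = "cassandra.";

    // enable_user_defined_functions -> cassandra.enable_user_defined_functions
    static String toSystemPropertyKey(String yamlName)
    {
        return PREFIX + yamlName;
    }

    // Inverse mapping; returns null when the key does not follow the convention.
    static String toYamlName(String systemPropertyKey)
    {
        return systemPropertyKey.startsWith(PREFIX)
               ? systemPropertyKey.substring(PREFIX.length())
               : null;
    }

    // Assumed precedence: a -D flag, if present, overrides the yaml value.
    static String resolve(String yamlName, String yamlValue, Properties systemProperties)
    {
        return systemProperties.getProperty(toSystemPropertyKey(yamlName), yamlValue);
    }

    public static void main(String[] args)
    {
        // Picks up -Dcassandra.enable_user_defined_functions=... if it was passed.
        String value = resolve("enable_user_defined_functions", "false", System.getProperties());
        System.out.println("enable_user_defined_functions = " + value);
    }
}
```

A mechanical mapping like this would also make it possible to test that every yaml property has a JVM counterpart, in the same way the yaml coverage is being tested.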
> >>>
> >>>
> >>> Yeah, good idea.
> >>>
> >>>
> >>>
> >>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič 
> >>> <[email protected]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> from time to time I see configuration properties in Config.java and they 
> >>> are clearly not in cassandra.yaml. Not every property in Config is in 
> >>> cassandra.yaml. I would like to know if there is some specific reason 
> >>> behind that.
> >>>
> >>> A related question is whether we could have a build-time check that 
> >>> all properties in Config have to be in cassandra.yaml and fail the build 
> >>> if a property in Config does not have its counterpart in yaml.
> >>>
> >>> There are dozens of properties in Config and I have a strong suspicion 
> >>> that we failed to publish some to yaml so users do not even know such a 
> >>> property exists and as of now we do not even know which they are.
> >>>
> >>>
> >>>
> >>> --
> >>> Dmitry Konstantinov
> >>>
> >>>
> >>>
> >>> --
> >>> Dmitry Konstantinov
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Dmitry Konstantinov
