I've been working on this ticket. I have a JUnit test
(ConfigYamlCoverageTest) with forward and reverse checks, and an
@HiddenInYaml annotation ready to go. I've triaged all 131 Config.java
fields that are missing from the yaml files and have decision suggestions
for most of them.

Before I share the full spreadsheet, I'd like to gauge whether we can
handle some groups at the group level rather than property-by-property:

   - *Deprecated fields (13 fields):* back_pressure_*, otc_coalescing_*,
   otc_backlog_*, windows_timer_interval, max_streaming_retries,
   repair_session_max_tree_depth, scripted_user_defined_functions_enabled,
   use_deterministic_table_id, cms_default_retry_backoff,
   cms_default_max_retry_backoff. These are all @Deprecated in Config.java.
   The test already skips them automatically, but should they also be
   annotated @HiddenInYaml for clarity?
   - *Paxos v2 (16 fields):* All the paxos_* and skip_paxos_repair_* fields
   from CEP-14. Internal LWT protocol tuning.
   - *TCM/CMS (12 fields):* CMS timeouts/retries (cms_*), metadata
   snapshotting, discovery timeout, unsafe_tcm_mode, progress barrier fields
   (progress_barrier_*), and short_rpc_timeout. All internal CEP-21
   infrastructure.
   - *Accord (3 fields):* accord_preaccept_timeout,
   concurrent_accord_operations, consensus_migration_cache_size. CEP-15
   internals. Still maturing.

Does anyone see a reason any of these should be exposed in cassandra.yaml
rather than marked @HiddenInYaml? If we can agree on these groups, it
reduces the remaining discussion to about 13 individual fields which is
much more manageable.

On Tue, 4 Mar 2025 at 09:31, Dmitry Konstantinov <[email protected]> wrote:

> >>
> https://docs.google.com/spreadsheets/d/11MOxhNqwE1tWP4ex2gzKG2pmeAWFaHDKo-CRp25h9BU/edit?gid=0#gid=0
> We still have a lot of rows empty. I have added many default values and a
> Cassandra version when a parameter was introduced (to differentiate some
> recent parameters from old ones) based on source code but it would be nice
> to get a description for parameters from the authors as well as
> classification exposed/hidden.
> Maybe we should not wait for collecting info about all parameters and
> update what we have + use a threshold in the Ant validation task to fail
> when new missed parameters are added. The logic in the dev branch here
> https://issues.apache.org/jira/browse/CASSANDRA-20249 already supports a
> threshold.
>
>
> On Tue, 28 Jan 2025 at 00:04, Josh McKenzie <[email protected]> wrote:
>
>> Good point re: the implications of parsing and durability in the face of
>> seeing unknown or missing parameters. I don't think widening the scope on
>> that would be ideal, especially considering the entire impetus for this
>> conversation is "we've misbehaved with our config and have a bunch of
>> undocumented stuff we're not sure is still useful, or what it's for". =/
>>
>> On Mon, Jan 27, 2025, at 3:41 PM, Štefan Miklošovič wrote:
>>
>> "we take "unclaimed" items and move them to their own InternalConfig.java
>> or something"
>>
>> This is interesting. If we are meant to be still able to put these
>> properties into cassandra.yaml (even they are "internal ones") and they
>> would be just in InternalConfig.java for some basic separation of internal
>> / user-facing configuration, then we would need to have two yaml loaders:
>>
>> 1) the one as we have now which loads cassandra.yaml it into Config.java
>> 2) the second one which would load cassandra.yaml into InternalConfig.java
>>
>> For both cases, we could not fail when there are unrecognized properties
>> in cassandra.yaml while parsing it (1), because every loader, for
>> Config.java as well as InternalConfig.java, is parsing just some "subset"
>> of yaml.
>>
>> (1)
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java#L443-L444
>>
>> If we just had "public InternalConfig internal = new InternalConfig" as a
>> field in Config.java, then this would lead to properties being effectively
>> renamed in cassandra.yaml like
>>
>> internal:
>>     some_currently_internal_property: false
>>
>> instead of just
>>
>> some_currently_internal_property: false
>>
>> I do not think we want to have them renamed / under different
>> configuration sections in yaml. I get that they are "internal" etc but we
>> just don't know how / where it is used and deployed and just blindly
>> renaming them is not a good idea imho.
>>
>> On Mon, Jan 27, 2025 at 8:46 PM Josh McKenzie <[email protected]>
>> wrote:
>>
>>
>> This may be an off-base comparison, but this reminds me of struggles
>> we've had getting to 0 failing unit tests before and the debates on fencing
>> off a snapshot of the current "failure set" so you can have a set point
>> where no further degradation is allowed in a primary data set.
>>
>> All of which is to say - maybe at the end of the spreadsheet, we take
>> "unclaimed" items and move them to their own InternalConfig.java or
>> something and add an ant target that a) disallows further addition to
>> InternalConfig.java w/out throwing an error / needing whitelist update, and
>> b) disallows further regression in the Config.java <-> cassandra.yaml
>> relationship for non-annotated fields.
>>
>> That way we can at least halt the progression of the disease even if
>> we're stymied on cleaning up some of the existing symptoms.
>>
>> On Mon, Jan 27, 2025, at 1:38 PM, Štefan Miklošovič wrote:
>>
>> Indeed, we need to balance that and thoughtfully choose what is going to
>> be added and what not. However, we should not hide something which is meant
>> to be tweaked by a user. The config is intimidating mostly because
>> everything is just in one file. I merely remember discussions a few years
>> ago which were about splitting cassandra.yaml into multiple files which
>> would be focused just on one subsystem / would cover some logically
>> isolated domain.
>>
>> Anyway, I think the main goal of this effort for now would be to at least
>> map where we are at. Some of them are genuinely missing. E.g. guardrails,
>> how is a user meant to know about that if it is not even documented ...
>>
>> On Mon, Jan 27, 2025 at 6:16 PM Chris lohfink <[email protected]> wrote:
>>
>> Might be a bit of a balance between exposing what people actually are
>> likely to need to modify vs having a super intimidating config file. It's
>> already nearly 2000 lines. Personally I'd rather see some
>> auto-documentation or something that's in the docs
>> <https://cassandra.apache.org/doc/latest/cassandra/managing/configuration/cass_yaml_file.html>
>> than an effort to manually add another 1000 lines.
>>
>> Chris
>>
>> On Fri, Jan 24, 2025 at 9:41 AM Dmitry Konstantinov <[email protected]>
>> wrote:
>>
>> Maybe I missed some patterns but it looks like a pretty good estimation,
>> I did like 10 random checks manually to verify :-)
>> I will try to make an ant target with a similar logic (hopefully, during
>> the weekend)
>> I will create a ticket to track this activity (to share attachments there
>> to not overload the thread with such outputs in future).
>>
>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <[email protected]>
>> wrote:
>>
>> Oh my god, 112? :DD I was thinking it would be less than 10.
>>
>> Anyway, I think we need to integrate this to some ant target. If you
>> expanded on this, that would be great.
>>
>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <[email protected]>
>> wrote:
>>
>> A very primitive implementation of the 1st idea below:
>>
>> String configUrl = 
>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
>> Field[] allFields = Config.class.getFields();
>> List<String> topLevelPropertyNames = new ArrayList<>();
>> for(Field field : allFields)
>> {
>>     if (!Modifier.*isStatic*(field.getModifiers()))
>>     {
>>         topLevelPropertyNames.add(field.getName());
>>     }
>> }
>>
>> URL url = new URL(configUrl);
>> List<String> lines = Files.*readAllLines*(Paths.*get*(url.toURI()));
>>
>> int missedCount = 0;
>> for (String propertyName : topLevelPropertyNames)
>> {
>>     boolean found = false;
>>     for (String line : lines)
>>     {
>>         if (line.startsWith(propertyName + ":")
>>             || line.startsWith("#" + propertyName + ":")
>>             || line.startsWith("# " + propertyName + ":")) {
>>             found = true;
>>             break;
>>         }
>>     }
>>     if (!found)
>>     {
>>         missedCount++;
>>         System.*out*.println(propertyName);
>>     }
>> }
>> System.*out*.println("Total missed:" + missedCount);
>>
>>
>> It prints the following config property names which are defined in 
>> Config.java but not present as "property" or "# property " in a file:
>>
>> permissions_cache_max_entries
>> roles_cache_max_entries
>> credentials_cache_max_entries
>> auto_bootstrap
>> force_new_prepared_statement_behaviour
>> use_deterministic_table_id
>> repair_request_timeout
>> stream_transfer_task_timeout
>> cms_await_timeout
>> cms_default_max_retries
>> cms_default_retry_backoff
>> epoch_aware_debounce_inflight_tracker_max_size
>> metadata_snapshot_frequency
>> available_processors
>> repair_session_max_tree_depth
>> use_offheap_merkle_trees
>> internode_max_message_size
>> native_transport_max_message_size
>> native_transport_max_request_data_in_flight_per_ip
>> native_transport_max_request_data_in_flight
>> native_transport_receive_queue_capacity
>> min_free_space_per_drive
>> max_space_usable_for_compactions_in_percentage
>> reject_repair_compaction_threshold
>> concurrent_index_builders
>> max_streaming_retries
>> commitlog_max_compression_buffers_in_pool
>> max_mutation_size
>> dynamic_snitch
>> failure_detector
>> use_creation_time_for_hint_ttl
>> key_cache_migrate_during_compaction
>> key_cache_invalidate_after_sstable_deletion
>> paxos_cache_size
>> file_cache_round_up
>> disk_optimization_estimate_percentile
>> disk_optimization_page_cross_chance
>> purgeable_tobmstones_metric_granularity
>> windows_timer_interval
>> otc_coalescing_strategy
>> otc_coalescing_window_us
>> otc_coalescing_enough_coalesced_messages
>> otc_backlog_expiration_interval_ms
>> scripted_user_defined_functions_enabled
>> user_defined_functions_threads_enabled
>> allow_insecure_udfs
>> allow_extra_insecure_udfs
>> user_defined_functions_warn_timeout
>> user_defined_functions_fail_timeout
>> user_function_timeout_policy
>> back_pressure_enabled
>> back_pressure_strategy
>> repair_command_pool_full_strategy
>> repair_command_pool_size
>> block_for_peers_timeout_in_secs
>> block_for_peers_in_remote_dcs
>> skip_stream_disk_space_check
>> snapshot_on_repaired_data_mismatch
>> validation_preview_purge_head_start
>> initial_range_tombstone_list_allocation_size
>> range_tombstone_list_growth_factor
>> snapshot_on_duplicate_row_detection
>> check_for_duplicate_rows_during_reads
>> check_for_duplicate_rows_during_compaction
>> autocompaction_on_startup_enabled
>> auto_optimise_inc_repair_streams
>> auto_optimise_full_repair_streams
>> auto_optimise_preview_repair_streams
>> consecutive_message_errors_threshold
>> internode_error_reporting_exclusions
>> compact_tables_enabled
>> vector_type_enabled
>> intersect_filtering_query_warned
>> intersect_filtering_query_enabled
>> streaming_slow_events_log_timeout
>> repair_state_expires
>> repair_state_size
>> paxos_variant
>> skip_paxos_repair_on_topology_change
>> paxos_purge_grace_period
>> paxos_on_linearizability_violations
>> paxos_state_purging
>> paxos_repair_enabled
>> paxos_topology_repair_no_dc_checks
>> paxos_topology_repair_strict_each_quorum
>> skip_paxos_repair_on_topology_change_keyspaces
>> paxos_contention_wait_randomizer
>> paxos_contention_min_wait
>> paxos_contention_max_wait
>> paxos_contention_min_delta
>> paxos_repair_parallelism
>> sstable_read_rate_persistence_enabled
>> client_request_size_metrics_enabled
>> max_top_size_partition_count
>> max_top_tombstone_partition_count
>> min_tracked_partition_size
>> min_tracked_partition_tombstone_count
>> top_partitions_enabled
>> severity_during_decommission
>> progress_barrier_min_consistency_level
>> progress_barrier_default_consistency_level
>> progress_barrier_timeout
>> progress_barrier_backoff
>> discovery_timeout
>> unsafe_tcm_mode
>> cql_start_time
>> native_transport_throw_on_overload
>> native_transport_queue_max_item_age_threshold
>> native_transport_min_backoff_on_queue_overload
>> native_transport_max_backoff_on_queue_overload
>> native_transport_timeout
>> enforce_native_deadline_for_hints
>> Total missed:112
>>
>>
>>
>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <[email protected]>
>> wrote:
>>
>> It should also work the other way around. If there is a property which is
>> commented out in yaml and it is not in Config.java, that should fail as
>> well. If it is not commented out and it is not in Config.java, that will
>> fail in runtime as it fails on unrecognized property.
>>
>> This will be used in practice very rarely as we seldom remove the
>> properties in Config but if we do and a property is commented out, we
>> should not ship a dead property name, even commented out.
>>
>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <[email protected]> wrote:
>>
>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>> indeed add it, also commented out. I think it would be quite easy to check
>> against yaml if there is a line starting on "# my_cool_property" or just on
>> "my_cool_property". Both cases would satisfy the check.
>>
>> Makes sense, I think this would be good to have as a lint or test to
>> easily catch overlooks during review.
>>
>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <[email protected]>
>> wrote:
>>
>>
>>
>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <[email protected]> wrote:
>>
>> > from time to time I see configuration properties in Config.java and
>> they are clearly not in cassandra.yaml. Not every property in Config is in
>> cassandra.yaml. I would like to know if there is some specific reason
>> behind that.
>>
>> I think one of the original reasons was to "hide" advanced configs that
>> are not meant to be updated, unless in very niche circumstances. However I
>> think this has been extrapolated to non-advanced settings.
>>
>> > Question related to that is if we could not have a build-time check
>> that all properties in Config have to be in cassandra.yaml and fail the
>> build if a property in Config does not have its counterpart in yaml.
>>
>> Are you saying every configuration property should be commented-out, or
>> do you think that every Config property should be specified in
>> cassandra.yaml with their default uncomented ? One issue with that is that
>> you could cause user confusion if you "reveal" a niche/advanced config that
>> is not meant to be updated. I think this would be addressed by
>> the @HiddenInYaml flag you are proposing in a later post.
>>
>>
>> Yes, then can stay hidden, but we should annotate it with @Hidden or
>> similar. As of now, if that property is not in yaml, we just don't know if
>> it was forgotten to be added or if we have not added it on purpose.
>>
>> They can keep being commented out if they currently are. Imagine a
>> property in Config.java
>>
>> public boolean my_cool_property = true;
>>
>> and then this in cassandra.yaml
>>
>> # my_cool_property: true
>>
>> It is completely ok.
>>
>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might indeed
>> add it, also commented out. I think it would be quite easy to check against
>> yaml if there is a line starting on "# my_cool_property" or just on
>> "my_cool_property". Both cases would satisfy the check.
>>
>>
>>
>> > There are dozens of properties in Config and I have a strong suspicion
>> that we missed to publish some to yaml so users do not even know such a
>> property exists and as of now we do not even know which they are.
>>
>> I believe this is a problem. I think most properties should be in
>> cassandra.yaml, unless they are very advanced or not meant to be updated.
>>
>> Another tangential issue is that there are features/settings that don't
>> even have a Config entry, but are just controlled by JVM properties.
>>
>> I think that we should attempt to unify Config and jvm properties under a
>> predictable structure. For example, if there is a YAML config
>> enable_user_defined_functions, then there should be a respective JVM flag
>> -Dcassandra.enable_user_defined_functions, and vice versa.
>>
>>
>> Yeah, good idea.
>>
>>
>>
>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <[email protected]>
>> wrote:
>>
>> Hello,
>>
>> from time to time I see configuration properties in Config.java and they
>> are clearly not in cassandra.yaml. Not every property in Config is in
>> cassandra.yaml. I would like to know if there is some specific reason
>> behind that.
>>
>> Question related to that is if we could not have a build-time check that
>> all properties in Config have to be in cassandra.yaml and fail the build if
>> a property in Config does not have its counterpart in yaml.
>>
>> There are dozens of properties in Config and I have a strong suspicion
>> that we missed to publish some to yaml so users do not even know such a
>> property exists and as of now we do not even know which they are.
>>
>>
>>
>> --
>> Dmitry Konstantinov
>>
>>
>>
>> --
>> Dmitry Konstantinov
>>
>>
>>
>>
>
> --
> Dmitry Konstantinov
>

Reply via email to