Regarding schema spec languages, I was looking for a shared resource because multiple services (operators, a browser-based user interface, etc.) need to validate configuration.
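
To make the shared-resource idea concrete, here is a minimal sketch of the kind of check I have in mind. It assumes a language-neutral schema (for example, a JSON Schema file) has already been parsed into a map from key name to expected type. The `SharedSchemaCheck` class, its `validate` method, and the key/type pairs are hypothetical names for illustration, not anything that exists in Sidecar or Cassandra today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: validate proposed settings against a schema that any
// service (operator, UI, Sidecar) could load from one shared definition.
public class SharedSchemaCheck {

    // Returns one message per problem; an empty list means the settings pass.
    public static List<String> validate(Map<String, Class<?>> schema,
                                        Map<String, Object> settings) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Object> e : settings.entrySet()) {
            Class<?> expected = schema.get(e.getKey());
            if (expected == null) {
                // Key is not described by the schema at all.
                errors.add("unknown key: " + e.getKey());
            } else if (!expected.isInstance(e.getValue())) {
                // Key is known, but the value has the wrong Java type.
                errors.add("wrong type for " + e.getKey()
                           + ": expected " + expected.getSimpleName());
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Stand-in for a schema parsed from a shared file; the types are assumptions.
        Map<String, Class<?>> schema = Map.of(
            "num_tokens", Integer.class,
            "cluster_name", String.class);

        System.out.println(validate(schema, Map.of("num_tokens", 16)));
        System.out.println(validate(schema, Map.of("num_tokens", "16")));
        System.out.println(validate(schema, Map.of("hidden_param", true)));
    }
}
```

The third call is the hidden-parameter case: a strict check like this rejects any key the shared schema does not describe, which is why being able to supply or extend the schema matters to me.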
For manual configuration changes, a number of parameters are hot-reloaded from cassandra.yaml today, and others are changed at runtime via nodetool invocations. Even if the process was started with the configuration from the Sidecar, that may not be what's actively being used.

~Chris

Christopher Bradford

On Wed, Mar 18, 2026 at 1:32 PM Paulo Motta <[email protected]> wrote:

> Thanks for your input, Chris! See answers below.
>
> > *Do you have any thoughts on what the schema may look like?*
>
> My idea is to use Cassandra's own Java validation mechanism to check the
> updated setting against the schema, for example:
>
> ```
> YamlConfigurationLoader loader = new YamlConfigurationLoader();
> Map<String, Object> settings = Map.of("setting_name", value);
> Config config = loader.fromMap(settings, Config.class); // throws ConfigurationException on invalid
> ```
>
> The downside is that we will need to maintain a copy of Cassandra's
> Config.java class in sidecar and keep this in sync with Cassandra's
> every time a new major is released, but we would have to update the schema
> anyway even if using a schema spec language.
>
> > *Are there any plans to support externally provided schemas?*
>
> I haven't thought about externally provided schemas, since this doesn't
> really fit the schema model I'm planning to use. I have added a parameter
> `skipValidation` to bypass schema validation for lesser known or
> not-yet-supported settings.
>
> > *What is the scope of a configuration?*
>
> The scope of a configuration is all instances managed by a single sidecar
> instance, which means they're collocated in the same physical machine. The
> scope of the CEP is to provide only instance-based configuration,
> wider-scope configuration should be provided by an external configuration
> manager or orchestrator.
>
> > *What happens when someone logs into a node and changes a configuration file manually?*
>
> This scenario will not be possible if sidecar APIs are used to start the
> node, because the node startup endpoint will refresh the disk configuration
> with the base template + provider overlay prior to starting the node
> overwriting any manual changes and ensuring only the configuration modified
> by the API will be loaded.
>
> I hope I have provided answers to your questions, please let me know if
> anything is not clear.
>
> On Tue, 17 Mar 2026 at 16:20 Christopher Bradford <[email protected]>
> wrote:
>
>> Thank you for sharing Paulo, this is awesome! It's interesting to look at
>> the myriad of ways users implement configuration for cassandra (manual,
>> automation templates, etc). I'm very interested in the version specific
>> schema validation for configuration, configuration drift and uniformity.
>> Some questions:
>>
>> *Do you have any thoughts on what this may look like?*
>> For example OpenAPI spec, JSON Schema etc? DataStax maintained a set of
>> configuration
>> definitions <https://github.com/datastax/cass-config-definitions> and
>> templates for a time. It felt like there was a significant barrier to entry
>> around updating them. These were leveraged by the cass-config-builder
>> <https://github.com/datastax/cass-config-builder> and K8ssandra, but
>> that has been deprecated in favor of a different approach today.
>>
>> *Are there any plans to support externally provided schemas?*
>> Can I provide my own schema file and validation parameters and expect it
>> to work with Sidecar? We've encountered a few examples of hidden parameters
>> in cassandra.yaml (depending on the flavor of C* being run) that would have
>> tripped up strict validation.
>>
>> *What is the scope of a configuration?*
>> It's fairly common to provide a unique configuration per DC based on
>> hardware, workload, etc (not to mention any changes to rackdc.properties).
>> As a user of this system are we pushing overlays (and or configuration HTTP
>> requests) to the sidecar running on each node, each DC, per cluster with
>> flags? Looking at the current design it appears as though configuration is
>> handled per node.
>>
>> *What happens when someone logs into a node and changes a configuration
>> file manually?*
>> I wonder if it would be helpful to provide an endpoint (possibly
>> enriching an existing one) to report when the configuration files on disk
>> differ from the values materialized by sidecar. I remember generating a
>> number of reports for users highlighting the inconsistency in configuration
>> between nodes. Ideally Sidecar resolves the possibility of this happening,
>> but a user with ssh and a bit of determination will always find a way. I
>> could see this being leveraged for determining which nodes may need a
>> restart to pick up configuration changes (assuming lack of hot-reloading
>> support for specific fields) as well as running reports for misconfigured
>> instances.
>>
>> I see there may be a little overlap with Josh's questions, my apologies
>> for any duplication. Again, thank you for the CEP.
>>
>> Cheers,
>> ~Chris
>> Christopher Bradford
>>
>> On Tue, Mar 17, 2026 at 3:04 PM Josh McKenzie <[email protected]>
>> wrote:
>>
>>> Fantastic work here Paulo et al. Just finished first read through and
>>> have some questions and observations.
>>>
>>> Operators can update the base template for all instances by modifying
>>> sidecar.yaml and restarting Sidecar, while individual instance
>>> customizations remain preserved in their respective overlays.
>>>
>>> Could we allow hot reloading via API endpoint trigger rather than
>>> requiring bouncing the sidecar?
>>>
>>> It then validates this result against a version-aware configuration
>>> schema maintained per Cassandra major version to ensure that all
>>> cassandraYaml keys being updated are recognized for that version.
>>> Updates of unknown keys are rejected to prevent Cassandra startup failures.
>>>
>>> Am I reading this correctly that this design would be a single
>>> cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java
>>> + cassandra.yaml values in a patch release, won't that run afoul of this
>>> design?
>>>
>>> Q: I didn't see anything about polling an external config provider for
>>> updates to configuration; config on a sidecar could end up stale relative
>>> to the config stored in an external system if not restarted. Is that
>>> something we're going to leave for future work or leave in the hands of
>>> operators?
>>>
>>> Q: If a new config param is added in a patch version and we don't have
>>> an updated base cassandra.yaml, we'd be moving from "cassandra.yaml
>>> w/default overrides what's in Config.java" to "we use what's in
>>> Config.java". *In theory* we'd always keep our values in cassandra.yaml
>>> in sync w/Config.java, but this change to the sidecar base .yaml as
>>> authoritative means we'd need a new mechanism (human toil, automated
>>> generation, ?) to keep the base sidecar C* .yaml config in sync w/whatever
>>> MINOR we're running I think.
>>>
>>> Q: What happens if someone mutates parameters via the REST API that are
>>> Bad News? Thinking things like num_tokens, partitioner, cluster_name,
>>> initial_token, etc. Maybe we have a built-in blocklist for .yaml values you
>>> can't mutate remotely and 422 on them, or provide a base list of
>>> blocklisted .yaml params disallowed in overlay PATCH to prevent these kinds
>>> of issues and make them more "durable" in the base .yaml config file?
>>> There's a tension here between wanting to be able to setup the normal base
>>> config via API and wanting to prevent disastrous changes via the API.
>>>
>>> Q: Which leads to the question: how do operators push out the base .yaml
>>> config across a fleet?
>>> We have the ConfigurationProvider abstraction for
>>> overlays, but how do operators get the base template out there? Especially
>>> since you can / will have a different base template per instance on the
>>> node, this seems like a gap where operators would still have a lot of toil.
>>> Maybe some kind of "pull the base cassandra.yaml from the local node as
>>> base template if none is provided" so the default is to defer to what's in
>>> the local node, lock it in, then add overlays? I think a mechanism like
>>> this could help w/the MINOR version .yaml drift as well, if we had a
>>> mechanism to defer to the node local base .yaml file and then overlay
>>> things on top of it.
>>>
>>> And lastly - apologies if I misread or misunderstood anything in the
>>> CEP. If nothing else it'll give us insight into areas to flesh out verbiage
>>> for clarity.
>>>
>>> On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote:
>>>
>>> Hi everyone,
>>>
>>> I'd like to propose CEP-62: Cassandra Configuration Management via
>>> Sidecar for discussion by the community.
>>>
>>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management
>>> capabilities to Sidecar, giving operators the ability to start and stop
>>> Cassandra instances programmatically. However, Sidecar currently has no way
>>> to manipulate the configuration files that those instances consume at
>>> startup.
>>>
>>> Many Cassandra settings (memtable configuration, SSTable settings,
>>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and
>>> must be set in cassandra.yaml or JVM options files, requiring a restart to
>>> take effect. Managing these files manually or through custom tooling is
>>> cumbersome and lacks a stable API.
>>>
>>> This CEP extends Sidecar's lifecycle management by adding configuration
>>> management capabilities for persisted configuration artifacts.
>>> It introduces a REST API for reading and updating cassandra.yaml and JVM
>>> options, a pluggable ConfigurationProvider abstraction for integration with
>>> centralized configuration systems (etcd, Consul, or custom backends), and
>>> version-aware validation to prevent startup failures.
>>>
>>> This CEP also serves as a prerequisite for future Cassandra upgrades via
>>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires
>>> updating storage_compatibility_mode in cassandra.yaml. The configuration
>>> management capabilities introduced here will enable Sidecar to orchestrate
>>> such upgrades by updating configuration artifacts alongside binary version
>>> changes.
>>>
>>> The CEP is linked here:
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>>
>>> Looking forward to your feedback!
>>>
>>> Thanks,
>>>
>>> Paulo
>>>
>>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
