> The downside is that we will need to maintain a copy of Cassandra's 
> Config.java class in Sidecar and keep it in sync with Cassandra's every time 
> a new major is released, but we would have to update the schema anyway even 
> if using a schema spec language.
Rather than copying, can we automate pulling this file down from each supported 
C* version as part of the build process? We need to unwind a bunch of code 
duplication in the ecosystem already.

Ultimately, I'd like to see Config and some shared utility classes (CommitLog 
reading, SSTable reading/writing, etc) end up in a shared artifact which would 
make a lot of this kind of stuff (cross-ecosystem understanding on formats, 
versioning, and re-use) a *lot* simpler and cleaner for all of us. Never mind 
the versioning on Hints / CommitLog / MessagingService and the embedding of MS 
versions inside CommitLog files, which has given rise to the compounding 
complexity and confusion around storage_compatibility_mode setups.

On Wed, Mar 18, 2026, at 1:30 PM, Paulo Motta wrote:
> Thanks for your input, Chris! See answers below.
> 
> > *Do you have any thoughts on what the schema may look like?*
> 
> My idea is to use Cassandra's own Java validation mechanism to check the 
> updated setting against the schema, for example:
> 
> ```
> YamlConfigurationLoader loader = new YamlConfigurationLoader();
> Map<String, Object> settings = Map.of("setting_name", value);
> // throws ConfigurationException if the setting is unknown or invalid
> Config config = loader.fromMap(settings, Config.class);
> ```
> 
> The downside is that we will need to maintain a copy of Cassandra's 
> Config.java class in Sidecar and keep it in sync with Cassandra's every time 
> a new major is released, but we would have to update the schema anyway even 
> if using a schema spec language.
> 
> > *Are there any plans to support externally provided schemas?*
> 
> I haven't thought about externally provided schemas, since they don't 
> really fit the schema model I'm planning to use. I have added a parameter 
> `skipValidation` to bypass schema validation for lesser-known or 
> not-yet-supported settings.
> 
> > *What is the scope of a configuration?*
> 
> The scope of a configuration is all instances managed by a single sidecar 
> instance, which means they are colocated on the same physical machine. The 
> scope of the CEP is to provide only instance-based configuration; wider-scope 
> configuration should be provided by an external configuration manager or 
> orchestrator.
> 
> > *What happens when someone logs into a node and changes a configuration
> file manually?*
> 
> This scenario will not be possible if sidecar APIs are used to start the 
> node: the node startup endpoint refreshes the on-disk configuration with the 
> base template plus the provider overlay before starting the node, overwriting 
> any manual changes and ensuring that only configuration modified through the 
> API is loaded.
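The refresh step described above is essentially a map merge where the overlay wins; a minimal sketch (class and method names are invented for illustration, not actual Sidecar code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: the startup refresh is a merge where overlay keys win over
// the base template, so manual edits to the file on disk are discarded.
public class ConfigMaterializer {
    public static Map<String, Object> materialize(Map<String, Object> baseTemplate,
                                                  Map<String, Object> overlay) {
        Map<String, Object> merged = new HashMap<>(baseTemplate);
        merged.putAll(overlay); // overlay entries override base entries
        return merged;
    }
}
```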
> 
> I hope I have provided answers to your questions, please let me know if 
> anything is not clear.
> 
> On Tue, 17 Mar 2026 at 16:20 Christopher Bradford <[email protected]> 
> wrote:
>> Thank you for sharing, Paulo, this is awesome! It's interesting to look at 
>> the myriad ways users implement configuration for Cassandra (manual, 
>> automation templates, etc.). I'm very interested in the version-specific 
>> schema validation for configuration, configuration drift, and uniformity. 
>> Some questions:
>> 
>> *Do you have any thoughts on what this may look like?*
>> For example OpenAPI spec, JSON Schema etc? DataStax maintained a set of 
>> configuration definitions 
>> <https://github.com/datastax/cass-config-definitions> and templates for a 
>> time. It felt like there was a significant barrier to entry around updating 
>> them. These were leveraged by the cass-config-builder 
>> <https://github.com/datastax/cass-config-builder> and K8ssandra, but that 
>> has been deprecated in favor of a different approach today.
>> 
>> *Are there any plans to support externally provided schemas?*
>> Can I provide my own schema file and validation parameters and expect it to 
>> work with Sidecar? We've encountered a few examples of hidden parameters in 
>> cassandra.yaml (depending on the flavor of C* being run) that would have 
>> tripped up strict validation.
>> 
>> *What is the scope of a configuration?*
>> It's fairly common to provide a unique configuration per DC based on 
>> hardware, workload, etc (not to mention any changes to rackdc.properties). 
>> As a user of this system, are we pushing overlays (and/or configuration HTTP 
>> requests) to the sidecar running on each node, each DC, or per cluster with 
>> flags? Looking at the current design it appears as though configuration is 
>> handled per node.
>> 
>> *What happens when someone logs into a node and changes a configuration file 
>> manually?*
>> I wonder if it would be helpful to provide an endpoint (possibly enriching 
>> an existing one) to report when the configuration files on disk differ from 
>> the values materialized by sidecar. I remember generating a number of 
>> reports for users highlighting the inconsistency in configuration between 
>> nodes. Ideally Sidecar resolves the possibility of this happening, but a 
>> user with ssh and a bit of determination will always find a way. I could see 
>> this being leveraged for determining which nodes may need a restart to pick 
>> up configuration changes (assuming lack of hot-reloading support for 
>> specific fields) as well as running reports for misconfigured instances.
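The drift report suggested above could boil down to a key-by-key diff of the materialized settings against what's on disk; a rough sketch (names invented for illustration):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Sketch only: report every key whose materialized value differs from the
// value currently on disk, including keys present on only one side.
public class DriftDetector {
    public static Map<String, String> drift(Map<String, Object> materialized,
                                            Map<String, Object> onDisk) {
        Set<String> keys = new HashSet<>(materialized.keySet());
        keys.addAll(onDisk.keySet());
        Map<String, String> report = new HashMap<>();
        for (String key : keys)
            if (!Objects.equals(materialized.get(key), onDisk.get(key)))
                report.put(key, materialized.get(key) + " -> " + onDisk.get(key));
        return report;
    }
}
```

An endpoint could serve this map directly, and an empty map would mean the node matches what Sidecar materialized.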
>> 
>> I see there may be a little overlap with Josh's questions, my apologies for 
>> any duplication. Again, thank you for the CEP.
>> 
>> Cheers,
>> ~Chris
>> Christopher Bradford
>> 
>> 
>> 
>> On Tue, Mar 17, 2026 at 3:04 PM Josh McKenzie <[email protected]> wrote:
>>> Fantastic work here Paulo et al. Just finished first read through and have 
>>> some questions and observations.
>>> 
>>>> Operators can update the base template for all instances by modifying 
>>>> sidecar.yaml and restarting Sidecar, while individual instance 
>>>> customizations remain preserved in their respective overlays.
>>> Could we allow hot reloading via API endpoint trigger rather than requiring 
>>> bouncing the sidecar?
>>> 
>>>> It then validates this result against a version-aware configuration schema 
>>>> maintained per Cassandra major version to ensure that all `cassandraYaml` 
>>>> keys being updated are recognized for that version. Updates of unknown 
>>>> keys are rejected to prevent Cassandra startup failures.
>>> Am I reading this correctly that this design would be a single 
>>> cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java 
>>> + cassandra.yaml values in a patch release, won't that run afoul of this 
>>> design?
>>> 
>>> Q: I didn't see anything about polling an external config provider for 
>>> updates to configuration; config on a sidecar could end up stale relative 
>>> to the config stored in an external system if not restarted. Is that 
>>> something we're going to leave for future work or leave in the hands of 
>>> operators?
>>> 
>>> Q: If a new config param is added in a patch version and we don't have an 
>>> updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default 
>>> overrides what's in Config.java" to "we use what's in Config.java". *In 
>>> theory* we'd always keep our values in cassandra.yaml in sync 
>>> w/Config.java, but this change to the sidecar base .yaml as authoritative 
>>> means we'd need a new mechanism (human toil, automated generation, ?) to 
>>> keep the base sidecar C* .yaml config in sync w/whatever MINOR we're 
>>> running, I think.
>>> 
>>> Q: What happens if someone mutates parameters via the REST API that are Bad 
>>> News? Thinking things like num_tokens, partitioner, cluster_name, 
>>> initial_token, etc. Maybe we have a built-in blocklist for .yaml values you 
>>> can't mutate remotely and 422 on them, or provide a base list of 
>>> blocklisted .yaml params disallowed in overlay PATCH to prevent these kinds 
>>> of issues and make them more "durable" in the base .yaml config file? 
>>> There's a tension here between wanting to be able to set up the normal base 
>>> config via the API and wanting to prevent disastrous changes via the API.
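A blocklist check like the one floated above could be as simple as the following sketch (the class, method, and exact blocklist contents are assumptions, not anything in the CEP):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: reject overlay PATCH requests touching parameters that must
// never change on a live node; an HTTP layer would 422 when this is non-empty.
public class MutationGuard {
    static final Set<String> BLOCKLIST =
        Set.of("cluster_name", "partitioner", "num_tokens", "initial_token");

    public static List<String> blockedKeys(Map<String, Object> patch) {
        return patch.keySet().stream().filter(BLOCKLIST::contains).sorted().toList();
    }
}
```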
>>> 
>>> Q: Which leads to the question: how do operators push out the base .yaml 
>>> config across a fleet? We have the ConfigurationProvider abstraction for 
>>> overlays, but how do operators get the base template out there? Especially 
>>> since you can / will have a different base template per instance on the 
>>> node, this seems like a gap where operators would still have a lot of toil. 
>>> Maybe some kind of "pull the base cassandra.yaml from the local node as 
>>> base template if none is provided" so the default is to defer to what's in 
>>> the local node, lock it in, then add overlays? A mechanism like this could 
>>> help w/the MINOR version .yaml drift as well, deferring to the node-local 
>>> base .yaml file and then overlaying things on top of it.
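The "defer to the node-local file if none is provided" fallback suggested above might look like this sketch (illustrative names only):

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: prefer an operator-provided base template when it exists on
// disk; otherwise fall back to the cassandra.yaml already on the local node.
public class BaseTemplateResolver {
    public static Path resolve(Path provided, Path nodeLocal) {
        return provided != null && Files.exists(provided) ? provided : nodeLocal;
    }
}
```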
>>> 
>>> And lastly - apologies if I misread or misunderstood anything in the CEP. 
>>> If nothing else it'll give us insight into areas to flesh out verbiage for 
>>> clarity.
>>> 
>>> On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote:
>>>> Hi everyone,
>>>> 
>>>> I'd like to propose CEP-62: Cassandra Configuration Management via Sidecar 
>>>> for discussion by the community.
>>>> 
>>>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management 
>>>> capabilities to Sidecar, giving operators the ability to start and stop 
>>>> Cassandra instances programmatically. However, Sidecar currently has no 
>>>> way to manipulate the configuration files that those instances consume at 
>>>> startup.
>>>> 
>>>> Many Cassandra settings (memtable configuration, SSTable settings, 
>>>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and 
>>>> must be set in cassandra.yaml or JVM options files, requiring a restart to 
>>>> take effect. Managing these files manually or through custom tooling is 
>>>> cumbersome and lacks a stable API.
>>>> 
>>>> This CEP extends Sidecar's lifecycle management by adding configuration 
>>>> management capabilities for persisted configuration artifacts. It 
>>>> introduces a REST API for reading and updating cassandra.yaml and JVM 
>>>> options, a pluggable ConfigurationProvider abstraction for integration 
>>>> with centralized configuration systems (etcd, Consul, or custom backends), 
>>>> and version-aware validation to prevent startup failures.
>>>> 
>>>> This CEP also serves as a prerequisite for future Cassandra upgrades via 
>>>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires 
>>>> updating storage_compatibility_mode in cassandra.yaml. The configuration 
>>>> management capabilities introduced here will enable Sidecar to orchestrate 
>>>> such upgrades by updating configuration artifacts alongside binary version 
>>>> changes.
>>>> 
>>>> The CEP is linked here: 
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>>> 
>>>> Looking forward to your feedback!
>>>> 
>>>> Thanks,
>>>> 
>>>> Paulo
>>>> 
>>>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
>>> 
