Extracting Config into its own artifact, as suggested, is definitely something to strive for. If we want to just verify that names and types are OK, without further validation of values (a kind of poor man's validator), check the test classes we already have, such as ConfigCompatibilityTestGenerate and ConfigCompatibilityTest, and the files in the test/data/config directory, which contain dumps of configuration keys with their types as simple text files. We could just copy these over and check against them. That would at least be something, short of having Config as an artifact, which is the Holy Grail here, albeit way more involved.
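To make the idea concrete, here is a minimal sketch of such a check in plain Java, assuming a dump format of one `key: type` pair per line (the exact format of the files under test/data/config may differ; the class and method names are illustrative only, not existing code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "poor man's validator": checks a proposed setting name and
// declared type against a dumped "key: type" listing, without validating
// the value itself.
final class ConfigKeyTypeCheck {

    // Parses lines of the form "setting_name: java.lang.Integer" into a map.
    static Map<String, String> parseDump(String dump) {
        Map<String, String> keys = new HashMap<>();
        for (String line : dump.split("\n")) {
            int sep = line.indexOf(':');
            if (sep < 0) continue; // skip blank or malformed lines
            keys.put(line.substring(0, sep).trim(), line.substring(sep + 1).trim());
        }
        return keys;
    }

    // A setting passes if the key exists and the declared type matches.
    static boolean isKnown(Map<String, String> schema, String key, String declaredType) {
        return declaredType.equals(schema.get(key));
    }
}
```

This only answers "is this key known with this type for this version"; value-level validation would still need Cassandra's own loader.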
On Thu, Mar 19, 2026 at 5:32 PM Josh McKenzie <[email protected]> wrote: > > The downside is that we will need to maintain a copy of Cassandra's > Config.java class in sidecar and keep this in sync with Cassandra's every time > a new major is released, but we would have to update the schema anyway even > if using a schema spec language. > > Rather than copying, can we automate pulling this file down from each > supported C* version as part of the build process? We need to unwind a bunch > of code duplication in the ecosystem already. > > Ultimately, I'd like to see Config and some shared utility classes (CommitLog > reading, SSTable reading/writing, etc) end up in a shared artifact which > would make a lot of this kind of stuff (cross-ecosystem understanding on > formats, versioning, and re-use) a lot simpler and cleaner for all of us. > Never mind versioning on Hints / CommitLog / MessagingService and the > embedding of MS versions inside CommitLog files that's given rise to the > compounding complexity and confusion around storage_compatibility_mode setups. > > On Wed, Mar 18, 2026, at 1:30 PM, Paulo Motta wrote: > > Thanks for your input, Chris! See answers below. > > > *Do you have any thoughts on what the schema may look like?* > > My idea is to use Cassandra's own Java validation mechanism to check the > updated setting against the schema, for example: > > ``` > YamlConfigurationLoader loader = new YamlConfigurationLoader(); > Map<String, Object> settings = Map.of("setting_name", value); > Config config = loader.fromMap(settings, Config.class); // throws > ConfigurationException on invalid > ``` > > The downside is that we will need to maintain a copy of Cassandra's > Config.java class in sidecar and keep this in sync with Cassandra's every time > a new major is released, but we would have to update the schema anyway even > if using a schema spec language.
> > > *Are there any plans to support externally provided schemas?* > > I haven't thought about externally provided schemas, since this doesn't > really fit the schema model I'm planning to use. I have added a parameter > `skipValidation` to bypass schema validation for lesser-known or > not-yet-supported settings. > > > *What is the scope of a configuration?* > > The scope of a configuration is all instances managed by a single sidecar > instance, which means they're collocated in the same physical machine. The > scope of the CEP is to provide only instance-based configuration; wider-scope > configuration should be provided by an external configuration manager or > orchestrator. > > > *What happens when someone logs into a node and changes a configuration > file manually?* > > This scenario will not be possible if sidecar APIs are used to start the > node, because the node startup endpoint will refresh the disk configuration > with the base template + provider overlay prior to starting the node, > overwriting any manual changes and ensuring only the configuration modified > by the API will be loaded. > > I hope I have provided answers to your questions; please let me know if > anything is not clear. > > On Tue, 17 Mar 2026 at 16:20 Christopher Bradford <[email protected]> > wrote: > > Thank you for sharing Paulo, this is awesome! It's interesting to look at the > myriad of ways users implement configuration for cassandra (manual, > automation templates, etc). I'm very interested in the version-specific > schema validation for configuration, configuration drift and uniformity. Some > questions: > > Do you have any thoughts on what this may look like? > For example OpenAPI spec, JSON Schema etc? DataStax maintained a set of > configuration definitions and templates for a time. It felt like there was a > significant barrier to entry around updating them.
These were leveraged by > the cass-config-builder and K8ssandra, but that has been deprecated in favor > of a different approach today. > > Are there any plans to support externally provided schemas? > Can I provide my own schema file and validation parameters and expect it to > work with Sidecar? We've encountered a few examples of hidden parameters in > cassandra.yaml (depending on the flavor of C* being run) that would have > tripped up strict validation. > > What is the scope of a configuration? > It's fairly common to provide a unique configuration per DC based on > hardware, workload, etc (not to mention any changes to rackdc.properties). As > a user of this system are we pushing overlays (and or configuration HTTP > requests) to the sidecar running on each node, each DC, per cluster with > flags? Looking at the current design it appears as though configuration is > handled per node. > > What happens when someone logs into a node and changes a configuration file > manually? > I wonder if it would be helpful to provide an endpoint (possibly enriching an > existing one) to report when the configuration files on disk differ from the > values materialized by sidecar. I remember generating a number of reports for > users highlighting the inconsistency in configuration between nodes. Ideally > Sidecar resolves the possibility of this happening, but a user with ssh and a > bit of determination will always find a way. I could see this being leveraged > for determining which nodes may need a restart to pick up configuration > changes (assuming lack of hot-reloading support for specific fields) as well > as running reports for misconfigured instances. > > I see there may be a little overlap with Josh's questions, my apologies for > any duplication. Again, thank you for the CEP. > > Cheers, > ~Chris > Christopher Bradford > > > > On Tue, Mar 17, 2026 at 3:04 PM Josh McKenzie <[email protected]> wrote: > > > Fantastic work here Paulo et al. 
Just finished first read through and have > some questions and observations. > > Operators can update the base template for all instances by modifying > sidecar.yaml and restarting Sidecar, while individual instance customizations > remain preserved in their respective overlays. > > Could we allow hot reloading via API endpoint trigger rather than requiring > bouncing the sidecar? > > It then validates this result against a version-aware configuration schema > maintained per Cassandra major version to ensure that all cassandraYaml keys > being updated are recognized for that version. Updates of unknown keys are > rejected to prevent Cassandra startup failures. > > Am I reading this correctly that this design would be a single cassandra.yaml > per MAJOR Cassandra version? If we introduce new Config.java + cassandra.yaml > values in a patch release, won't that run afoul of this design? > > Q: I didn't see anything about polling an external config provider for > updates to configuration; config on a sidecar could end up stale relative to > the config stored in an external system if not restarted. Is that something > we're going to leave for future work or leave in the hands of operators? > > Q: If a new config param is added in a patch version and we don't have an > updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default > overrides what's in Config.java" to "we use what's in Config.java". In theory > we'd always keep our values in cassandra.yaml in sync w/Config.java, but this > change to the sidecar base .yaml as authoritative means we'd need a new > mechanism (human toil, automated generation, ?) to keep the base sidecar C* > .yaml config in sync w/whatever MINOR we're running I think. > > Q: What happens if someone mutates parameters via the REST API that are Bad > News? Thinking things like num_tokens, partitioner, cluster_name, > initial_token, etc. 
Maybe we have a built-in blocklist for .yaml values you > can't mutate remotely and 422 on them, or provide a base list of blocklisted > .yaml params disallowed in overlay PATCH to prevent these kinds of issues and > make them more "durable" in the base .yaml config file? There's a tension > here between wanting to be able to set up the normal base config via API and > wanting to prevent disastrous changes via the API. > > Q: Which leads to the question: how do operators push out the base .yaml > config across a fleet? We have the ConfigurationProvider abstraction for > overlays, but how do operators get the base template out there? Especially > since you can / will have a different base template per instance on the node, > this seems like a gap where operators would still have a lot of toil. Maybe > some kind of "pull the base cassandra.yaml from the local node as base > template if none is provided" so the default is to defer to what's in the > local node, lock it in, then add overlays? I think a mechanism like this > could help w/the MINOR version .yaml drift as well, if we had a mechanism to > defer to the node local base .yaml file and then overlay things on top of it. > > And lastly - apologies if I misread or misunderstood anything in the CEP. If > nothing else it'll give us insight into areas to flesh out verbiage for > clarity. > > On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote: > > Hi everyone, > > I'd like to propose CEP-62: Cassandra Configuration Management via Sidecar > for discussion by the community. > > CASSSIDECAR-266[1] introduced Cassandra process lifecycle management > capabilities to Sidecar, giving operators the ability to start and stop > Cassandra instances programmatically. However, Sidecar currently has no way > to manipulate the configuration files that those instances consume at startup.
> > Many Cassandra settings (memtable configuration, SSTable settings, > storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and > must be set in cassandra.yaml or JVM options files, requiring a restart to > take effect. Managing these files manually or through custom tooling is > cumbersome and lacks a stable API. > > This CEP extends Sidecar's lifecycle management by adding configuration > management capabilities for persisted configuration artifacts. It introduces > a REST API for reading and updating cassandra.yaml and JVM options, a > pluggable ConfigurationProvider abstraction for integration with centralized > configuration systems (etcd, Consul, or custom backends), and version-aware > validation to prevent startup failures. > > This CEP also serves as a prerequisite for future Cassandra upgrades via > Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires > updating storage_compatibility_mode in cassandra.yaml. The configuration > management capabilities introduced here will enable Sidecar to orchestrate > such upgrades by updating configuration artifacts alongside binary version > changes. > > The CEP is linked here: > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar > > Looking forward to your feedback! > > Thanks, > > Paulo > > [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266 > > >
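For readers skimming the thread, the "base template + provider overlay" materialization discussed above can be pictured as a map merge in which overlay values win. A minimal sketch with flat keys, where all names are assumed for illustration (the CEP's actual merge semantics may differ):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: materialize an instance's effective config as
// base template plus per-instance overlay, with overlay entries overriding
// the base. Not the CEP's actual implementation.
final class ConfigMaterializer {

    static Map<String, Object> materialize(Map<String, Object> baseTemplate,
                                           Map<String, Object> overlay) {
        Map<String, Object> merged = new HashMap<>(baseTemplate);
        merged.putAll(overlay); // overlay values replace base-template values
        return merged;
    }
}
```

A real implementation would need to merge nested YAML sections rather than replacing them wholesale, and would write the merged result to disk before node startup.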
