> When a new setting is added in a minor, it will not be specified in the base 
> template so the Config.java default will be picked up, no sync will be needed 
> unless the user wants to update the setting. Is this what you meant?
Right. The more I think about this, the more I think this won't actually be a 
change from the status quo. Everyone already has to either a) not get new 
params from a MINOR in .yaml and rely on Config.java defaults, or b) have some 
kind of logic that merges in new params to their existing modified .yaml files 
which would need to be updated for this new paradigm and provide the same 
guarantees.

Some kind of build-time check / static analysis that confirms the defaults in 
Config.java match what is provided in cassandra.yaml, and fails the build if 
they diverge, would help tighten this window so there aren't any surprises or 
defects introduced by oversights on this config. Outside the scope of this CEP 
though.
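To make the idea concrete, here's a rough sketch of what such a check could 
look like. Everything below is a made-up stand-in: the Config fields, default 
values, and the parsed-yaml Map are illustrative, not the real Config.java or 
shipped cassandra.yaml, and a real check would parse the actual yaml file.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ConfigDefaultsCheck {

    // Stand-in for org.apache.cassandra.config.Config; fields and defaults
    // here are hypothetical examples, not the real class.
    static class Config {
        public int num_tokens = 16;
        public String partitioner = "Murmur3Partitioner";
        public int concurrent_reads = 32;
    }

    /** Returns the names of settings whose yaml value diverges from the code default. */
    static List<String> findDivergent(Map<String, Object> yaml) throws Exception {
        Config defaults = new Config();
        List<String> divergent = new ArrayList<>();
        for (Map.Entry<String, Object> e : yaml.entrySet()) {
            // Look up the field of the same name and compare its default
            // against the value shipped in the yaml.
            Field f = Config.class.getField(e.getKey());
            Object codeDefault = f.get(defaults);
            if (!String.valueOf(codeDefault).equals(String.valueOf(e.getValue()))) {
                divergent.add(e.getKey());
            }
        }
        return divergent;
    }

    public static void main(String[] args) throws Exception {
        // Pretend these key/value pairs were parsed from cassandra.yaml.
        Map<String, Object> yaml = Map.of(
                "num_tokens", 16,
                "partitioner", "Murmur3Partitioner",
                "concurrent_reads", 64);  // diverges from the code default of 32

        List<String> divergent = findDivergent(yaml);
        if (divergent.isEmpty()) {
            System.out.println("OK: cassandra.yaml matches Config defaults");
        } else {
            // A real build-time check would throw / exit non-zero here.
            System.out.println("DIVERGED: " + divergent);
        }
    }
}
```

Wired into the build as a test, this would surface exactly the kind of 
oversight described above whenever the two sources of defaults drift apart.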

On Wed, Mar 18, 2026, at 12:33 PM, Paulo Motta wrote:
> Thanks Josh for the comments! See follow-up below.
> 
> > Could we allow hot reloading via API endpoint trigger rather than requiring 
> > bouncing the sidecar?
> 
> Good idea, added a POST /api/v1/cassandra/configuration/reloadConfig endpoint 
> to perform hot template reloading as well as refresh the config from the 
> configuration provider.
> 
> > Am I reading this correctly that this design would be a single 
> > cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java 
> > + cassandra.yaml values in a patch release, won't that run afoul of this 
> > design?
> 
> In order to reduce the schema maintenance overhead (otherwise we would need 
> to track the settings introduced in each minor), the guarantee provided is 
> that all settings supported by the latest patch version of a major are 
> accepted, with the rationale that the majority of valid settings are 
> introduced in a major and settings that land in a minor are the exception. 
> It would be the operator's responsibility to ensure a setting is valid in 
> the version actually deployed; otherwise the node may fail to start if an 
> unsupported setting is updated, in which case the operator would have to 
> either upgrade the node to a version that supports it or submit another 
> update operation to remove the invalid setting.
> 
> > Q: I didn't see anything about polling an external config provider for 
> > updates to configuration; config on a sidecar could end up stale relative 
> > to the config stored in an external system if not restarted. Is that 
> > something we're going to leave for future work or leave in the hands of 
> > operators?
> 
> Initially the operator can use the /reloadConfig endpoint, but I have added 
> a future work item for automatic config refresh.
> 
> > Q: If a new config param is added in a patch version and we don't have an 
> > updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default 
> > overrides what's in Config.java" to "we use what's in Config.java". *In 
> > theory* we'd always keep our values in cassandra.yaml in sync 
> > w/Config.java, but this change to the sidecar base .yaml as authoritative 
> > means we'd need a new mechanism (human toil, automated generation, ?) to 
> > keep the base sidecar C* .yaml config in sync w/whatever MINOR we're 
> > running I think.
> 
> The sidecar base cassandra.yaml will not be an exhaustive list of all 
> settings, but just the explicitly defined settings, like the shipped 
> cassandra.yaml. When a new setting is added in a minor, it will not be 
> specified in the base template, so the Config.java default will be picked 
> up; no sync will be needed unless the user wants to update the setting. Is 
> this what you meant?
> 
> > Q: What happens if someone mutates parameters via the REST API that are Bad 
> > News? Thinking things like num_tokens, partitioner, cluster_name, 
> > initial_token, etc. Maybe we have a built-in blocklist for .yaml values you 
> > can't mutate remotely and 422 on them, or provide a base list of 
> > blocklisted .yaml params disallowed in overlay PATCH to prevent these kinds 
> > of issues and make them more "durable" in the base .yaml config file? 
> > There's a tension here between wanting to be able to setup the normal base 
> > config via API and wanting to prevent disastrous changes via the API.
> 
> I added a blocklist concept to prevent modifying unsafe settings. A PATCH 
> targeting a blocklisted key is allowed only if the key is absent from both 
> the base template and the current overlay, so the initial configuration can 
> be performed via the API.
> 
> > Q: Which leads to the question: how do operators push out the base .yaml 
> > config across a fleet? We have the ConfigurationProvider abstraction for 
> > overlays, but how do operators get the base template out there? Especially 
> > since you can / will have a different base template per instance on the 
> > node, this seems like a gap where operators would still have a lot of toil. 
> > Maybe some kind of "pull the base cassandra.yaml from the local node as 
> > base template if none is provided" so the default is to defer to what's in 
> > the local node, lock it in, then add overlays? I think a mechanism like 
> > this could help w/the MINOR version .yaml drift as well, if we had a 
> > mechanism to defer to the node local base .yaml file and then overlay 
> > things on top of it.
> 
> Bootstrapping the config template is out of the scope of this CEP; it will 
> need to be performed by an external orchestrator/config manager or by manual 
> configuration. I think one possible deployment model is to ship a single 
> stock cassandra.yaml template for all the instances and perform per-instance 
> customizations via the API, or via a remote config provider overlay driven 
> by an external orchestrator.
> 
> On Tue, 17 Mar 2026 at 15:04 Josh McKenzie <[email protected]> wrote:
>> Fantastic work here Paulo et al. Just finished first read through and have 
>> some questions and observations.
>> 
>>> Operators can update the base template for all instances by modifying 
>>> sidecar.yaml and restarting Sidecar, while individual instance 
>>> customizations remain preserved in their respective overlays.
>> Could we allow hot reloading via API endpoint trigger rather than requiring 
>> bouncing the sidecar?
>> 
>>> It then validates this result against a version-aware configuration schema 
>>> maintained per Cassandra major version to ensure that all `cassandraYaml` 
>>> keys being updated are recognized for that version. Updates of unknown keys 
>>> are rejected to prevent Cassandra startup failures.
>> Am I reading this correctly that this design would be a single 
>> cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java 
>> + cassandra.yaml values in a patch release, won't that run afoul of this 
>> design?
>> 
>> Q: I didn't see anything about polling an external config provider for 
>> updates to configuration; config on a sidecar could end up stale relative to 
>> the config stored in an external system if not restarted. Is that something 
>> we're going to leave for future work or leave in the hands of operators?
>> 
>> Q: If a new config param is added in a patch version and we don't have an 
>> updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default 
>> overrides what's in Config.java" to "we use what's in Config.java". *In 
>> theory* we'd always keep our values in cassandra.yaml in sync w/Config.java, 
>> but this change to the sidecar base .yaml as authoritative means we'd need a 
>> new mechanism (human toil, automated generation, ?) to keep the base sidecar 
>> C* .yaml config in sync w/whatever MINOR we're running I think.
>> 
>> Q: What happens if someone mutates parameters via the REST API that are Bad 
>> News? Thinking things like num_tokens, partitioner, cluster_name, 
>> initial_token, etc. Maybe we have a built-in blocklist for .yaml values you 
>> can't mutate remotely and 422 on them, or provide a base list of blocklisted 
>> .yaml params disallowed in overlay PATCH to prevent these kinds of issues 
>> and make them more "durable" in the base .yaml config file? There's a 
>> tension here between wanting to be able to setup the normal base config via 
>> API and wanting to prevent disastrous changes via the API.
>> 
>> Q: Which leads to the question: how do operators push out the base .yaml 
>> config across a fleet? We have the ConfigurationProvider abstraction for 
>> overlays, but how do operators get the base template out there? Especially 
>> since you can / will have a different base template per instance on the 
>> node, this seems like a gap where operators would still have a lot of toil. 
>> Maybe some kind of "pull the base cassandra.yaml from the local node as base 
>> template if none is provided" so the default is to defer to what's in the 
>> local node, lock it in, then add overlays? I think a mechanism like this 
>> could help w/the MINOR version .yaml drift as well, if we had a mechanism to 
>> defer to the node local base .yaml file and then overlay things on top of it.
>> 
>> And lastly - apologies if I misread or misunderstood anything in the CEP. If 
>> nothing else it'll give us insight into areas to flesh out verbiage for 
>> clarity.
>> 
>> On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote:
>>> Hi everyone,
>>> 
>>> I'd like to propose CEP-62: Cassandra Configuration Management via Sidecar 
>>> for discussion by the community.
>>> 
>>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management 
>>> capabilities to Sidecar, giving operators the ability to start and stop 
>>> Cassandra instances programmatically. However, Sidecar currently has no way 
>>> to manipulate the configuration files that those instances consume at 
>>> startup.
>>> 
>>> Many Cassandra settings (memtable configuration, SSTable settings, 
>>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and 
>>> must be set in cassandra.yaml or JVM options files, requiring a restart to 
>>> take effect. Managing these files manually or through custom tooling is 
>>> cumbersome and lacks a stable API.
>>> 
>>> This CEP extends Sidecar's lifecycle management by adding configuration 
>>> management capabilities for persisted configuration artifacts. It 
>>> introduces a REST API for reading and updating cassandra.yaml and JVM 
>>> options, a pluggable ConfigurationProvider abstraction for integration with 
>>> centralized configuration systems (etcd, Consul, or custom backends), and 
>>> version-aware validation to prevent startup failures.
>>> 
>>> This CEP also serves as a prerequisite for future Cassandra upgrades via 
>>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires 
>>> updating storage_compatibility_mode in cassandra.yaml. The configuration 
>>> management capabilities introduced here will enable Sidecar to orchestrate 
>>> such upgrades by updating configuration artifacts alongside binary version 
>>> changes.
>>> 
>>> The CEP is linked here: 
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>> 
>>> Looking forward to your feedback!
>>> 
>>> Thanks,
>>> 
>>> Paulo
>>> 
>>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
>> 