Thanks Josh for the comments! See follow-up below.

> Could we allow hot reloading via API endpoint trigger rather than
requiring bouncing the sidecar?

Good idea! I added a POST /api/v1/cassandra/configuration/reloadConfig
endpoint that performs hot template reloading and also refreshes the
config from the configuration provider.

> Am I reading this correctly that this design would be a single
cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java
+ cassandra.yaml values in a patch release, won't that run afoul of this
design?

To reduce the schema maintenance overhead (otherwise we would need to
track settings introduced in each minor release), the guarantee provided
is that all settings supported by the latest patch version of a major are
accepted. The rationale is that the majority of valid settings are
introduced in a major release, and settings that go into a minor are the
exception. It would be the operator's responsibility to ensure a setting
is valid in the version actually deployed; otherwise the node may fail to
start if an unsupported setting is updated. In that case the operator
would have to either upgrade the node to a version that supports the
setting or submit another update operation to remove the invalid setting.
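
To make the acceptance rule concrete, here is a minimal sketch of the
check (all class, method, and schema names are hypothetical, not from the
CEP): the schema for a major is the union of settings up to its latest
patch, and an update containing any key outside that set is rejected.

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class VersionAwareValidator {
    // Hypothetical per-major schema: all settings supported by the latest
    // patch version of that major line.
    static final Map<Integer, Set<String>> KNOWN_KEYS_BY_MAJOR = Map.of(
            4, Set.of("num_tokens", "concurrent_reads"),
            5, Set.of("num_tokens", "concurrent_reads", "storage_compatibility_mode"));

    // Returns the subset of updated keys not recognized for this major;
    // a non-empty result means the update is rejected.
    static Set<String> unknownKeys(int major, Set<String> updatedKeys) {
        Set<String> known = KNOWN_KEYS_BY_MAJOR.getOrDefault(major, Set.of());
        return updatedKeys.stream()
                .filter(k -> !known.contains(k))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // storage_compatibility_mode only exists in the 5.x schema, so a
        // 4.x update containing it is flagged, while a 5.x update passes.
        System.out.println(unknownKeys(4, Set.of("storage_compatibility_mode")));
        System.out.println(unknownKeys(5, Set.of("storage_compatibility_mode")));
    }
}
```

Note this also illustrates the trade-off above: a key valid only in a
later patch of a deployed minor would still be accepted by the major's
schema, which is why the operator remains responsible for version fit.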

> Q: I didn't see anything about polling an external config provider for
updates to configuration; config on a sidecar could end up stale relative
to the config stored in an external system if not restarted. Is that
something we're going to leave for future work or leave in the hands of
operators?

Initially the operator can use the /reloadConfig endpoint, but I have
added a future work item for automatic config refresh.
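
For illustration, one way such an automatic refresh could look (purely a
sketch of the future work item; ConfigRefresher and the version-based
staleness check are my invention, not part of the CEP): poll the provider
on a schedule and only re-apply the overlay when its version has moved
past what the sidecar last applied.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class ConfigRefresher {
    private final LongSupplier providerVersion; // e.g. a provider revision counter
    private final AtomicLong appliedVersion = new AtomicLong(-1);

    ConfigRefresher(LongSupplier providerVersion) {
        this.providerVersion = providerVersion;
    }

    // One poll cycle: report stale (and record the new version) only when
    // the provider has advanced past what was last applied. A real
    // implementation would re-fetch the overlay and re-render here.
    boolean refreshIfStale() {
        long latest = providerVersion.getAsLong();
        if (latest > appliedVersion.get()) {
            appliedVersion.set(latest);
            return true;
        }
        return false;
    }

    void start(ScheduledExecutorService scheduler, long periodSeconds) {
        scheduler.scheduleAtFixedRate(this::refreshIfStale,
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        AtomicLong remote = new AtomicLong(1);
        ConfigRefresher refresher = new ConfigRefresher(remote::get);
        System.out.println(refresher.refreshIfStale()); // true: first sync
        System.out.println(refresher.refreshIfStale()); // false: nothing changed
        remote.set(2);
        System.out.println(refresher.refreshIfStale()); // true: provider updated
    }
}
```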

> Q: If a new config param is added in a patch version and we don't have an
updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default
overrides what's in Config.java" to "we use what's in Config.java". *In
theory* we'd always keep our values in cassandra.yaml in sync
w/Config.java, but this change to the sidecar base .yaml as authoritative
means we'd need a new mechanism (human toil, automated generation, ?) to
keep the base sidecar C* .yaml config in sync w/whatever MINOR we're
running I think.

The sidecar base cassandra.yaml will not be an exhaustive list of all
settings, but just the explicitly defined settings, like the shipped
cassandra.yaml. When a new setting is added in a minor release, it will
not be specified in the base template, so the Config.java default will be
picked up; no sync will be needed unless the user wants to update the
setting. Is this what you meant?
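
A sketch of that resolution order (names hypothetical, just to illustrate
the layering): a key is taken from the overlay if present, else from the
base template; keys in neither layer are simply omitted from the rendered
cassandra.yaml, so Cassandra falls back to its Config.java default.

```java
import java.util.HashMap;
import java.util.Map;

public class LayeredConfig {
    // Render the effective cassandra.yaml contents: base template keys
    // first, then overlay keys on top. Keys in neither layer are left out
    // entirely, leaving Cassandra's own Config.java default in effect.
    static Map<String, Object> render(Map<String, Object> baseTemplate,
                                      Map<String, Object> overlay) {
        Map<String, Object> effective = new HashMap<>(baseTemplate);
        effective.putAll(overlay);
        return effective;
    }

    public static void main(String[] args) {
        Map<String, Object> base = Map.of("num_tokens", 16);
        Map<String, Object> overlay = Map.of("concurrent_reads", 64);
        Map<String, Object> effective = render(base, overlay);
        // A setting introduced in a newer minor appears in neither layer,
        // so it is absent here and Config.java's default takes effect.
        System.out.println(effective.containsKey("new_minor_setting")); // false
    }
}
```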

> Q: What happens if someone mutates parameters via the REST API that are
Bad News? Thinking things like num_tokens, partitioner, cluster_name,
initial_token, etc. Maybe we have a built-in blocklist for .yaml values you
can't mutate remotely and 422 on them, or provide a base list of
blocklisted .yaml params disallowed in overlay PATCH to prevent these kinds
of issues and make them more "durable" in the base .yaml config file?
There's a tension here between wanting to be able to setup the normal base
config via API and wanting to prevent disastrous changes via the API.

I added a blocklist concept to prevent modifying unsafe settings. A PATCH
targeting a blocklisted key is allowed only if the key is absent from both
the base template and the current overlay, so the initial configuration
can still be performed via the API.
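
A minimal sketch of that rule (the blocklist contents and names are
illustrative only): a PATCH of a blocklisted key is rejected, e.g. with
HTTP 422, unless the key is set in neither layer yet, which permits the
one-time initial configuration.

```java
import java.util.Map;
import java.util.Set;

public class BlocklistCheck {
    // Hypothetical blocklist of settings that are unsafe to mutate on an
    // already-configured node.
    static final Set<String> BLOCKLIST =
            Set.of("cluster_name", "partitioner", "num_tokens", "initial_token");

    // A blocklisted key may only be PATCHed while it is absent from both
    // the base template and the current overlay.
    static boolean patchAllowed(String key,
                                Map<String, Object> baseTemplate,
                                Map<String, Object> overlay) {
        if (!BLOCKLIST.contains(key)) return true;
        return !baseTemplate.containsKey(key) && !overlay.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, Object> base = Map.of("cluster_name", "prod");
        Map<String, Object> overlay = Map.of();
        System.out.println(patchAllowed("cluster_name", base, overlay));     // false: already set
        System.out.println(patchAllowed("cluster_name", Map.of(), overlay)); // true: first-time set
        System.out.println(patchAllowed("concurrent_reads", base, overlay)); // true: not blocklisted
    }
}
```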

> Q: Which leads to the question: how do operators push out the base .yaml
config across a fleet? We have the ConfigurationProvider abstraction for
overlays, but how do operators get the base template out there? Especially
since you can / will have a different base template per instance on the
node, this seems like a gap where operators would still have a lot of toil.
Maybe some kind of "pull the base cassandra.yaml from the local node as
base template if none is provided" so the default is to defer to what's in
the local node, lock it in, then add overlays? I think a mechanism like
this could help w/the MINOR version .yaml drift as well, if we had a
mechanism to defer to the node local base .yaml file and then overlay
things on top of it.

Bootstrapping the config template is out of scope for this CEP; it will
need to be performed by an external orchestrator/config manager or via
manual configuration. I think one possible deployment model is to ship a
single stock cassandra.yaml template for all instances and perform
per-instance customizations via the API, or via a remote config provider
overlay managed by an external orchestrator.


On Tue, 17 Mar 2026 at 15:04 Josh McKenzie <[email protected]> wrote:

> Fantastic work here Paulo et al. Just finished first read through and have
> some questions and observations.
>
> Operators can update the base template for all instances by modifying
> sidecar.yaml and restarting Sidecar, while individual instance
> customizations remain preserved in their respective overlays.
>
> Could we allow hot reloading via API endpoint trigger rather than
> requiring bouncing the sidecar?
>
> It then validates this result against a version-aware configuration schema
> maintained per Cassandra major version to ensure that all cassandraYaml keys
> being updated are recognized for that version. Updates of unknown keys are
> rejected to prevent Cassandra startup failures.
>
> Am I reading this correctly that this design would be a single
> cassandra.yaml per MAJOR Cassandra version? If we introduce new Config.java
> + cassandra.yaml values in a patch release, won't that run afoul of this
> design?
>
> Q: I didn't see anything about polling an external config provider for
> updates to configuration; config on a sidecar could end up stale relative
> to the config stored in an external system if not restarted. Is that
> something we're going to leave for future work or leave in the hands of
> operators?
>
> Q: If a new config param is added in a patch version and we don't have an
> updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default
> overrides what's in Config.java" to "we use what's in Config.java". *In
> theory* we'd always keep our values in cassandra.yaml in sync
> w/Config.java, but this change to the sidecar base .yaml as authoritative
> means we'd need a new mechanism (human toil, automated generation, ?) to
> keep the base sidecar C* .yaml config in sync w/whatever MINOR we're
> running I think.
>
> Q: What happens if someone mutates parameters via the REST API that are
> Bad News? Thinking things like num_tokens, partitioner, cluster_name,
> initial_token, etc. Maybe we have a built-in blocklist for .yaml values you
> can't mutate remotely and 422 on them, or provide a base list of
> blocklisted .yaml params disallowed in overlay PATCH to prevent these kinds
> of issues and make them more "durable" in the base .yaml config file?
> There's a tension here between wanting to be able to setup the normal base
> config via API and wanting to prevent disastrous changes via the API.
>
> Q: Which leads to the question: how do operators push out the base .yaml
> config across a fleet? We have the ConfigurationProvider abstraction for
> overlays, but how do operators get the base template out there? Especially
> since you can / will have a different base template per instance on the
> node, this seems like a gap where operators would still have a lot of toil.
> Maybe some kind of "pull the base cassandra.yaml from the local node as
> base template if none is provided" so the default is to defer to what's in
> the local node, lock it in, then add overlays? I think a mechanism like
> this could help w/the MINOR version .yaml drift as well, if we had a
> mechanism to defer to the node local base .yaml file and then overlay
> things on top of it.
>
> And lastly - apologies if I misread or misunderstood anything in the CEP.
> If nothing else it'll give us insight into areas to flesh out verbiage for
> clarity.
>
> On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote:
>
> Hi everyone,
>
> I'd like to propose CEP-62: Cassandra Configuration Management via Sidecar
> for discussion by the community.
>
> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management
> capabilities to Sidecar, giving operators the ability to start and stop
> Cassandra instances programmatically. However, Sidecar currently has no way
> to manipulate the configuration files that those instances consume at
> startup.
>
> Many Cassandra settings (memtable configuration, SSTable settings,
> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and
> must be set in cassandra.yaml or JVM options files, requiring a restart to
> take effect. Managing these files manually or through custom tooling is
> cumbersome and lacks a stable API.
>
> This CEP extends Sidecar's lifecycle management by adding configuration
> management capabilities for persisted configuration artifacts. It
> introduces a REST API for reading and updating cassandra.yaml and JVM
> options, a pluggable ConfigurationProvider abstraction for integration with
> centralized configuration systems (etcd, Consul, or custom backends), and
> version-aware validation to prevent startup failures.
>
> This CEP also serves as a prerequisite for future Cassandra upgrades via
> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires
> updating storage_compatibility_mode in cassandra.yaml. The configuration
> management capabilities introduced here will enable Sidecar to orchestrate
> such upgrades by updating configuration artifacts alongside binary version
> changes.
>
> The CEP is linked here:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>
> Looking forward to your feedback!
>
> Thanks,
>
> Paulo
>
> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
>
>
>
