Extracting Config into its own artifact, as suggested, is definitely something to strive for. If we want to just verify that names and types are OK, without further validation of values (a kind of poor man's validator), check the test classes we already have, such as ConfigCompatibilityTestGenerate and ConfigCompatibilityTest, and the files in the test/data/config directory, which contain dumps of configuration keys with their types as simple text files. We could just copy these over and check against them. That would at least be something, short of having Config as an artifact, which is the Holy Grail here, albeit way more involved.
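To make the idea concrete, here is a minimal sketch of such a check in plain Java, assuming a dump format of one `key: type` pair per line (the exact format of the files under test/data/config may differ; the class and method names are illustrative only, not existing code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "poor man's validator": checks a proposed setting name and
// declared type against a dumped "key: type" listing, without validating
// the value itself.
final class ConfigKeyTypeCheck {

    // Parses lines of the form "setting_name: java.lang.Integer" into a map.
    static Map<String, String> parseDump(String dump) {
        Map<String, String> keys = new HashMap<>();
        for (String line : dump.split("\n")) {
            int sep = line.indexOf(':');
            if (sep < 0) continue; // skip blank or malformed lines
            keys.put(line.substring(0, sep).trim(), line.substring(sep + 1).trim());
        }
        return keys;
    }

    // A setting passes if the key exists and the declared type matches.
    static boolean isKnown(Map<String, String> schema, String key, String declaredType) {
        return declaredType.equals(schema.get(key));
    }
}
```

This only answers "is this key known with this type for this version"; value-level validation would still need Cassandra's own loader.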
On Thu, Mar 19, 2026 at 5:32 PM Josh McKenzie <[email protected]> wrote: > > The downside is that we will need to maintain a copy of Cassandra's > Config.java class in sidecar and keep this in sync with Cassandra's every time > a new major is released, but we would have to update the schema anyway even > if using a schema spec language. > > Rather than copying, can we automate pulling this file down from each > supported C* version as part of the build process? We need to unwind a bunch > of code duplication in the ecosystem already. > > Ultimately, I'd like to see Config and some shared utility classes (CommitLog > reading, SSTable reading/writing, etc) end up in a shared artifact which > would make a lot of this kind of stuff (cross-ecosystem understanding on > formats, versioning, and re-use) a lot simpler and cleaner for all of us. > Never mind versioning on Hints / CommitLog / MessagingService and the > embedding of MS versions inside CommitLog files that's given rise to the > compounding complexity and confusion around storage_compatibility_mode setups. > > On Wed, Mar 18, 2026, at 1:30 PM, Paulo Motta wrote: > > Thanks for your input, Chris! See answers below. > > > *Do you have any thoughts on what the schema may look like?* > > My idea is to use Cassandra's own Java validation mechanism to check the > updated setting against the schema, for example: > > ``` > YamlConfigurationLoader loader = new YamlConfigurationLoader(); > Map<String, Object> settings = Map.of("setting_name", value); > Config config = loader.fromMap(settings, Config.class); // throws > ConfigurationException on invalid > ``` > > The downside is that we will need to maintain a copy of Cassandra's > Config.java class in sidecar and keep this in sync with Cassandra's every time > a new major is released, but we would have to update the schema anyway even > if using a schema spec language.
> > > *Are there any plans to support externally provided schemas?* > > I haven't thought about externally provided schemas, since this doesn't > really fit the schema model I'm planning to use. I have added a parameter > `skipValidation` to bypass schema validation for lesser-known or > not-yet-supported settings. > > > *What is the scope of a configuration?* > > The scope of a configuration is all instances managed by a single sidecar > instance, which means they're collocated in the same physical machine. The > scope of the CEP is to provide only instance-based configuration; wider-scope > configuration should be provided by an external configuration manager or > orchestrator. > > > *What happens when someone logs into a node and changes a configuration > file manually?* > > This scenario will not be possible if sidecar APIs are used to start the > node, because the node startup endpoint will refresh the disk configuration > with the base template + provider overlay prior to starting the node, > overwriting any manual changes and ensuring only the configuration modified > by the API will be loaded. > > I hope I have provided answers to your questions; please let me know if > anything is not clear. > > On Tue, 17 Mar 2026 at 16:20 Christopher Bradford <[email protected]> > wrote: > > Thank you for sharing Paulo, this is awesome! It's interesting to look at the > myriad of ways users implement configuration for cassandra (manual, > automation templates, etc). I'm very interested in the version-specific > schema validation for configuration, configuration drift and uniformity. Some > questions: > > Do you have any thoughts on what this may look like? > For example OpenAPI spec, JSON Schema etc? DataStax maintained a set of > configuration definitions and templates for a time. It felt like there was a > significant barrier to entry around updating them.
These were leveraged by > the cass-config-builder and K8ssandra, but that has been deprecated in favor > of a different approach today. > > Are there any plans to support externally provided schemas? > Can I provide my own schema file and validation parameters and expect it to > work with Sidecar? We've encountered a few examples of hidden parameters in > cassandra.yaml (depending on the flavor of C* being run) that would have > tripped up strict validation. > > What is the scope of a configuration? > It's fairly common to provide a unique configuration per DC based on > hardware, workload, etc (not to mention any changes to rackdc.properties). As > a user of this system are we pushing overlays (and or configuration HTTP > requests) to the sidecar running on each node, each DC, per cluster with > flags? Looking at the current design it appears as though configuration is > handled per node. > > What happens when someone logs into a node and changes a configuration file > manually? > I wonder if it would be helpful to provide an endpoint (possibly enriching an > existing one) to report when the configuration files on disk differ from the > values materialized by sidecar. I remember generating a number of reports for > users highlighting the inconsistency in configuration between nodes. Ideally > Sidecar resolves the possibility of this happening, but a user with ssh and a > bit of determination will always find a way. I could see this being leveraged > for determining which nodes may need a restart to pick up configuration > changes (assuming lack of hot-reloading support for specific fields) as well > as running reports for misconfigured instances. > > I see there may be a little overlap with Josh's questions, my apologies for > any duplication. Again, thank you for the CEP. > > Cheers, > ~Chris > Christopher Bradford > > > > On Tue, Mar 17, 2026 at 3:04 PM Josh McKenzie <[email protected]> wrote: > > > Fantastic work here Paulo et al. 
Just finished first read through and have > some questions and observations. > > Operators can update the base template for all instances by modifying > sidecar.yaml and restarting Sidecar, while individual instance customizations > remain preserved in their respective overlays. > > Could we allow hot reloading via API endpoint trigger rather than requiring > bouncing the sidecar? > > It then validates this result against a version-aware configuration schema > maintained per Cassandra major version to ensure that all cassandraYaml keys > being updated are recognized for that version. Updates of unknown keys are > rejected to prevent Cassandra startup failures. > > Am I reading this correctly that this design would be a single cassandra.yaml > per MAJOR Cassandra version? If we introduce new Config.java + cassandra.yaml > values in a patch release, won't that run afoul of this design? > > Q: I didn't see anything about polling an external config provider for > updates to configuration; config on a sidecar could end up stale relative to > the config stored in an external system if not restarted. Is that something > we're going to leave for future work or leave in the hands of operators? > > Q: If a new config param is added in a patch version and we don't have an > updated base cassandra.yaml, we'd be moving from "cassandra.yaml w/default > overrides what's in Config.java" to "we use what's in Config.java". In theory > we'd always keep our values in cassandra.yaml in sync w/Config.java, but this > change to the sidecar base .yaml as authoritative means we'd need a new > mechanism (human toil, automated generation, ?) to keep the base sidecar C* > .yaml config in sync w/whatever MINOR we're running I think. > > Q: What happens if someone mutates parameters via the REST API that are Bad > News? Thinking things like num_tokens, partitioner, cluster_name, > initial_token, etc. 
Maybe we have a built-in blocklist for .yaml values you > can't mutate remotely and 422 on them, or provide a base list of blocklisted > .yaml params disallowed in overlay PATCH to prevent these kinds of issues and > make them more "durable" in the base .yaml config file? There's a tension > here between wanting to be able to set up the normal base config via API and > wanting to prevent disastrous changes via the API. > > Q: Which leads to the question: how do operators push out the base .yaml > config across a fleet? We have the ConfigurationProvider abstraction for > overlays, but how do operators get the base template out there? Especially > since you can / will have a different base template per instance on the node, > this seems like a gap where operators would still have a lot of toil. Maybe > some kind of "pull the base cassandra.yaml from the local node as base > template if none is provided" so the default is to defer to what's in the > local node, lock it in, then add overlays? I think a mechanism like this > could help w/the MINOR version .yaml drift as well, if we had a mechanism to > defer to the node local base .yaml file and then overlay things on top of it. > > And lastly - apologies if I misread or misunderstood anything in the CEP. If > nothing else it'll give us insight into areas to flesh out verbiage for > clarity. > > On Tue, Mar 17, 2026, at 12:32 PM, Paulo Motta wrote: > > Hi everyone, > > I'd like to propose CEP-62: Cassandra Configuration Management via Sidecar > for discussion by the community. > > CASSSIDECAR-266[1] introduced Cassandra process lifecycle management > capabilities to Sidecar, giving operators the ability to start and stop > Cassandra instances programmatically. However, Sidecar currently has no way > to manipulate the configuration files that those instances consume at startup.
> > Many Cassandra settings (memtable configuration, SSTable settings, > storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and > must be set in cassandra.yaml or JVM options files, requiring a restart to > take effect. Managing these files manually or through custom tooling is > cumbersome and lacks a stable API. > > This CEP extends Sidecar's lifecycle management by adding configuration > management capabilities for persisted configuration artifacts. It introduces > a REST API for reading and updating cassandra.yaml and JVM options, a > pluggable ConfigurationProvider abstraction for integration with centralized > configuration systems (etcd, Consul, or custom backends), and version-aware > validation to prevent startup failures. > > This CEP also serves as a prerequisite for future Cassandra upgrades via > Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires > updating storage_compatibility_mode in cassandra.yaml. The configuration > management capabilities introduced here will enable Sidecar to orchestrate > such upgrades by updating configuration artifacts alongside binary version > changes. > > The CEP is linked here: > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar > > Looking forward to your feedback! > > Thanks, > > Paulo > > [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266 > > >
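For readers skimming the thread, the "base template + provider overlay" materialization discussed above can be pictured as a map merge in which overlay values win. A minimal sketch with flat keys, where all names are assumed for illustration (the CEP's actual merge semantics may differ):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: materialize an instance's effective config as
// base template plus per-instance overlay, with overlay entries overriding
// the base. Not the CEP's actual implementation.
final class ConfigMaterializer {

    static Map<String, Object> materialize(Map<String, Object> baseTemplate,
                                           Map<String, Object> overlay) {
        Map<String, Object> merged = new HashMap<>(baseTemplate);
        merged.putAll(overlay); // overlay values replace base-template values
        return merged;
    }
}
```

A real implementation would need to merge nested YAML sections rather than replacing them wholesale, and would write the merged result to disk before node startup.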
