Hi everyone,

Following up on my earlier comment, I'd like to elaborate on the UDF-based approach for role generation that I mentioned, as it might offer an interesting alternative perspective to consider alongside CEP-55.

I believe we could leverage Cassandra's existing UDF infrastructure to achieve the same goals as CEP-55, with lower implementation complexity. Here's the approach:

We could extend the existing CREATE ROLE statement to accept UDF expressions for both role names and passwords:

    -- Generate role name using UDF
    CREATE ROLE generate_role_name() WITH PASSWORD = 'static_password';

    -- Generate both role name and password
    CREATE ROLE generate_role_name() WITH PASSWORD = generate_password(16);

    -- Compose with other functions
    CREATE ROLE 'service_' + uuid() WITH PASSWORD = secure_random(20);

I've actually prototyped this approach and have it working on a branch. It leverages existing UDF features, follows similar patterns to existing CQL statements, and offers operational simplicity (we can just CREATE/DROP functions without updating config/restarts/class path management). I think it's a natural evolution of existing capabilities rather than a new feature category.

This isn't meant to replace CEP-55, but rather to offer an alternative implementation path that might achieve the same goals with different trade-offs. If the community prefers CEP-55's explicit CREATE GENERATED ROLE syntax for clarity, that's completely valid. However, if there's interest in exploring a more composable, function-based approach, I'd be happy to share the prototype and discuss further.

I appreciate the thorough discussion everyone has had on this topic; it really highlights the thoughtful consideration this community gives to new features.

Best regards,
Harikrishna

On Fri, Sep 19, 2025 at 7:22 PM Josh McKenzie <[email protected]> wrote:

> We end up with duplicate implementation
>
> On further reflection, if we had some kind of shared library the sidecar
> and C* could both rely on where we could place CQL-based operations, we
> wouldn't have this struggle w/duplication, where we place functionality,
> and version-based support.
>
> Another thing I would *not* suggest we block this CEP on *at all*, but
> just an interesting data point IMO.
>
> On Fri, Sep 19, 2025, at 9:42 AM, Josh McKenzie wrote:
>
> Wow - I seem to really have struck a nerve.
>
> Let me reiterate what I closed my earlier email with: Why. Not. Both.
>
> Nobody is suggesting we gatekeep things and put them only in the sidecar
> to try and coerce people to use it.
>
> Let me reiterate: I strongly disagree with characterizing features added
> to the sidecar as:
>
> Putting this to Sidecar almost guarantees nobody is going to use this
> particular functionality.
>
> That's dismissive, implies that adding features to the sidecar is a waste
> of time, and is directly stating that putting things in the sidecar will
> "almost guarantee nobody is going to use" it. Which is clearly false given
> the multiple large organizations with large cassandra fleets who are
> actively integrating the sidecar with their environments today.
>
> So let's try to back away from the misunderstanding / straw-man that
> anyone is suggesting we strategically place features in certain places to
> force peoples' hands, and instead stay focused on the discussion at hand.
>
> We have 3 paths I can see:
>
> 1. We do it in C* only and expose the API through sidecar as well.
>    This means either:
>    1. It'll be available in trunk only
>    2. Or we open Pandora's Box and talk about backporting features to
>       older GA branches of C* if we want this functionality on all GA
>       versions of C*
> 2. We do it in the sidecar and add support for each version of C*.
>    This means:
>    1. It'll only be available to people using the sidecar
> 3. We do it in both C* and the sidecar. This means:
>    1. It'll be available on all GA versions of C* w/out backporting
>    2. We *don't* have to tackle the backporting question
>    3. We end up with duplicate implementation
>
> My instinct is we should go with #1: do it in C*, expose the API through
> the sidecar, and separately open up a thread of discussion on the dev list
> about our backporting policy since it seems like a lot of people are
> backporting features to older GA branches anyway. Plus we have some real
> hard blockers that are going to slow adoption of new versions of C*
> (one-way doors that increase risk), so if we want this functionality
> available to users in the near future we'll need to tackle that question.
>
> I definitely DON'T think we should block this CEP on us having a hard
> conversation about backports so:
> - feature in C*
> - exposed via sidecar
> - conversation about backporting separately
>
> is my preference fwiw.
>
> On Fri, Sep 19, 2025, at 4:03 AM, Štefan Miklošovič wrote:
>
> On Thu, Sep 18, 2025 at 10:13 PM Johnny Miller <[email protected]>
> wrote:
>
> Hey everyone,
>
> I wanted to share a few thoughts based on the CEP and this thread. From my
> understanding:
>
> 1 - The Cassandra sidecar uses a standard authentication/authorization
> mechanism to connect to Cassandra, just like any other client application,
> and is limited to the actions permitted by its certificate/role mapping.
> 2 - There’s a proposal for some convenient CQL statements (CREATE
> GENERATED ROLE etc.) that would allow generation of random rolenames
> (similar to the existing random password functionality).
> 3 - The sidecar would (or could) expose an API for operators to generate
> multiple users with random usernames, which would in turn delegate to this
> new convenience CQL
> 4 - If you don't use sidecar you can still leverage this new CQL via other
> processes (Vault, Ansible, Bash, whatever)
>
> I don't understand why sidecar is relevant here. It's just another app
> like vault, bash, ansible etc. using this new CQL which could be reused
> for those frameworks also. If it's in CQL then it's going to be reused by
> other tech where sidecar is not being deployed. I would like to use this
> new CQL as I need to leverage it via what I am able to deploy and sidecar
> is not always going to be an option - in fact the vault cassandra plugin
> could use this also and is pretty widely adopted and approved in a lot
> of enterprises.
>
> See, Josh? Here you have it in black and white. The existence of something
> does not guarantee its usage at all. I think we should be brutally honest
> here about that. It will take _years_ before any highly regulated environment
> e.g. banks etc. will see Sidecar as a viable, vetted and audited component
> they might even start to consider integrating into their environments. But
> there are already people on the ground who have to play the cards they
> have. I do not think it is reasonable to reject an in-database solution
> just for the promise of something different later. It is not about me not
> wanting Sidecar to be successful, the very opposite is true, but we have to
> be realists first.
>
> Also, notice how nobody actually protests the addition on CQL level.
> "Freezing CQL" is pretty low on the priorities list here. It is nice to
> have at best and it is a great goal in the ideal world but when actually
> facing it nobody seems to be ultimately against it.
>
> What I’m struggling to understand is where the “sidecar” aspect makes a
> difference. If it’s simply acting as a regular application - authenticating
> and executing CQL like Ansible, Vault, bash scripts, or any other client -
> then I don’t see any issue.
>
> However, if the sidecar is bypassing RBAC or given some special ability to
> interact with Cassandra’s DCLs outside of the normal authentication and
> authorization flow, that would be a serious concern. It would undermine
> both the security model and auditing guarantees. In my view, the sidecar
> should behave like any other client with a named user and explicitly
> assigned permissions.
>
> I’m still relatively new to the details of the sidecar project, so if
> there is a special non-standard path exposed that allows it to circumvent
> existing RBAC and auditing controls, that feels risky and like a potential
> security hole. If that’s not the case, and it’s just an app leveraging some
> convenience CQL which may help people and I have usually done this with
> things like
> https://developer.hashicorp.com/vault/docs/secrets/databases/cassandra -
> but if there's a convenience CQL that does this better and is also audited
> then that's safer, no?
>
> Yes. Simpler = safer. But to answer your question, I believe the roles you
> authenticate with against endpoints are mapped to roles in Cassandra.
> Please see this section (1). So there is no custom auth / custom bypassing
> etc. It is mapped to Cassandra.
>
> (1)
> https://github.com/apache/cassandra-sidecar/blob/trunk/conf/sidecar.yaml#L217-L279
>
> Johnny
>
> On Thu, 18 Sept 2025 at 12:39, Štefan Miklošovič <[email protected]>
> wrote:
>
> This is getting too complex so I have summarized pros / cons for each
> approach. Taking Patrick's suggestions into consideration as well so nobody
> can tell that I have completely disregarded that.
>
> Generation in Sidecar
>
> - user has to deploy Sidecar
> - needs to secure communication channels (TLS)
> - calling "create role abc ..." will leak it in audit logs
> - there would need to be a pluggable way to configure a generator able to
>   talk to external services, then additional complexity with patching Sidecar
> - if this is not done then an extra layer of complexity to interpret the
>   response, putting more stress on integrators
> - one Sidecar is enough to be able to create users. We need to configure
>   just one Sidecar to start to call endpoints capable of user creation.
> - will be available for other Cassandra versions as well
>
> Generation in Cassandra
>
> - will be available only in trunk onwards
> - custom integrations done by implementing IRoleManager and returning
>   custom response
> - talking via (secure) CQL, no additional plumbing
> - nothing leaks in audit logs
> - It might be possible to code IRoleManager in such a way that user
>   credentials would not be stored in Cassandra at all. All operations dealing
>   with user management might be just proxied to external service (vault etc)
>   so no credentials whatsoever would be stored in Cassandra. The advantage of
>   that is that everything would be implemented in one place and CREATE
>   GENERATED ROLE would be completely transparent from the user's perspective.
>   This can not be achieved in Sidecar, it can not abstract away what
>   IRoleManager is doing.
> - Cassandra would need to be configured in cassandra.yaml on each node.
>   While this might seem as sub-optimal, these things are configured just once
>   and then the creation of next node is for free as the configuration is
>   taken from some template (same as it would be done for Sidecar anyway).
>
> In case we wanted to reconfigure Sidecar to talk to another external
> service or to reconfigure the generation as such, we would need to take
> Sidecar down, change config, and start it up again. In the case of
> Cassandra, it is possible to reconfigure this via JMX in runtime so no
> restart is necessary. This functionality would be based on Guardrails which
> already exposes GuardrailsMBean. I do not think there is a similar
> counterpart of this functionality in Sidecar yet. You can not change the
> settings on the fly. This would bring additional complexity to Sidecar
> which is free in Cassandra already.
>
> There might be also the fusion of these approaches:
>
> - Sidecar would expose the endpoint.
> - Sidecar would call "CREATE GENERATED ROLE"
> - Response would be already returned, processed in Cassandra.
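
A minimal sketch of the fused flow in the three bullets above, in Python, may help make it concrete. Everything here is illustrative: `handle_create_role`, `FakeSession`, and the `role` / `password` column names are hypothetical, the statement text is only what the thread quotes (the full CEP-55 grammar is not shown here), and `execute` stands in for whatever authenticated CQL session Sidecar would use.

```python
from typing import Callable, Dict, List

# Statement text as quoted in the thread; the full CEP-55 grammar and the
# shape of the returned row are assumptions made for this sketch.
CREATE_GENERATED_ROLE = "CREATE GENERATED ROLE"


def handle_create_role(execute: Callable[[str], Dict[str, str]]) -> Dict[str, str]:
    """Hypothetical Sidecar endpoint handler for the fused approach.

    `execute` stands in for an authenticated CQL session. The handler sends a
    fixed statement, so no role name or password appears in the CQL text;
    Cassandra generates both and returns them in the response.
    """
    row = execute(CREATE_GENERATED_ROLE)
    # Relay what Cassandra generated; the column names are assumed.
    return {"role": row["role"], "password": row["password"]}


class FakeSession:
    """In-memory stand-in for Cassandra, used only to exercise the handler."""

    def __init__(self) -> None:
        self.statements: List[str] = []  # the CQL text a statement log would see

    def execute(self, statement: str) -> Dict[str, str]:
        self.statements.append(statement)
        # Canned values; a real node would generate these itself.
        return {"role": "role_7f3a", "password": "s3cr3t-generated"}


if __name__ == "__main__":
    session = FakeSession()
    print(handle_create_role(session.execute))
    print(session.statements)
```

The point of the shape is that the handler never builds a `create role abc ...` string itself, which is the audit-log concern raised in this thread; whether a real deployment behaves this way depends on how CEP-55 is finally implemented.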
>
> So from Sidecar's point of view, it would just call an endpoint while the
> actual generation would be done in Cassandra. The advantage of that is that
> Cassandra might implement a completely custom IRoleManager with all logic
> treating role management in a complex way (talking to external services
> etc), but by the means of Sidecar it might be integrated further.
>
> On Thu, Sep 18, 2025 at 10:42 AM Štefan Miklošovič <[email protected]>
> wrote:
>
> By the way, if you do it by Sidecar - that is you generate username on
> Sidecar and then you send it via CQL so there will be "create role abc
> ...", this will be also visible in audit logs, that exact statement.
> However if you do "create generated role" this will not be leaking. If you
> want this to be still somehow visible you might consider turning on
> Cassandra's Diagnostic Events and propagating this information to whatever
> sink you want if you truly want that.
>
> Also, by doing it in Sidecar, you also make Jaydeep's idea about coding
> his own CassandraRoleManager which would interpret credentials stored e.g.
> in some vault etc. more clunky.
>
> He would need to touch two things, first he would need to call Sidecar's
> endpoint, endpoint would generate credentials, credentials would be sent to
> Cassandra, role would be created, Sidecar would need to interpret these
> credentials in whatever way Jaydeep sees right. So he would need to either
> have _yet another layer of abstraction_ outside of Sidecar (more work) to
> interpret what Sidecar returned him, or he would need to patch Sidecar
> (more work) and make special generator (more work) which would know how to
> talk to whatever external service handling credentials. Then this service
> would need to be also somehow configured from Sidecar's point of view and
> made pluggable (more work).
>
> On Thu, Sep 18, 2025 at 8:08 AM Štefan Miklošovič <[email protected]>
> wrote:
>
> That's right. I also think it is smaller. If you think about it in purely
> practical terms, you would need to set up Sidecar, then make the connection
> secure via TLS etc. (otherwise username and password would travel from
> Sidecar to target recipient of these credentials via plaintext). Then you
> need to authenticate the actual caller of that endpoint so it can reach it
> in order to call Cassandra to create users for that ...
>
> What if you do not want to do ANY OF THESE THINGS?
>
> I mean ... if somebody is serious about Sidecar, all these things would be
> done probably anyway but it is just an unnecessary hurdle to jump over if
> one just needs to get the job done. By forcibly siphoning everything
> through Sidecar, the very first question of a user would be: why? Why do I
> need to take more steps in achieving something? Just to satisfy somebody's
> architectural desires?
>
> Does it make sense to have e.g. CEP about cluster wide restarts in
> Sidecar? Yes. Sure. Does it make sense to force people to call Sidecar to
> create some users? No. It should be possible to do it with the least amount
> of plumbing possible. The use cases are various.
>
> On Wed, Sep 17, 2025 at 11:03 PM Joel Shepherd <[email protected]>
> wrote:
>
> On 9/17/2025 1:21 AM, Štefan Miklošovič wrote:
>
> On Wed, Sep 17, 2025 at 2:17 AM Joel Shepherd <[email protected]> wrote:
>
> Could I make a suggestion? Well, I will make a suggestion :-) , but if
> it's not useful then feel free to ignore it.
>
> Could we talk a bit about how users/operators would work with the CREATE
> ROLE features you're proposing?
> Somewhat related to that ... is there any need for role "stability" across
> clusters: e.g. I want to create a role that can access existing tables but
> not create/drop tables or keyspaces, and for my own sanity I want that role
> to have the same name on every cluster I operate. Do I have to implement a
> custom role name generator to do that, or is that common enough
> functionality that it should be supportable by the tooling I'm using to
> manage my clusters?
>
> I do not think we have such a requirement for "stability". If you had this
> requirement then you would not use the feature we are discussing here and
> you would create them manually. I also do not think that having the same name
> everywhere is a good idea in general. Username is security sensitive as
> well.
>
> We can agree to disagree on this. :-) I generally don't think names
> should be considered especially sensitive but am really looking at this
> more from how end-users are going to work with the capability.
>
> The use-case as I understand it is that there are organizations that have
> or are going to create large numbers of clusters (say > 3), and they would
> appreciate some automation around creating role names and credentials for
> all those clusters. The proposal is to extend the CREATE ROLE statement to
> enable the database to generate those names and credentials automatically,
> including persisting them in the database itself.
>
> One thing I'm wondering about is what kind of tooling those organizations
> are likely to be using for creating/managing all those clusters. Are they
> going to be scripting, or are they going to be using some third-party
> tooling like Terraform, CloudFormation, Puppet, etc.? If they're using
> tooling like that, which is going to be a more natural fit: making
> role/password generation available through CQL, or through Sidecar APIs, or
> ... ? I don't have an opinion at the moment so that's not a rhetorical
> question. I'd actually like to reason through what's going to work best for
> the folks who actually have to manage tons of clusters all day long.
>
> I do not see why we should have a ton of logic / functionality outside of
> Cassandra for doing basic things. I think that Cassandra is notoriously
> known for its "do it yourself" approach and I think _that_ is the
> primary impediment for broader adoption, not if we dare to introduce CREATE
> GENERATED ROLE or not. The focus on usability is completely missed. For a
> lot of things you want to have you have to have "tooling" which you need to
> take care of and so on. People are sick of it. They just want to do the
> thing in the most efficient and time-saving manner.
>
> This isn't an either-or question. I'm not posing "CREATE GENERATED ROLE"
> vs infra-as-code (IAC) support. I'm poking at the best way for the two to
> work together. Because I think/hope that most people who run large clusters
> and/or a lot of clusters (or really a lot of instances of any kind of
> service) use some flavor of IAC. There is a lot more than Cassandra to
> manage: there's the hosts, disk in some form, networking, OS, config, keys,
> schema, etc. If I already have a tool to manage all the infra, it'd be nice
> for Cassandra to play nicely with that tooling so I can do my basic cluster
> setup via automation as well. That doesn't exclude me from putting
> down my IAC tool and continuing on to do Cassandra configuration in
> Cassandra if I wish ... but in my mind having to jump between tools
> (including cqlsh) to configure different aspects of all the things involved
> in standing up my cluster is not a usability improvement ... especially if
> I have to do it a lot.
>
> So I'm trying to shed some light on the Sidecar and/or CQL debate by
> asking how people are going to be using this functionality "at scale"
> (where efficient and time-saving may look very different from ad hoc use)
> and if there's any benefit to API access via Sidecar vs access via CQL.
>
> (TBH, I'm actually leaning towards your CQL proposal because I think the
> attack surface is actually smaller than it is with letting Sidecar execute
> CQL on the API caller's behalf.)
>
> Thanks -- Joel.
>
> When I was introduced to this community for the first time, like 2015-16
> maybe, I remember that there was somebody on the mailing list complaining
> that "repair should be automatic", "that should be provided", "this should
> be natively in". People see this for years. It took just 9 years to
> finally introduce automatic repairs. Thank god for repairing people finally
> doing that. They should be weighted in gold. But the response to that was
> that "well if you need it you need to write it yourself, there is no "one
> size fits all!", you need to take care of that yourself". Just imagine
> that. This was a kind of genuinely meant response. How are we going to make
> this popular if everything beyond trivial is left to an end user to figure
> out. Who sane is going to put up with that? People just want to turn on the
> thing and not think too much about it anymore.
>
> I don't have strong opinions on CQL vs Sidecar, but I think one way to
> frame the debate is to look at which will work best with the tooling that
> people already use to manage large numbers of clusters.
>
> Thanks -- Joel.
>
> On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
>
> Oh crap, what a feedback! If nothing else this shows a lesson to everybody
> that the most sure way to have a fast feedback if you are tired of waiting
> or impatient so you can move quickly is to just propose your ideas, then
> boldly proclaim you go to do something and the universe will mysteriously
> take care of finding out somebody who will reject it. Because people are
> not always interested in agreeing. A lot of times, they take action only in
> case they don't and are put in front of it. So don't be afraid to take some
> flak as soon as possible!
>
> On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <[email protected]>
> wrote:
>
> Thanks Mick, I'm just digging into this more after a long week of travel.
> Generally, I'm -1 for adding more custom syntax. Another concern of mine
> is adding control plane actions in DDL. I understand the usefulness of a
> feature like this in ops. It's a great idea. Here would be my counter
> proposal:
> - Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid making
>   changes to core Cassandra.
>
> Why should we keep it "as is"? Genuinely asking. Why? Where is this need
> for conserving stuff coming from? Is this what we are doing here? Adding as
> little as possible? I think we are stifling innovation unnecessarily. There
> was the same discussion about constraints and CHECK NOT NULL / NOT NULL
> where we were trying to follow "the Holy Postgres Grail". I just don't get
> it. Are we not obsessed with that at this point? Literally nobody cares if
> there will be CREATE GENERATED ROLE. Nobody. Cares. So I do not take this
> point of yours as valid without some strong backing from your side.
>
> - Move the generation & policy to the sidecar project. A sidecar endpoint
>   will generate the role name/password, enforce
>   prefix/suffix/length requirements, ensure uniqueness, and then return the
>   role and password (or a secret handle) to the caller.
>
> Well the problem I see in putting this to Sidecar is that this would be
> only possible to do via HTTP(S). Not everybody is interested in it. Hardly.
> Zero interest. Sidecar is 0.2.0 at this point. I think that realistically
> speaking I am not far from the truth at all if I say that there is
> practically nobody who is using 0.2.0 in production. 0.2.0. I do not count
> exceptions as early adopters or Analytics.
>
> Putting this to Sidecar almost guarantees nobody is going to use this
> particular functionality. People have their own control planes, their own
> way of generating this stuff and they are not going to deploy Sidecar just
> because they want to delegate this task to it. Come on. I think that it
> would, paradoxically, create more problems for them. Not less. So again, I
> do not take this point as something which is solving anything. This will
> have 0 users when put in Sidecar. I think it would be better if we just
> flat out refuse this instead of putting that to Sidecar. It is even worse
> imho.
>
> Another problem with Sidecar I see is that the current implementation is
> pluggable. How do you want to make this pluggable in Sidecar? Pluggable
> how? People might have their own opinion on how role names should be
> generated. That is why you can just code your own generator / validator,
> put it on the class path and be done with it. How are you supposed to
> "patch Sidecar"? You create a custom implementation, then you put it on the
> class path of Sidecar? Is this even supported? I think that you have
> proposed it with a good will but I don't think that would fly.
>
> Why?
> - End users will have it faster since it will work with any version of
>   Cassandra supporting the CREATE syntax. (No having to backport either)
> - Keeps control plane actions optional and separated. Not an attack
>   surface inside core Cassandra
>
> Thirdly, what _attack surface_? I think you are pretty aware of the fact
> that this feature is by default turned off. If you have an organisation
> deploying hundreds of clusters and for each they have to figure out some
> role name for a user which is going to use it, how is this going to be
> abused concretely? There are dedicated accounts for CQL management,
> creation of a role is tied to some workflow etc. What is attacked exactly
> and how? Concrete examples please.
>
> Dineshi had the concern that "what if we just have a script which will
> generate roles repeatedly nonstop?" How is this different from having a
> script which would generate roles itself instead of Cassandra and it would
> execute that? What's the difference really? If you want to abuse it you
> will. There is no protection against that unless we put some rate limiting
> in front of it - which I do not have a problem to address in follow-up work
> as already explained.
>
> - We keep the syntax of CQL more generic and less one-off.
>
> I don't think this is relevant, really. I think we should abandon this
> mindset. At this point, to make the point, I suspect that CQL had to "hurt
> you" somehow :)
>
> Regards
>
> - k8s/Cloud native friendly with separation of control plane/data plane.
>
> Patrick
>
> On Tue, Sep 16, 2025 at 7:31 AM Mick <[email protected]> wrote:
>
> I think enough time passed for everybody to participate in the
> discussion so I would just move on and start the voting thread soon.
>
> Can we give CEP discussions longer than ~one week, please.
>
> Folk are easily away/offline for a whole week. Take for example many who
> were at Community over Code and may still be catching up on their inbox,
> thinking dev@ is a less urgent folder.
>
> I haven't looked at how fast the other CEP discuss threads have turned
> around, I apologise if I'm only singling one out, my concern applies
> generally.
