Re: [DISCUSS] CEP-55 Generated role names

Štefan Miklošovič Fri, 19 Sep 2025 01:04:21 -0700

On Thu, Sep 18, 2025 at 10:13 PM Johnny Miller <[email protected]>
wrote:


> Hey everyone,
>
> I wanted to share a few thoughts based on the CEP and this thread. From my
> understanding:
>
> 1 - The Cassandra sidecar uses a standard authentication/authorization
> mechanism to connect to Cassandra, just like any other client application,
> and is limited to the actions permitted by its certificate/role mapping.
> 2 - There’s a proposal for some convenient CQL statements (CREATE
> GENERATED ROLE etc.) that would allow generation of random role names
> (similar to the existing random password functionality).
> 3 - The sidecar would (or could) expose an API for operators to generate
> multiple users with random usernames, which would in turn delegate to this
> new convenience CQL.
> 4 - If you don't use sidecar you can still leverage this new CQL via other
> processes (Vault, Ansible, Bash, whatever).
>
> I don't understand why sidecar is relevant here. It's just another app,
> like Vault, bash, Ansible etc., using this new CQL, which could be reused
> by those frameworks as well. If it's in CQL then it's going to be reused by
> other tech where sidecar is not deployed. I would like to use this new CQL
> because I need to leverage it via what I am able to deploy, and sidecar is
> not always going to be an option. In fact, the Vault Cassandra plugin could
> use this too, and it is widely adopted and approved in a lot of
> enterprises.
>

See, Josh? Here you have it in black and white. The existence of something
does not guarantee its usage at all. I think we should be brutally honest
here about that. It will take _years_ before any highly regulated
environment (banks, for example) sees Sidecar as a viable, vetted and
audited component they might even start to consider integrating into their
environments. But there are already people on the ground who have to play
the cards they have. I do not think it is reasonable to reject an
in-database solution just for the promise of something different later. It
is not that I do not want Sidecar to be successful, the very opposite is
true, but we have to be realists first.

Also, notice how nobody actually objects to the addition at the CQL level.
"Freezing CQL" is pretty low on the priority list here. It is a nice-to-have
at best and a great goal in an ideal world, but when actually faced with it
nobody seems to be ultimately against the addition.
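
For illustration only, here is a minimal Python sketch of what "playing the
cards you have" tends to look like today: a script generates the role name
on the client side and then issues a plain CREATE ROLE. The function names,
the prefix, the length and the alphabet are all hypothetical and are not
part of CEP-55. Note that the generated name ends up inside the statement
text, which is why it would show up in audit logs, as discussed further down
in this thread.

```python
import secrets
import string

# Hypothetical alphabet: lowercase letters and digits only.
ALPHABET = string.ascii_lowercase + string.digits


def generate_role_name(prefix="app_", length=16, existing=frozenset()):
    """Return a random role name of exactly `length` characters
    that starts with `prefix` and is not already in `existing`."""
    if length <= len(prefix):
        raise ValueError("length must exceed the prefix length")
    while True:
        tail = "".join(
            secrets.choice(ALPHABET) for _ in range(length - len(prefix))
        )
        name = prefix + tail
        if name not in existing:
            return name


def create_role_statement(name, password):
    """Build a plain CREATE ROLE statement for a client-generated name.

    The role name is embedded in the statement text itself.
    """
    escaped = password.replace("'", "''")  # escape single quotes for CQL
    return f"CREATE ROLE \"{name}\" WITH PASSWORD = '{escaped}' AND LOGIN = true"


if __name__ == "__main__":
    role = generate_role_name(prefix="svc_", length=20)
    print(create_role_statement(role, secrets.token_urlsafe(24)))
```

Everything above (generation, uniqueness check, escaping) is plumbing each
integrator currently maintains on their own.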


>
> What I’m struggling to understand is where the “sidecar” aspect makes a
> difference. If it’s simply acting as a regular application - authenticating
> and executing CQL like Ansible, Vault, bash scripts, or any other client -
> then I don’t see any issue.
>
> However, if the sidecar is bypassing RBAC or given some special ability to
> interact with Cassandra’s DCLs outside of the normal authentication and
> authorization flow, that would be a serious concern. It would undermine
> both the security model and auditing guarantees. In my view, the sidecar
> should behave like any other client with a named user and explicitly
> assigned permissions.
>
> I’m still relatively new to the details of the sidecar project, so if
> there is a special non-standard path exposed that allows it to circumvent
> existing RBAC and auditing controls, that feels risky and like a potential
> security hole. If that’s not the case, and it’s just an app leveraging some
> convenience CQL which may help people, then I don't see a problem. I have
> usually done this with things like
> https://developer.hashicorp.com/vault/docs/secrets/databases/cassandra -
> but if there's a convenience CQL that does this better and is also audited
> then that's safer, no?
>

Yes. Simpler = safer. But to answer your question, I believe the roles you
authenticate with against endpoints are mapped to roles in Cassandra.
Please see this section (1). So there is no custom auth / custom bypassing
etc. It is mapped to Cassandra.

(1)
https://github.com/apache/cassandra-sidecar/blob/trunk/conf/sidecar.yaml#L217-L279


>
> Johnny
>
> On Thu, 18 Sept 2025 at 12:39, Štefan Miklošovič <[email protected]>
> wrote:
>
>> This is getting too complex so I have summarized pros / cons for each
>> approach. I am taking Patrick's suggestions into consideration as well so
>> nobody can say that I have completely disregarded them.
>>
>>
>> Generation in Sidecar
>>
>> - user has to deploy Sidecar
>> - needs to secure communication channels (TLS)
>> - calling "create role abc ..." will leak it in audit logs
>> - there would need to be a pluggable way to configure a generator able to
>> talk to external services, then additional complexity with patching Sidecar
>> - if this is not done then an extra layer of complexity to interpret the
>> response, putting more stress on integrators
>> - one Sidecar is enough to be able to create users. We need to configure
>> just one Sidecar to start to call endpoints capable of user creation.
>> - will be available for other Cassandra versions as well
>>
>> Generation in Cassandra
>>
>> - will be available only in trunk onwards
>> - custom integrations done by implementing IRoleManager and returning
>> custom response
>> - talking via (secure) CQL, no additional plumbing
>> - nothing leaks in audit logs
>> - It might be possible to code IRoleManager in such a way that user
>> credentials would not be stored in Cassandra at all. All operations dealing
>> with user management might be just proxied to external service (vault etc)
>> so no credentials whatsoever would be stored in Cassandra. The advantage of
>> that is that everything would be implemented in one place and CREATE
>> GENERATED ROLE would be completely transparent from the user's perspective.
>> This can not be achieved in Sidecar, it can not abstract away what
>> IRoleManager is doing.
>> - Cassandra would need to be configured in cassandra.yaml on each node.
>> While this might seem sub-optimal, these things are configured just once
>> and then creating the next node comes for free as the configuration is
>> taken from some template (same as it would be done for Sidecar anyway).
>>
>> In case we wanted to reconfigure Sidecar to talk to another external
>> service or to reconfigure the generation as such, we would need to take
>> Sidecar down, change config, and start it up again. In the case of
>> Cassandra, it is possible to reconfigure this via JMX at runtime so no
>> restart is necessary. This functionality would be based on Guardrails which
>> already exposes GuardrailsMBean. I do not think there is a similar
>> counterpart of this functionality in Sidecar yet. You can not change the
>> settings on the fly. This would bring additional complexity to Sidecar
>> which is free in Cassandra already.
>>
>> There might also be a fusion of these approaches:
>>
>> - Sidecar would expose the endpoint.
>> - Sidecar would call "CREATE GENERATED ROLE".
>> - The response would be returned already processed by Cassandra.
>>
>> So from Sidecar's point of view, it would just call an endpoint while the
>> actual generation would be done in Cassandra. The advantage of that is that
>> Cassandra might implement a completely custom IRoleManager with all logic
>> treating role management in a complex way (talking to external services
>> etc), but by the means of Sidecar it might be integrated further.
>>
>> On Thu, Sep 18, 2025 at 10:42 AM Štefan Miklošovič <
>> [email protected]> wrote:
>>
>>> By the way, if you do it in Sidecar, that is, you generate the username
>>> in Sidecar and then send it via CQL as "create role abc ...", that exact
>>> statement will also be visible in audit logs. However, if you do "create
>>> generated role", this will not leak. If you still want this to be
>>> visible somehow, you might consider turning on Cassandra's Diagnostic
>>> Events and propagating this information to whatever sink you want.
>>>
>>> Also, by doing it in Sidecar, you make Jaydeep's idea about coding his
>>> own CassandraRoleManager, which would interpret credentials stored e.g.
>>> in some vault, more clunky.
>>>
>>> He would need to touch two things. First he would need to call Sidecar's
>>> endpoint, the endpoint would generate credentials, the credentials would
>>> be sent to Cassandra, the role would be created, and Sidecar would need to
>>> interpret these credentials in whatever way Jaydeep sees fit. So he would
>>> need to either have _yet another layer of abstraction_ outside of Sidecar
>>> (more work) to interpret what Sidecar returned to him, or he would need to
>>> patch Sidecar (more work) and make a special generator (more work) which
>>> would know how to talk to whatever external service handles credentials.
>>> Then this service would also need to be configured somehow from Sidecar's
>>> point of view and made pluggable (more work).
>>>
>>> On Thu, Sep 18, 2025 at 8:08 AM Štefan Miklošovič <
>>> [email protected]> wrote:
>>>
>>>> That's right. I also think it is smaller. If you think about it in
>>>> purely practical terms, you would need to set up Sidecar, then make the
>>>> connection secure via TLS etc. (otherwise username and password would
>>>> travel from Sidecar to target recipient of these credentials via
>>>> plaintext). Then you need to authenticate the actual caller of that
>>>> endpoint so it can reach it in order to call Cassandra to create users for
>>>> that ...
>>>>
>>>> What if you do not want to do ANY OF THESE THINGS?
>>>>
>>>> I mean ... if somebody is serious about Sidecar, all these things would
>>>> be done probably anyway but it is just an unnecessary hurdle to jump over
>>>> if one just needs to get the job done. By forcibly siphoning everything
>>>> through Sidecar, the very first question of a user would be: why? Why do I
>>>> need to take more steps in achieving something? Just to satisfy somebody's
>>>> architectural desires?
>>>>
>>>> Does it make sense to have e.g. CEP about cluster wide restarts in
>>>> Sidecar? Yes. Sure. Does it make sense to force people to call Sidecar to
>>>> create some users? No. It should be possible to do it with the least amount
>>>> of plumbing possible. The use cases are various.
>>>>
>>>> On Wed, Sep 17, 2025 at 11:03 PM Joel Shepherd <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> On 9/17/2025 1:21 AM, Štefan Miklošovič wrote:
>>>>>
>>>>> On Wed, Sep 17, 2025 at 2:17 AM Joel Shepherd <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Could I make a suggestion? Well, I will make a suggestion :-) , but
>>>>>> if it's not useful then feel free to ignore it.
>>>>>>
>>>>>> Could we talk a bit about how users/operators would work with the
>>>>>> CREATE ROLE features you're proposing?
>>>>>> Somewhat related to that ... is there any need for role "stability"
>>>>>> across clusters: e.g. I want to create a role that can access existing
>>>>>> tables but not create/drop tables or keyspaces, and for my own sanity I
>>>>>> want that role to have the same name on every cluster I operate. Do I
>>>>>> have to implement a custom role name generator to do that, or is that
>>>>>> common enough functionality that it should be supportable by the
>>>>>> tooling I'm using to manage my clusters?
>>>>>>
>>>>>
>>>>> I do not think we have such a requirement for "stability". If you had
>>>>> this requirement then you would not use the feature we are discussing here
>>>>> and you would create the roles manually. I also do not think that having
>>>>> the same name everywhere is a good idea in general. A username is security
>>>>> sensitive as well.
>>>>>
>>>>> We can agree to disagree on this. :-)  I generally don't think names
>>>>> should be considered especially sensitive but am really looking at this
>>>>> more from how end-users are going to work with the capability.
>>>>>
>>>>> The use-case as I understand it is that there are organizations that
>>>>>> have or are going to create large numbers of clusters (say > 3), and
>>>>>> they would appreciate some automation around creating role names and
>>>>>> credentials for all those clusters. The proposal is to extend the
>>>>>> CREATE ROLE statement to enable the database to generate those names
>>>>>> and credentials automatically, including persisting them in the
>>>>>> database itself.
>>>>>>
>>>>>> One thing I'm wondering about is what kind of tooling those
>>>>>> organizations are likely to be using for creating/managing all those
>>>>>> clusters. Are they going to be scripting, or are they going to be using
>>>>>> some third-party tooling like Terraform, CloudFormation, Puppet, etc.? If
>>>>>> they're using tooling like that, which is going to be a more natural fit:
>>>>>> making role/password generation available through CQL, or through Sidecar
>>>>>> APIs, or ... ? I don't have an opinion at the moment so that's not a
>>>>>> rhetorical question. I'd actually like to reason through what's going to
>>>>>> work best for the folks who actually have to manage tons of clusters all
>>>>>> day long.
>>>>>>
>>>>>
>>>>> I do not see why we should have a ton of logic / functionality outside
>>>>> of Cassandra for doing basic things. I think that Cassandra is notorious
>>>>> for its "do it yourself" approach and I think _that_ is the primary
>>>>> impediment to broader adoption, not whether we dare to introduce CREATE
>>>>> GENERATED ROLE or not. The focus on usability is completely missing. For
>>>>> a lot of things you want, you have to have "tooling" which you need to
>>>>> take care of and so on. People are sick of it. They just want to do the
>>>>> thing in the most efficient and time-saving manner.
>>>>>
>>>>> This isn't an either-or question. I'm not posing "CREATE GENERATED
>>>>> ROLE" vs infra-as-code (IAC) support. I'm poking at the best way for the
>>>>> two to work together. Because I think/hope that most people who run large
>>>>> clusters and/or a lot of clusters (or really a lot of instances of any
>>>>> kind of service) use some flavor of IAC. There is a lot more than
>>>>> Cassandra to manage: there's the hosts, disk in some form, networking,
>>>>> OS, config, keys, schema, etc. If I already have a tool to manage all the
>>>>> infra, it'd be nice for Cassandra to play nicely with that tooling so I
>>>>> can do my basic cluster set-up via automation as well. That doesn't
>>>>> exclude me from putting down my IAC tool and continuing on to do
>>>>> Cassandra configuration in Cassandra if I wish ... but in my mind having
>>>>> to jump between tools (including cqlsh) to configure different aspects of
>>>>> all the things involved in standing up my cluster is not a usability
>>>>> improvement ... especially if I have to do it a lot.
>>>>>
>>>>> So I'm trying to shed some light on the Sidecar and/or CQL debate by
>>>>> asking how people are going to be using this functionality "at scale"
>>>>> (where efficient and time-saving may look very different from adhoc use)
>>>>> and if there's any benefit to API access via Sidecar vs access via CQL.
>>>>>
>>>>> (TBH, I'm actually leaning towards your CQL proposal because I think
>>>>> the attack surface is actually smaller than it is with letting Sidecar
>>>>> execute CQL on the API caller's behalf.)
>>>>>
>>>>> Thanks -- Joel.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> When I was introduced to this community for the first time, around
>>>>> 2015-16 maybe, I remember that there was somebody on the mailing list
>>>>> complaining that "repair should be automatic", "that should be provided",
>>>>> "this should be natively in". People have been saying this for years. It
>>>>> took just 9 years to finally introduce automatic repairs. Thank god for
>>>>> the repair people finally doing that. They are worth their weight in
>>>>> gold. But the response back then was "well, if you need it you need to
>>>>> write it yourself, there is no one size fits all, you need to take care
>>>>> of that yourself". Just imagine that. This was a genuinely meant
>>>>> response. How are we going to make this popular if everything beyond the
>>>>> trivial is left to the end user to figure out? Who in their right mind
>>>>> is going to put up with that? People just want to turn the thing on and
>>>>> not think too much about it anymore.
>>>>>
>>>>>
>>>>>> I don't have strong opinions on CQL vs Sidecar, but I think one way
>>>>>> to frame the debate is to look at which will work best with the tooling
>>>>>> that people already use to manage large numbers of clusters.
>>>>>>
>>>>>> Thanks -- Joel.
>>>>>> On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
>>>>>>
>>>>>>
>>>>>> Oh crap, what feedback! If nothing else, this shows everybody that the
>>>>>> surest way to get fast feedback, if you are tired of waiting or
>>>>>> impatient to move quickly, is to just propose your ideas, then boldly
>>>>>> proclaim you are going to do something, and the universe will
>>>>>> mysteriously take care of finding somebody who will reject it. Because
>>>>>> people are not always interested in agreeing. A lot of the time, they
>>>>>> take action only when they disagree and are put in front of it. So don't
>>>>>> be afraid to take some flak as soon as possible!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Mick, I'm just digging into this more after a long week of
>>>>>>> travel.
>>>>>>>
>>>>>>> Generally, I'm -1 for adding more custom syntax. Another concern of
>>>>>>> mine is adding control plane actions in DDL. I understand the
>>>>>>> usefulness of a feature like this in ops. It's a great idea. Here
>>>>>>> would be my counterproposal:
>>>>>>>
>>>>>>>  - Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid
>>>>>>> making changes to core Cassandra.
>>>>>>>
>>>>>>
>>>>>> Why should we keep it "as is"? Genuinely asking. Why? Where is this
>>>>>> need for conserving stuff coming from? Is this what we are doing here?
>>>>>> Adding as little as possible? I think we are stifling innovation
>>>>>> unnecessarily. There was the same discussion about constraints and CHECK
>>>>>> NOT NULL / NOT NULL where we were trying to follow "the Holy
>>>>>> Postgres Grail". I just don't get it. Are we not obsessed with that at
>>>>>> this point? Literally nobody cares if there will be CREATE GENERATED
>>>>>> ROLE. Nobody. Cares. So I do not take this point of yours as valid
>>>>>> without some strong backing from your side.
>>>>>>
>>>>>>
>>>>>>>  - Move the generation & policy to the sidecar project. A sidecar
>>>>>>> endpoint will generate the role name/password, enforce
>>>>>>> prefix/suffix/length requirements, ensure uniqueness, and then return
>>>>>>> the role and password (or a secret handle) to the caller.
>>>>>>>
>>>>>>
>>>>>> Well the problem I see in putting this to Sidecar is that this would
>>>>>> be only possible to do via HTTP(S). Not everybody is interested in it.
>>>>>> Hardly. Zero interest. Sidecar is 0.2.0 at this point. I think that
>>>>>> realistically speaking I am not far from the truth at all if I say that
>>>>>> there is practically nobody who is using 0.2.0 in production. 0.2.0. I do
>>>>>> not count exceptions such as early adopters or Analytics.
>>>>>>
>>>>>> Putting this to Sidecar almost guarantees nobody is going to use this
>>>>>> particular functionality. People have their own control planes, their own
>>>>>> way of generating this stuff and they are not going to deploy Sidecar
>>>>>> just because they want to delegate this task to it. Come on. I think
>>>>>> that it would, paradoxically, create more problems for them. Not less.
>>>>>> So again, I do not take this point as something which is solving
>>>>>> anything. This will have 0 users when put in Sidecar. I think it would
>>>>>> be better if we just flat out refused this instead of putting it into
>>>>>> Sidecar. That is even worse imho.
>>>>>>
>>>>>> Another problem with Sidecar I see is that the current implementation
>>>>>> is pluggable. How do you want to make this pluggable in Sidecar?
>>>>>> Pluggable how? People might have their own opinion on how role names
>>>>>> should be generated. That is why you can just code your own generator /
>>>>>> validator, put it on the class path and be done with it. How are you
>>>>>> supposed to "patch Sidecar"? You create a custom implementation, then
>>>>>> you put it on the class path of Sidecar? Is this even supported? I
>>>>>> think that you have proposed it in good faith but I don't think that
>>>>>> would fly.
>>>>>>
>>>>>>
>>>>>>> Why?
>>>>>>>  - End users will have it faster since it will work with any version
>>>>>>> of Cassandra supporting the CREATE syntax. (No having to backport
>>>>>>> either)
>>>>>>>  - Keeps control plane actions optional and separated. Not an attack
>>>>>>> surface inside core Cassandra
>>>>>>>
>>>>>>
>>>>>> Thirdly, what _attack surface_? I think you are pretty aware of the
>>>>>> fact that this feature is by default turned off. If you have an
>>>>>> organisation deploying hundreds of clusters and for each they have to
>>>>>> figure out some role name for a user which is going to use it, how is
>>>>>> this going to be abused concretely? There are dedicated accounts for
>>>>>> CQL management, creation of a role is tied to some workflow etc. What
>>>>>> is attacked exactly and how? Concrete examples please.
>>>>>>
>>>>>> Dineshi had the concern that "what if we just have a script which
>>>>>> will generate roles repeatedly nonstop?" How is this different from
>>>>>> having a script which would generate roles itself instead of Cassandra
>>>>>> and then execute that? What's the difference really? If you want to
>>>>>> abuse it you will. There is no protection against that unless we put
>>>>>> some rate limiting in front of it, which I have no problem addressing
>>>>>> in follow-up work as already explained.
>>>>>>
>>>>>>
>>>>>>>  - We keep the syntax of CQL more generic and less one-off.
>>>>>>>
>>>>>>
>>>>>> I don't think this is relevant, really. I think we should abandon
>>>>>> this mindset. At this point, to make the point, I suspect that CQL had to
>>>>>> "hurt you" somehow :)
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>>>  - k8s/Cloud native friendly with separation of control plane/data
>>>>>>> plane.
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 16, 2025 at 7:31 AM Mick <[email protected]> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> > I think enough time passed for everybody to participate in the
>>>>>>>> discussion so I would just move on and start the voting thread soon.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Can we give CEP discussions longer than ~one week, please.
>>>>>>>>
>>>>>>>> Folk are easily away/offline for a whole week. Take for example
>>>>>>>> many who were at Community over Code and may still be catching up on
>>>>>>>> their inbox, thinking dev@ is a less urgent folder.
>>>>>>>>
>>>>>>>> I haven't looked at how fast the other CEP discuss threads have
>>>>>>>> turned around, I apologise if I'm only singling one out, my concern
>>>>>>>> applies generally.
>>>>>>>>
>>>>>>>>
