Re: [DISCUSS] CEP-55 Generated role names

Štefan Miklošovič Wed, 17 Sep 2025 23:08:57 -0700

That's right. I also think the attack surface is smaller. If you think about it in purely
practical terms, you would need to set up Sidecar, then make the connection
secure via TLS etc. (otherwise username and password would travel from
Sidecar to target recipient of these credentials via plaintext). Then you
need to authenticate the actual caller of that endpoint so it can reach it
in order to call Cassandra to create users for that ...


What if you do not want to do ANY OF THESE THINGS?

I mean ... if somebody is serious about Sidecar, all these things would be
done probably anyway but it is just an unnecessary hurdle to jump over if
one just needs to get the job done. By forcibly siphoning everything
through Sidecar, the very first question of a user would be: why? Why do I
need to take more steps in achieving something? Just to satisfy somebody's
architectural desires?

Does it make sense to have e.g. a CEP about cluster-wide restarts in Sidecar?
Yes. Sure. Does it make sense to force people to call Sidecar to create
some users? No. It should be possible to do it with the least amount of
plumbing possible. The use cases are various.
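
For concreteness, the generation policy debated in the quoted thread below (prefix, suffix, length, uniqueness) fits in a few lines. This is a minimal sketch in Python, not the CEP's implementation and not the pluggable generator interface; the function name, its parameters and the alphabet are assumptions made purely for illustration:

```python
import secrets
import string

# Hypothetical alphabet for the random part of the name (an assumption).
ALPHABET = string.ascii_lowercase + string.digits


def generate_role_name(existing, prefix="", suffix="", length=16, max_attempts=100):
    """Return a name of exactly `length` characters that is not in `existing`.

    `existing` stands in for whatever lookup tells us a role is already taken.
    """
    body_length = length - len(prefix) - len(suffix)
    if body_length < 1:
        raise ValueError("length leaves no room for a random part")
    for _ in range(max_attempts):
        # secrets (not random) because a role name may be security sensitive.
        body = "".join(secrets.choice(ALPHABET) for _ in range(body_length))
        name = f"{prefix}{body}{suffix}"
        if name not in existing:
            return name
    raise RuntimeError("could not find an unused role name")
```

Whether logic like this runs inside Cassandra behind CQL or behind a Sidecar endpoint is exactly the point under debate; the sketch takes no side on that.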

On Wed, Sep 17, 2025 at 11:03 PM Joel Shepherd <[email protected]> wrote:

>
> On 9/17/2025 1:21 AM, Štefan Miklošovič wrote:
>
> On Wed, Sep 17, 2025 at 2:17 AM Joel Shepherd <[email protected]> wrote:
>
>> Could I make a suggestion? Well, I will make a suggestion :-) , but if
>> it's not useful then feel free to ignore it.
>>
>> Could we talk a bit about how users/operators would work with the CREATE
>> ROLE features you're proposing?
>> Somewhat related to that ... is there any need for role "stability"
>> across clusters: e.g. I want to create a role that can access existing
>> tables but not create/drop tables or keyspaces, and for my own sanity I
>> want that role to have the same name on every cluster I operate. Do I have
>> to implement a custom role name generator to do that, or is that common
>> enough functionality that it should be supportable by the tooling I'm using
>> to manage my clusters?
>>
>
> I do not think we have such a requirement for "stability". If you had this
> requirement then you would not use the feature we are discussing here;
> you would create the roles manually. I also do not think that having the same
> name everywhere is a good idea in general. A username is security sensitive
> as well.
>
> We can agree to disagree on this. :-)  I generally don't think names
> should be considered especially sensitive but am really looking at this
> more from how end-users are going to work with the capability.
>
>> The use-case as I understand it is that there are organizations that have
>> or are going to create large numbers of clusters (say  > 3), and they would
>> appreciate some automation around creating role names and credentials for
>> all those clusters. The proposal is to extend the CREATE ROLE statement to
>> enable the database to generate those names and credentials automatically,
>> including persisting them in the database itself.
>>
>> One thing I'm wondering about is what kind of tooling those organizations
>> are likely to be using for creating/managing all those clusters. Are they
>> going to be scripting, or are they going to be using some third-party
>> tooling like Terraform, CloudFormation, Puppet, etc.? If they're using
>> tooling like that, which is going to be a more natural fit: making
>> role/password generation available through CQL, or through Sidecar APIs, or
>> ... ? I don't have an opinion at the moment so that's not a rhetorical
>> question. I'd actually like to reason through what's going to work best for
>> the folks who actually have to manage tons of clusters all day long.
>>
>
> I do not see why we should have a ton of logic / functionality outside of
> Cassandra for doing basic things. I think that Cassandra is notorious
> for its "do it yourself" approach and I think _that_ is the
> primary impediment to broader adoption, not whether we dare to introduce
> CREATE GENERATED ROLE. The focus on usability is completely missed. For a
> lot of things you want to have you have to have "tooling" which you need to
> take care of and so on. People are sick of it. They just want to do the
> thing in the most efficient and time-saving manner.
>
> This isn't an either-or question. I'm not posing "CREATE GENERATED ROLE"
> vs infra-as-code (IAC) support. I'm poking at the best way for the two to
> work together. Because I think/hope that most people who run large clusters
> and/or a lot of clusters (or really a lot of instances of any kind of
> service) use some flavor of IAC. There is a lot more than Cassandra to
> manage: there's the hosts, disk in some form, networking, OS, config, keys,
> schema, etc. If I already have a tool to manage all the infra, it'd be nice
> for Cassandra to play nicely with that tooling so I can do my basic cluster
> set-up via automation as well. That doesn't exclude me from putting
> down my IAC tool and continuing on to do Cassandra configuration in
> Cassandra if I wish ... but in my mind having to jump between tools
> (including cqlsh) to configure different aspects of all the things involved
> in standing up my cluster is not a usability improvement ... especially if
> I have to do it a lot.
>
> So I'm trying to shed some light on the Sidecar and/or CQL debate by
> asking how people are going to be using this functionality "at scale"
> (where efficient and time-saving may look very different from ad hoc use)
> and if there's any benefit to API access via Sidecar vs access via CQL.
>
> (TBH, I'm actually leaning towards your CQL proposal because I think the
> attack surface is actually smaller than it is with letting Sidecar execute
> CQL on the API caller's behalf.)
>
> Thanks -- Joel.
>
>
>
>
> When I was introduced to this community for the first time, maybe around
> 2015-16, I remember that there was somebody on the mailing list complaining
> that "repair should be automatic", "that should be provided", "this should
> be natively in". People have been seeing this for years. It took just 9
> years to finally introduce automatic repairs. Thank god for the repair people
> finally doing that. They are worth their weight in gold. But the response
> back then was: "well, if you need it you need to write it yourself, there is
> no 'one size fits all', you need to take care of that yourself". Just imagine
> that. This was a genuinely meant response. How are we going to make
> this popular if everything beyond the trivial is left to an end user to figure
> out? What sane person is going to put up with that? People just want to turn
> on the thing and not think too much about it anymore.
>
>
>> I don't have strong opinions on CQL vs Sidecar, but I think one way to
>> frame the debate is to look at which will work best with the tooling that
>> people already use to manage large numbers of clusters.
>>
>> Thanks -- Joel.
>> On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
>>
>>
>> Oh crap, what feedback! If nothing else, this is a lesson to
>> everybody: the surest way to get fast feedback, if you are tired
>> of waiting or impatient to move quickly, is to just propose your
>> ideas, then boldly proclaim you are going to do something, and the universe
>> will mysteriously take care of finding somebody who will reject it. Because
>> people are not always interested in agreeing. A lot of times, they take
>> action only when they disagree and are put in front of it. So don't be
>> afraid to take some flak as soon as possible!
>>
>>
>>
>> On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <[email protected]>
>> wrote:
>>
>>> Thanks Mick, I'm just digging into this more after a long week of
>>> travel.
>>>
>>> Generally, I'm -1 for adding more custom syntax. Another concern of mine
>>> is adding control plane actions in DDL. I understand the usefulness of a
>>> feature like this in ops. It's a great idea. Here would be my
>>> counter-proposal:
>>>
>>>  - Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid
>>> making changes to core Cassandra.
>>>
>>
>> Why should we keep it "as is"? Genuinely asking. Why? Where is this need
>> for conserving stuff coming from? Is this what we are doing here? Adding as
>> little as possible? I think we are stifling innovation unnecessarily. There
>> was the same discussion about constraints and CHECK NOT NULL / NOT NULL
>> where we were trying to follow "the Holy Postgres Grail". I just don't get
>> it. Are we not obsessed with that at this point? Literally nobody cares if
>> there will be CREATE GENERATED ROLE. Nobody. Cares. So I do not take this
>> point of yours as valid without some strong backing from your side.
>>
>>
>>>  - Move the generation & policy to the sidecar project. A sidecar
>>> endpoint will generate the role name/password, enforce
>>> prefix/suffix/length requirements, ensure uniqueness, and then return the
>>> role and password (or a secret handle) to the caller.
>>>
>>
>> Well, the problem I see in putting this into Sidecar is that it would
>> only be possible to do via HTTP(S). Not everybody is interested in that.
>> Hardly anybody. Zero interest. Sidecar is at 0.2.0 at this point. I think
>> that, realistically speaking, I am not far from the truth if I say that
>> there is practically nobody using 0.2.0 in production. I do not count
>> exceptions such as early adopters or Analytics.
>>
>> Putting this into Sidecar almost guarantees nobody is going to use this
>> particular functionality. People have their own control planes, their own
>> way of generating this stuff and they are not going to deploy Sidecar just
>> because they want to delegate this task to it. Come on. I think that it
>> would, paradoxically, create more problems for them. Not less. So again, I
>> do not take this point as something which solves anything. This will
>> have 0 users when put in Sidecar. I think it would be better if we just
>> flat out refused this instead of putting it into Sidecar. That is even worse
>> imho.
>>
>> Another problem with Sidecar I see is that the current implementation is
>> pluggable. How do you want to make this pluggable in Sidecar? Pluggable
>> how? People might have their own opinion on how role names should be
>> generated. That is why you can just code your own generator / validator,
>> put it on the class path and be done with it. How are you supposed to
>> "patch Sidecar"? You create a custom implementation, then you put it on the
>> class path of Sidecar? Is this even supported? I think that you have
>> proposed it in good faith but I don't think that would fly.
>>
>>
>>> Why?
>>>  - End users will have it faster since it will work with any version of
>>> Cassandra supporting the CREATE syntax. (No having to backport either)
>>>  - Keeps control plane actions optional and separated. Not an attack
>>> surface inside core Cassandra
>>>
>>
>> Thirdly, what _attack surface_? I think you are pretty aware of the fact
>> that this feature is by default turned off. If you have an organisation
>> deploying hundreds of clusters and for each they have to figure out some
>> role name for a user which is going to use it, how is this going to be
>> abused concretely? There are dedicated accounts for CQL management,
>> creation of a role is tied to some workflow etc. What is attacked exactly
>> and how? Concrete examples please.
>>
>> Dinesh had the concern: "what if we just have a script which will
>> generate roles repeatedly, nonstop?" How is this different from having a
>> script which generates the roles itself, instead of Cassandra, and
>> executes that? What's the difference really? If you want to abuse it you
>> will. There is no protection against that unless we put some rate limiting
>> in front of it, which I have no problem addressing in follow-up work,
>> as already explained.
>>
>>
>>>  - We keep the syntax of CQL more generic and less one-off.
>>>
>>
>> I don't think this is relevant, really. I think we should abandon this
>> mindset. At this point, to make the point, I suspect that CQL had to "hurt
>> you" somehow :)
>>
>> Regards
>>
>>
>>>  - k8s/Cloud native friendly with separation of control plane/data
>>> plane.
>>>
>>> Patrick
>>>
>>>
>>> On Tue, Sep 16, 2025 at 7:31 AM Mick <[email protected]> wrote:
>>>
>>>>
>>>>
>>>>
>>>> > I think enough time passed for everybody to participate in the
>>>> discussion so I would just move on and start the voting thread soon.
>>>>
>>>>
>>>>
>>>> Can we give CEP discussions longer than ~one week, please?
>>>>
>>>> Folk are easily away/offline for a whole week.  Take for example many
>>>> who were at Community over Code and may still be catching up on their
>>>> inbox, thinking dev@ is a less urgent folder.
>>>>
>>>> I haven't looked at how fast the other CEP discuss threads have turned
>>>> around. I apologise if I'm only singling one out; my concern applies
>>>> generally.
>>>>
>>>>
