Hi Murali,

Thanks for the KIP!

I think I understand the motivation for this KIP in situations where
there are a "cross product" of topics for two or more variables X and
Y, and want to write ACLs for each of the variable axes.
If you format your topics "X-Y-suffix", it's not easy to write rules
that apply to all "Y" topics, because you need to enumerate all of the
"X" values, and the problem persists even if you reorder the topic
name.

In my recent work on KIP-986 I found it necessary to introduce
"namespaces" to group topics together, and I was going to replicate
the ACL system to specify those namespaces. This change to the ACL
system could increase the expressiveness and complexity of that
feature, if it is ever implemented.
One of the primitives I needed when specifying namespaces was the
ability to tell when two namespaces overlapped (i.e. does there exist
any topic which is present in both namespaces). This is trivial to do
with the current PREFIX and LITERAL system, as we can find the
maximum-length common prefix with just some length comparisons and
equality checks.
I considered specifying namespaces via regular expressions, and found
that it was computationally much more difficult. Computing the
intersection of two regexes appears to be exponential in the length of
the regexes, leading me to avoid adding it.

I understand that you're not suggesting full REGEX support, and that
"namespaces" don't need to support MATCH, but I think MATCH may run
into related difficulties. Any MATCH can overlap with any other MATCH
or PREFIX if it includes a sufficient number of wildcards. For
example:
MATCH *-accounts-* has overlap with PREFIX nl as they can both match
"nl-accounts-localtopic", but that isn't sensitive to the contents
"nl", it is true for any PREFIX.
MATCH *-accounts-* has overlap with MATCH *localtopic, as they can
both match "nl-accounts-localtopic", but that isn't actually sensitive
to the contents "localtopic", it's true for any MATCH which includes a
wildcard at the beginning.

This has implications for execution complexity: If we can't compute
whether two patterns overlap, then we need to run both of them on each
piece of input to test if they both match. Under the current
LITERAL/PREFIX system, we can optimize execution with a trie, but that
option wouldn't be available to us with MATCH.

The current system makes users evaluate a trade-off:
1. Optimize the number of ACLs by organizing topics according to
prefixes (for example, "accounts-localtopic-nl" and PREFIX "accounts",
PREFIX "accounts-localtopic")
2. Use less-structured topic names, with a corresponding ACL scheme
that has more individual rules.
The system currently informs users of this tradeoff by making them
write multiple ACLs, and making them think "there has got to be a
better way!". Perhaps we can find a better way to surface this best
practice, or better inform users about it.

I understand that there are going to be situations more complex than
your example, where multiple individual rules will always be necessary
with only PREFIX evaluation. I think even in those situations, a
number of efficient-to-evaluate rules is preferable to just one
expensive-to-evaluate rule.

One alternative that I thought of could be "PARAMETERIZED" ACLs which
are like PREFIXED, but allow some parameter substitution. For example
PARAMETERIZED "(nl|de|cz)-accounts-". I'm lifting regex syntax here,
but this isn't actually a regex, and wouldn't allow arbitrary numbers
of characters, or the * or + operators.
In the background it could evaluate exactly like the 3 individual
PREFIX rules, but be easier to evaluate on the backend, and support
the intersection query I mentioned earlier. It could also support
[a-zA-Z] notation in case the parameter values aren't known ahead of
time, but have a fixed length.

Thanks,
Greg

On Mon, May 6, 2024 at 11:17 AM Claude Warren <cla...@xenei.com> wrote:
>
> I have an idea for how to reduce the time for ACL lookups in general and
> particularly where wildcards are involved using sequence
> characterization techniques from bioinformatics.  But I need a set of ACL
> patterns and associated topics to test with.
>
> On Fri, May 3, 2024 at 2:45 PM Haruki Okada <ocadar...@gmail.com> wrote:
>
> > Hi, Murali.
> >
> > First, could you add the KIP-1042 to the index (
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> > )
> > as well so that everyone can find it easily?
> >
> > I took a look at the KIP, then I have 2 questions:
> >
> > 1. Though the new MATCH resource pattern type may reduce the effort of
> > adding ACLs in some cases, do you have any concrete use case you are in
> > mind? (When prefixed ACL was introduced in KIP-290, there was a use-case
> > that using it for implementing multi-tenancy)
> >
> > 2. As you may know, ACL lookup is in the hot-path which the performance is
> > very important. (
> >
> > https://github.com/apache/kafka/blob/240243b91d69c2b65b5e456065fdcce90c121b04/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L539
> > ).
> > Do you have ideas how do we update `matchingAcls` to support MATCH-type ACL
> > without introducing performance issue?
> >
> >
> > Thanks,
> >
> > 2024年5月3日(金) 19:51 Claude Warren, Jr <claude.war...@aiven.io.invalid>:
> >
> > > As I wrote in [1], the ACL evaluation algorithm needs to be specified
> > with
> > > respect to the specificity of the pattern so that we know exactly which
> > if
> > > *-accounts-* takes precedence over nl-accounts-* or visa versa.
> > >
> > > I think that we should spell out that precedence is evaluated as follows:
> > >
> > > 1. Remove patterns that do not match
> > > 2. More specific patterns take precedence over less specific patterns
> > > 3. for patterns of the same precedence DENY overrides ALLOW
> > >
> > > Determining specificity:
> > >
> > > Specificity is based on the Levenshtein distance between the pattern and
> > > the text being evaluated. The lower the distance the more specific the
> > > rule.
> > > Using the topic name: nl-accounts-localtopic we can evaluate the
> > > Levenshtein distance for various patterns.
> > > nl-accounts-localtopic = 0
> > > *-accounts-localtopic = 2
> > > nl-accounts-local* = 5
> > > *-accounts-local* = 7
> > > nl-accounts-* = 10
> > > *-accounts-* = 12
> > >
> > > In the special case of matching principles User matches are more specific
> > > than Group matches.
> > >
> > > I don't know if this should be added to KIP-1042 or presented as a new
> > KIP.
> > >
> > > Claude
> > >
> > > [1] https://lists.apache.org/thread/0l88tkbxq3ol9rnx0ljnmswj5y6pho1f
> > > <
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1042%3A+Support+for+wildcard+when+creating+new+acls
> > > >
> > >
> > > On Fri, May 3, 2024 at 12:18 PM Claude Warren <cla...@xenei.com> wrote:
> > >
> > > > Took me awhile to find it but the link to the KIP is
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1042%3A+Support+for+wildcard+when+creating+new+acls
> > > >
> > > > On Fri, May 3, 2024 at 10:13 AM Murali Basani <murali.bas...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'd like to propose a suggestion to our resource patterns in Kafka
> > > ACLs.
> > > > >
> > > > > Currently, when adding new ACLs in Kafka, we have two types of
> > resource
> > > > > patterns for topics:
> > > > >
> > > > >    - LITERAL
> > > > >    - PREFIXED
> > > > >
> > > > > However, when it comes to listing or removing ACLs, we have a couple
> > > more
> > > > > options:
> > > > >
> > > > >    - MATCH
> > > > >    - ANY (will match any pattern type)
> > > > >
> > > > >
> > > > > If we can extend creating acls as well with 'MATCH' pattern type, it
> > > > would
> > > > > be very beneficial. Even though this kind of acl should be created
> > with
> > > > > utmost care, it will help organizations streamline their ACL
> > management
> > > > > processes.
> > > > >
> > > > > Example scenarios :
> > > > >
> > > > > Let's say we need to create ACLs for the following six topics:
> > > > > nl-accounts-localtopic, nl-accounts-remotetopic,
> > > de-accounts-localtopic,
> > > > > de-accounts-remotetopic, cz-accounts-localtopic,
> > > cz-accounts-remotetopic
> > > > >
> > > > > Currently, we achieve this using existing functionality by creating
> > > three
> > > > > prefixed ACLs as shown below:
> > > > >
> > > > > kafka-acls --bootstrap-server localhost:9092 \
> > > > > > --add \
> > > > > > --allow-principal
> > > > > >
> > > > >
> > > >
> > >
> > User:CN=serviceuser,OU=ServiceUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> > > > > > \
> > > > > > --producer \
> > > > > > --topic nl-accounts- \
> > > > > > --resource-pattern-type prefixed
> > > > >
> > > > >
> > > > > kafka-acls --bootstrap-server localhost:9092 \
> > > > > > --add \
> > > > > > --allow-principal
> > > > > >
> > > > >
> > > >
> > >
> > User:CN=serviceuser,OU=ServiceUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> > > > > > \
> > > > > > --producer \
> > > > > > --topic de-accounts- \
> > > > > > --resource-pattern-type prefixed
> > > > >
> > > > >
> > > > > kafka-acls --bootstrap-server localhost:9092 \
> > > > > > --add \
> > > > > > --allow-principal
> > > > > >
> > > > >
> > > >
> > >
> > User:CN=serviceuser,OU=ServiceUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> > > > > > \
> > > > > > --producer \
> > > > > > --topic cz-accounts- \
> > > > > > --resource-pattern-type prefixed
> > > > >
> > > > >
> > > > > However, if we had the 'MATCH' pattern type available, we could
> > > > accomplish
> > > > > this with a single ACL, as illustrated here:
> > > > >
> > > > > kafka-acls --bootstrap-server localhost:9092 \
> > > > > > --add \
> > > > > > --allow-principal
> > > > > >
> > > > >
> > > >
> > >
> > User:CN=serviceuser,OU=ServiceUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> > > > > > \
> > > > > > --producer \
> > > > > > --topic *-accounts-* \
> > > > > > --resource-pattern-type match
> > > > >
> > > > >
> > > > >
> > > > > This pattern closely resembles PREFIXED but offers broader allow/deny
> > > > > rules.
> > > > >
> > > > > Implementing this change could significantly reduce the effort in
> > > several
> > > > > acl management processes.
> > > > >
> > > > > I welcome your thoughts and any concerns you may have regarding this
> > > > > proposal.
> > > > >
> > > > > Thanks,
> > > > > Murali
> > > > >
> > > >
> > > >
> > > > --
> > > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > > >
> > >
> >
> >
> > --
> > ========================
> > Okada Haruki
> > ocadar...@gmail.com
> > ========================
> >
>
>
> --
> LinkedIn: http://www.linkedin.com/in/claudewarren

Reply via email to