bq. describe topics by a regular expression on the server side

Should caution be taken if the regex doesn't filter anything ("*" or ".*")?
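
To make the question concrete, here is a minimal sketch of one possible guard: reject a pattern once it matches more topics than some cap. Everything in it is an assumption for illustration only. `filterTopics`, `TooManyMatchesException` and the cap are hypothetical names, not part of Kafka or the KIP, and it uses `java.util.regex` only to stay dependency-free (the thread suggests re2 instead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TopicFilterSketch {

    /** Hypothetical error: the pattern matched more topics than the cap allows. */
    static class TooManyMatchesException extends RuntimeException {
        TooManyMatchesException(String message) {
            super(message);
        }
    }

    /**
     * Hypothetical helper (not a Kafka API): returns the topics whose full name
     * matches the regex, and fails once more than maxMatches topics match.
     */
    static List<String> filterTopics(List<String> allTopics, String regex, int maxMatches) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String topic : allTopics) {
            if (pattern.matcher(topic).matches()) {
                if (matched.size() == maxMatches) {
                    throw new TooManyMatchesException(
                        "pattern '" + regex + "' matches more than " + maxMatches + " topics");
                }
                matched.add(topic);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> topics = Arrays.asList("orders", "orders-retry", "payments", "audit-log");

        // A selective pattern returns a small subset.
        System.out.println(filterTopics(topics, "orders.*", 10)); // prints [orders, orders-retry]

        // A pattern that filters nothing hits the cap and is rejected.
        try {
            filterTopics(topics, ".*", 3);
        } catch (TooManyMatchesException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether the right response is an error, a truncated result, or pagination is an open design question; the sketch only shows where such a check could sit.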

Cheers

On Fri, Jul 13, 2018 at 6:02 PM Colin McCabe <cmcc...@apache.org> wrote:

> As Jason wrote, this won't scale as the number of partitions increases.
> We already have users who have tens of thousands of topics, or more.  If
> you multiply that by 100x over the next few years, you end up with this API
> returning full information about millions of topics, which clearly doesn't
> work.
>
> We discussed this a lot in the original KIP-117 DISCUSS thread which added
> the Java AdminClient.  ListTopics and DescribeTopics were deliberately kept
> separate because we understood that eventually a single RPC would not be
> able to return information about all the topics in the cluster.  So I have
> to vote -1 for this proposal as it stands.
>
> I do agree that adding a way to describe topics by a regular expression on
> the server side would be very useful.  This would also fix a major
> scalability problem we have now, which is that when subscribing via a
> regular expression, clients need to fetch the full list of all topics in
> the cluster and filter locally.
>
> I think a regular expression library like re2 would be ideal for this
> purpose.  re2 is standardized and language-agnostic (it's not tied only to
> Java).  In contrast, Java regular expressions change with different releases
> of the JDK (there were some changes in Java 8, for example).  Also, re2
> regular expressions are linear time, never exponential time.  See
> https://github.com/google/re2j
>
> regards,
> Colin
>
>
> On Fri, Jul 13, 2018, at 05:00, Andras Beni wrote:
> > The KIP looks good to me.
> > However, if there is willingness in the community to work on metadata
> > request with patterns, the feature proposed here and filtering by '*' or
> > '.*' would be redundant.
> >
> > Andras
> >
> >
> >
> > On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson <ja...@confluent.io> wrote:
> >
> > > Hey Manikumar,
> > >
> > > As Kafka begins to scale to larger and larger numbers of topics/partitions,
> > > I'm a little concerned about the scalability of APIs such as this. The API
> > > looks benign, but imagine you have a few million partitions. We already
> > > expose similar APIs in the producer and consumer, so there is probably not
> > > much additional harm in exposing it in the AdminClient, but it would be
> > > nice to put a little thought into some longer term options. We should be
> > > giving users an efficient way to select a smaller set of the topics they
> > > are interested in. We have always discussed adding some filtering support
> > > to the Metadata API. Perhaps now is a good time to reconsider this? We now
> > > have a convention for wildcard ACLs, so perhaps we can do something
> > > similar. Full regex support might be ideal given the consumer's
> > > subscription API, but that is more challenging. What do you think?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Thu, Jul 12, 2018 at 2:35 PM, Harsha <ka...@harsha.io> wrote:
> > >
> > > > Very useful. LGTM.
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > > > Hi all,
> > > > >
> > > > > > I have created a KIP to add describe all topics API to AdminClient.
> > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-327%3A+Add+describe+all+topics+API+to+AdminClient
> > > > >
> > > > > Please take a look.
> > > > >
> > > > > Thanks,
> > > >
> > >
>
