Hey Omnia, thanks for the KIP! I think this will be a really nice improvement for operators.
DA1: In "Public Interfaces" you say "max.request.pagination.size.limit" controls the max items to return by default. It's not clear to me if this is just a default, or if it is a hard limit. In KIP-966, this config serves as a hard limit to prevent misconfigured or malicious clients from requesting too many resources. Can you clarify this bit? DA2: Is "ItemsLeftToFetch" accounting for authorization? If not, it could be considered a minor info leak. DA3: By splitting up results into pages, we are introducing the possibility of inconsistency into these RPCs. For example, today MetadataRequest returns results from the same MetadataImage, so the response is consistent. With the paging approach, it is possible (likely even) that different requests will be served from different MetadataImage-s, leading to inconsistencies. This can be even worse if paged requests go to different brokers that may be lagging behind in metadata propagation. BTW this issue exists for KIP-966 as well. We don't necessarily need to solve this right away, but I think it's worth mentioning in the KIP. DA4: Have we considered some generic throttling for paged requests? I expect it might be an issue if clients want to get everything and just page through all of the results as quickly as possible. Cheers, David On Tue, Jul 9, 2024 at 8:20 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com> wrote: > Hey, > I will open a voting thread by end of this week if there is no more > feedbacks. > > Thanks > Omnia > > > On 2 Jul 2024, at 17:25, Omnia Ibrahim <o.g.h.ibra...@gmail.com> wrote: > > > > Hi David, Thanks for having a look into this! > > > >> DJ1: Why do we want to migrate only a subset of the APIs vs migrating > all > >> of them? For instance, there are DescribeGroups, ConsumerGroupDescribe, > >> etc. Do we have reasons not to migrate them too? I think that it would > be > >> great to have a KIP that establishes the pattern for all the admin APIs. > > > > No particular reason I only was trying to focus on the requests that > cause the most pain ( as far as I know with Kafka). I don’t mind expand it > to other requests in AdminClient as well. > > The other thing I was wondering about if > max.request.partition.size.limit still make sense or should it be renamed > (or have another one called max.request.pagingation.size.limit) if we going > to expand to cover all Admin requests. I am leaning toward > max.request.pagingation.size.limit and max.request.partition.size.limit can > be set for now to max.request.partition.size.limit by default. > > > > I have updated the KIP now to cover more requests please have a second > look and let me know if I missed any? > > > >> DJ2: I am not a fan of all the new parameters passed to the tools > >> (e.g. --partition-size-limit-per-response). I wonder if we could rather > >> have a config in the admin client to set the default page size for all > >> requests. > > > > If it going to be pattern to all requests I agree it should be an admin > config then. > > Updated the KIP with this now. > > > >> DJ3: I assume that the Admin client will transparently handle the > change. > >> It would be great to call it out in the compatibility section. > > > > Updated, Old clients will still have the old behaviour of no pagingation > while all new versions will support this. > > > > Thanks > > Omnia > > > >> On 2 Jul 2024, at 10:43, David Jacot <dja...@confluent.io.INVALID> > wrote: > >> > >> Hi Omnia, > >> > >> Thanks for the KIP. I agree that we should migrate admin APIs to the new > >> pattern. > >> > >> DJ1: Why do we want to migrate only a subset of the APIs vs migrating > all > >> of them? For instance, there are DescribeGroups, ConsumerGroupDescribe, > >> etc. Do we have reasons not to migrate them too? I think that it would > be > >> great to have a KIP that establishes the pattern for all the admin APIs. > >> > >> DJ2: I am not a fan of all the new parameters passed to the tools > >> (e.g. --partition-size-limit-per-response). I wonder if we could rather > >> have a config in the admin client to set the default page size for all > >> requests. > >> > >> DJ3: I assume that the Admin client will transparently handle the > change. > >> It would be great to call it out in the compatibility section. > >> > >> Thanks, > >> David > >> > >> On Tue, Jul 2, 2024 at 11:17 AM Andrew Schofield < > andrew_schofi...@live.com> > >> wrote: > >> > >>> Hi, > >>> Thanks for the response. Makes sense to me. Just one additional > comment: > >>> > >>> AS5: The cursor for ListGroupsResponse is called `TransactionalCursor` > >>> which > >>> seems like a copy-paste mistake. > >>> > >>> Thanks, > >>> Andrew > >>> > >>>> On 30 Jun 2024, at 22:28, Omnia Ibrahim <o.g.h.ibra...@gmail.com> > wrote: > >>>> > >>>> Hi Andrew thanks for having a look into the KIP > >>>> > >>>>> AS1: Besides topics, the most numerous resources in Kafka clusters in > >>> my experience > >>>>> are consumer groups. Would it be possible to extend the KIP to cover > >>> ListGroups while > >>>>> you’re in here? I’ve heard of clusters with truly vast numbers of > >>> groups. This is also > >>>>> potentially a sign of a misbehaving or poorly written clients. > Getting > >>> a page of groups > >>>>> with a massive ItemsLeftToFetch would be nice. > >>>> Yes, I also had few experiences with large cluster where to list > >>> consumer groups can take up to 5min. I update the KIP to include this > as > >>> well. > >>>> > >>>>> AS2: A tiny nit: The versions for the added fields are incorrect in > >>> some cases. > >>>> I believe I fixed all of them now > >>>> > >>>>> AS3: I don’t quite understand the cursor for > >>> OffsetFetchRequest/Response. > >>>>> It looks like the cursor is (topic, partition), but not group ID. > Does > >>> the cursor > >>>>> apply to all groups in the request, or is group ID missing? > >>>> > >>>> I was thinking that the last one in the response will be the one that > >>> has the cursor while the rest will have null. But if we are moving > >>> NextCursour to the top level of the response then the cursor will need > >>> groupID. > >>>>> AS4: For the remaining request/response pairs, the cursor makes sense > >>> to me, > >>>>> but I do wonder whether `NextCursor` should be at the top level of > the > >>> responses > >>>>> instead, like DescribeTopicPartitionsResponse. > >>>> > >>>> Updates the KIP to reflect this now. > >>>> > >>>> Let me know if you have any more feedback on this. > >>>> > >>>> Best > >>>> Omnia > >>>> > >>>>> On 27 Jun 2024, at 17:53, Andrew Schofield < > andrew_schofi...@live.com> > >>> wrote: > >>>>> > >>>>> Hi Omnia, > >>>>> Thanks for the KIP. This is a really nice improvement for > administering > >>> large clusters. > >>>>> > >>>>> AS1: Besides topics, the most numerous resources in Kafka clusters in > >>> my experience > >>>>> are consumer groups. Would it be possible to extend the KIP to cover > >>> ListGroups while > >>>>> you’re in here? I’ve heard of clusters with truly vast numbers of > >>> groups. This is also > >>>>> potentially a sign of a misbehaving or poorly written clients. > Getting > >>> a page of groups > >>>>> with a massive ItemsLeftToFetch would be nice. > >>>>> > >>>>> AS2: A tiny nit: The versions for the added fields are incorrect in > >>> some cases. > >>>>> > >>>>> AS3: I don’t quite understand the cursor for > >>> OffsetFetchRequest/Response. > >>>>> It looks like the cursor is (topic, partition), but not group ID. > Does > >>> the cursor > >>>>> apply to all groups in the request, or is group ID missing? > >>>>> > >>>>> AS4: For the remaining request/response pairs, the cursor makes sense > >>> to me, > >>>>> but I do wonder whether `NextCursor` should be at the top level of > the > >>> responses > >>>>> instead, like DescribeTopicPartitionsResponse. > >>>>> > >>>>> Thanks, > >>>>> Andrew > >>>>> > >>>>>> On 27 Jun 2024, at 14:05, Omnia Ibrahim <o.g.h.ibra...@gmail.com> > >>> wrote: > >>>>>> > >>>>>> Hi everyone, I would like to start a discussion thread for KIP-1062 > >>>>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1062%3A+Introduce+Pagination+for+some+requests+used+by+Admin+API > >>>>>> > >>>>>> > >>>>>> Thanks > >>>>>> Omnia > >>> > >>> > >>> > > > > -- David Arthur