Hello David, For (3):
* I suppose that we could add a `generation` field to the JoinGroupRequest instead to do the fencing that you describe while handling the sentinel in the assignor directly. If we would add the `generation` to the request itself, would we need the `generation` in the subscription protocol as well?* On second thought, I think this is not better than adding `generation` field in the subscription protocol, because I think we don't have to do any generation validation on joinGroup request. The purpose of `joinGroupRequest` is to accept any members to join this group, even if the member is new or ever joined or what. As long as we have the generationId in the subscription metadata, the consumer lead can leverage the info to ignore the old ownedPartitions (or do other handling), and the rebalance can still complete successfully and correctly. On the other hand, if we did the generation check on JoinGroupRequest, and return `ILLEGAL_GENERATION` back to consumer, the consumer needs to clear its generation info and rejoin the group to continue the rebalance. It needs more request/response network and slow down the rebalance. So I think we should add the `generationId` field into Subscription protocol to achieve what we want. Thank you. Luke On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <show...@gmail.com> wrote: > Hi David, > Thanks for your feedback. > > I've updated the KIP for your comments (1)(2). > For (3), it's a good point! Yes, we didn't deserialize the subscription > metadata on broker side, and it's not necessary to add overhead on broker > side. And, yes, I think we can fix the original issue by adding a > "generation" field into `JoinGroupRequest` instead, and also add a field > into `JoinGroupResponse` in `JoinGroupResponseMember` field. That way, the > broker can identify the old member from `JoinGroupRequest`. And the > assignor can also get the "generation" info via the `Subscription` instance. > > I'll update the KIP to add "generation" field into `JoinGroupRequest` and > `JoinGroupResponse`, if there is no other options. > > Thank you. > Luke > > > On Tue, Nov 16, 2021 at 12:31 AM David Jacot <dja...@confluent.io.invalid> > wrote: > >> Hi Luke, >> >> Thanks for the KIP. Overall, I think that the motivation makes sense. I >> have a couple of comments/questions: >> >> 1. In the Public Interfaces section, it would be great if you could put >> the >> end state not the current one. >> >> 2. Do we need to update the Subscription class to expose the >> generation? If so, it would be great to mention it in the Public >> Interfaces section as well. >> >> 3. You mention that the broker will set the generation if the subscription >> contains a sentinel value (-1). As of today, the broker does not parse >> the subscription so I am not sure how/why we would do this. I suppose >> that we could add a `generation` field to the JoinGroupRequest instead >> to do the fencing that you describe while handling the sentinel in the >> assignor directly. If we would add the `generation` to the request itself, >> would we need the `generation` in the subscription protocol as well? >> >> Best, >> David >> >> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <show...@gmail.com> wrote: >> > >> > Hi all, >> > >> > I'd like to start the discussion for KIP-792: Add "generation" field >> into >> > consumer protocol. >> > >> > The goal of this KIP is to allow assignor/consumer coordinator/group >> > coordinator to have a way to identify the out-of-date >> members/assignments. >> > >> > Detailed description can be found here: >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614 >> > >> > Any feedback is welcome. >> > >> > Thank you. >> > Luke >> >