Hi David,

Thanks for the KIP. I just had minor observations:

1) In the Assignment Error section in Client Side mode Assignment process,
you mentioned => `In this case, the client side assignor can return an
error to the group coordinator`. In this case are you referring to the
Assignor returning an AssignmentError that's listed down towards the end?
If yes, do you think it would make sense to mention this explicitly here?

2) In the Case Studies section, I have a slight confusion, not sure if
others have the same. Consider this step:

When B heartbeats, the group coordinator transitions him to epoch 3 because
B has no partitions to revoke. It persists the change and reply.

   - Group (epoch=3)
      - A
      - B
      - C
   - Target Assignment (epoch=3)
      - A - partitions=[foo-0]
      - B - partitions=[foo-2]
      - C - partitions=[foo-1]
   - Member Assignment
      - A - epoch=2, partitions=[foo-0, foo-1]
      - B - epoch=3, partitions=[foo-2]
      - C - epoch=3, partitions=[foo-1]

When C heartbeats, it transitions to epoch 3 but cannot get foo-1 yet.

Here,it's mentioned that member C can't get the foo-1 partition yet, but
based on the description above, it seems it already has it. Do you think it
would be better to remove it and populate it only when it actually gets it?
I see this in a lot of other places, so have I understood it incorrectly ?


Regarding connect , it might be out of scope of this discussion, but from
what I understood it would probably be running in client side assignor mode
even on the new rebalance protocol as it has its own Custom Assignors(Eager
and IncrementalCooperative).

Thanks!

Sagar.






On Fri, Jul 15, 2022 at 5:00 PM David Jacot <dja...@confluent.io.invalid>
wrote:

> Thanks Hector! Our goal is to move forward with specialized API
> instead of relying on one generic API. For Connect, we can apply the
> exact same pattern and reuse/share the core implementation on the
> server side. For the schema registry, I think that we should consider
> having a tailored API to do simple membership/leader election.
>
> Best,
> David
>
> On Fri, Jul 15, 2022 at 10:22 AM Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > Three quick comments:
> >
> > 1. Regarding java.util.regex.Pattern vs com.google.re2j.Pattern, we
> should
> > document the differences in more detail before deciding one way or
> another.
> > That said, if people pass java.util.regex.Pattern, they expect their
> > semantics to be honored. If we are doing something different, then we
> > should consider adding an overload with our own Pattern class (I don't
> > think we'd want to expose re2j's at this point).
> > 2. Regarding topic ids, any major new protocol should integrate fully
> with
> > it and should handle the topic recreation case correctly. That's the main
> > part we need to handle. I agree with David that we'd want to add topic
> ids
> > to the relevant protocols that don't have it yet and that we can probably
> > focus on the internals versus adding new APIs to the Java Consumer
> (unless
> > we find that adding new APIs is required for reasonable semantics).
> > 3. I am still not sure about the coordinator storing the configs. It's
> > powerful for configs to be centralized in the metadata log for various
> > reasons (auditability, visibility, consistency, etc.). Similarly, I am
> not
> > sure about automatically deleting configs in a way that they cannot be
> > recovered. A good property for modern systems is to minimize the number
> of
> > unrecoverable data loss scenarios.
> >
> > Ismael
> >
> > On Wed, Jul 13, 2022 at 3:47 PM David Jacot <dja...@confluent.io.invalid
> >
> > wrote:
> >
> > > Thanks Guozhang. My answers are below:
> > >
> > > > 1) the migration path, especially the last step when clients flip the
> > > flag
> > > > to enable the new protocol, in which we would have a window where
> both
> > > new
> > > > protocols / rpcs and old protocols / rpcs are used by members of the
> same
> > > > group. How the coordinator could "mimic" the old behavior while
> using the
> > > > new protocol is something we need to present about.
> > >
> > > Noted. I just published a new version of KIP which includes more
> > > details about this. See the "Supporting Online Consumer Group Upgrade"
> > > and the "Compatibility, Deprecation, and Migration Plan". I think that
> > > I have to think through a few cases now but the overall idea and
> > > mechanism should be understandable.
> > >
> > > > 2) the usage of topic ids. So far as KIP-516 the topic ids are only
> used
> > > as
> > > > part of RPCs and admin client, but they are not exposed via any
> public
> > > APIs
> > > > to consumers yet. I think the question is, first should we let the
> > > consumer
> > > > client to be maintaining the names -> ids mapping itself to fully
> > > leverage
> > > > on all the augmented existing RPCs and the new RPCs with the topic
> ids;
> > > and
> > > > secondly, should we ever consider exposing the topic ids in the
> consumer
> > > > public APIs as well (both subscribe/assign, as well as in the
> rebalance
> > > > listener for cases like topic deletion-and-recreation).
> > >
> > > a) Assuming that we would include converting all the offsets related
> > > RPCs to using topic ids in this KIP, the consumer would be able to
> > > fully operate with topic ids. That being said, it still has to provide
> > > the topics names in various APIs so having a mapping in the consumer
> > > seems inevitable to me.
> > > b) I don't have a strong opinion on this. Here I wonder if this goes
> > > beyond the scope of this KIP. I would rather focus on the internals
> > > here and we can consider this separately if we see value in doing it.
> > >
> > > Coming back to Ismael's point about using topic ids in the
> > > ConsumerGroupHeartbeatRequest, I think that there is one advantage in
> > > favour of it. The consumer will have the opportunity to validate that
> > > the topics exists before passing them into the group rebalance
> > > protocol. Obviously, the coordinator will also notice it but it does
> > > not really have a way to reject an invalid topic in the response.
> > >
> > > > I'm agreeing with David on all other minor questions except for the
> > > > `subscribe(Pattern)` question: personally I think it's not necessary
> to
> > > > deprecate the subscribe API with Pattern, but instead we still use
> > > Pattern
> > > > while just documenting that our subscription may be rejected by the
> > > server.
> > > > Since the incompatible case is a very rare scenario I felt using an
> > > > overloaded `String` based subscription may be more vulnerable to
> various
> > > > invalid regexes.
> > >
> > > That could work. I have to look at the differences between the two
> > > engines to better understand the potential issues. My understanding is
> > > that would work for all the basic regular expressions. The differences
> > > between the two are mainly about the various character classes. I
> > > wonder what other people think about this.
> > >
> > > Best,
> > > David
> > >
> > > On Tue, Jul 12, 2022 at 11:28 PM Guozhang Wang <wangg...@gmail.com>
> wrote:
> > > >
> > > > Thanks David! I think on the high level there are two meta points we
> need
> > > > to concretize a bit more:
> > > >
> > > > 1) the migration path, especially the last step when clients flip the
> > > flag
> > > > to enable the new protocol, in which we would have a window where
> both
> > > new
> > > > protocols / rpcs and old protocols / rpcs are used by members of the
> same
> > > > group. How the coordinator could "mimic" the old behavior while
> using the
> > > > new protocol is something we need to present about.
> > > > 2) the usage of topic ids. So far as KIP-516 the topic ids are only
> used
> > > as
> > > > part of RPCs and admin client, but they are not exposed via any
> public
> > > APIs
> > > > to consumers yet. I think the question is, first should we let the
> > > consumer
> > > > client to be maintaining the names -> ids mapping itself to fully
> > > leverage
> > > > on all the augmented existing RPCs and the new RPCs with the topic
> ids;
> > > and
> > > > secondly, should we ever consider exposing the topic ids in the
> consumer
> > > > public APIs as well (both subscribe/assign, as well as in the
> rebalance
> > > > listener for cases like topic deletion-and-recreation).
> > > >
> > > > I'm agreeing with David on all other minor questions except for the
> > > > `subscribe(Pattern)` question: personally I think it's not necessary
> to
> > > > deprecate the subscribe API with Pattern, but instead we still use
> > > Pattern
> > > > while just documenting that our subscription may be rejected by the
> > > server.
> > > > Since the incompatible case is a very rare scenario I felt using an
> > > > overloaded `String` based subscription may be more vulnerable to
> various
> > > > invalid regexes.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Tue, Jul 12, 2022 at 5:23 AM David Jacot
> <dja...@confluent.io.invalid
> > > >
> > > > wrote:
> > > >
> > > > > Hi Ismael,
> > > > >
> > > > > Thanks for your feedback. Let me answer your questions inline.
> > > > >
> > > > > > 1. I think it's premature to talk about target versions for
> > > deprecation
> > > > > and
> > > > > > removal of the existing group protocol. Unlike KRaft, this
> affects a
> > > core
> > > > > > client protocol and hence deprecation/removal will be heavily
> > > dependent
> > > > > on
> > > > > > how quickly applications migrate to the new protocol.
> > > > >
> > > > > That makes sense. I will remove it.
> > > > >
> > > > > > 2. The KIP says we intend to release this in 4.x, but it wasn't
> made
> > > > > clear
> > > > > > why. If we added that as a way to estimate when we'd deprecate
> and
> > > remove
> > > > > > the group protocol, I also suggest removing this part.
> > > > >
> > > > > Let me explain my reasoning. As explained, I plan to rewrite the
> group
> > > > > coordinator in Java while we implement the new protocol. This means
> > > > > that the internals will be slightly different (e.g. threading
> model).
> > > > > Therefore, I wanted to tighten the switch from the old group
> > > > > coordinator to the new group coordinator to a major release. The
> > > > > alternative would be to use a flag to do the switch instead of
> relying
> > > > > on the software upgrade.
> > > > >
> > > > > > 3. We need to flesh out the details of the migration story. It
> sounds
> > > > > like
> > > > > > we're saying we will support online migrations. Is that correct?
> We
> > > > > should
> > > > > > explain this in detail. It could also be done as a separate KIP,
> if
> > > it's
> > > > > > easier.
> > > > >
> > > > > Yes, we will support online migrations for the group. That means
> that
> > > > > a group using the old protocol will be able to switch to the new
> > > > > protocol.
> > > > >
> > > > > Let me briefly explain how that will work though. It is basically a
> > > > > four step process:
> > > > >
> > > > > 1. The cluster must be upgraded or rolled to a software supporting
> the
> > > > > new group coordinator. Both the old and the new coordinator will
> > > > > support the old protocol and rely on the same persisted metadata so
> > > > > they can work together. This point is an offline migration. We
> cannot
> > > > > do this one live because it would require shutting down the current
> > > > > coordinator and starting up the new one and that would cause
> > > > > unavailabilities.
> > > > > 2. The cluster's metadata version/IBP must be upgraded to X in
> order
> > > > > to enable the new protocol. This cannot be done before 1) is
> > > > > terminated because the old coordinator doesn't support the new
> > > > > protocol.
> > > > > 3. The consumers must be upgraded to a version supporting the
> online
> > > > > migration (must have KIP-792). If the consumer is already there.
> > > > > Nothing must be done at this point.
> > > > > 4. The consumers must be rolled with the feature flag turned on.
> The
> > > > > consumer group is automatically converted when the first consumer
> > > > > using the new protocol joins the group. While the members using the
> > > > > old protocol are being upgraded, the old protocol is proxied into
> the
> > > > > new one.
> > > > >
> > > > > Let me clarify all of this in the KIP.
> > > > >
> > > > > > 4. I am happy that we are pushing the pattern subscriptions to
> the
> > > > > server,
> > > > > > but it seems like there could be some tricky compatibility
> issues.
> > > Will
> > > > > we
> > > > > > have a mechanism for users to detect that they need to update
> their
> > > regex
> > > > > > before switching to the new protocol?
> > > > >
> > > > > I think that I am a bit more optimistic than you on this point. I
> > > > > believe that the majority of the cases are simple regexes which
> should
> > > > > work with the new engine. The coordinator will verify the regex
> anyway
> > > > > and reject the consumer if the regex is not valid. Coming back to
> the
> > > > > migration path, in the worst case, the first upgraded consumer
> joining
> > > > > the group will be rejected. This should be used as the last
> defence, I
> > > > > would say.
> > > > >
> > > > > One way for customers to validate their regex before upgrading
> their
> > > > > prod would be to test them with another group. For instance, that
> > > > > could be done in a pre-prod environment. Another way would be to
> > > > > extend the consumer-group tool to provide a regex validation
> > > > > mechanism. Would this be enough in your opinion?
> > > > >
> > > > > > 5. Related to the last question, will the Java client allow the
> > > users to
> > > > > > stick with the current regex engine for compatibility reasons?
> For
> > > > > example,
> > > > > > it may be handy to keep using client based regex at first to keep
> > > > > > migrations simple and then migrate to server based regexes as a
> > > second
> > > > > step.
> > > > >
> > > > > I understand your point but I am concerned that this would allow
> users
> > > > > to actually stay in this mode. That would go against our goal of
> > > > > simplifying the client because we would have to continue monitoring
> > > > > the metadata on the client side. I would rather not do this.
> > > > >
> > > > > > 6. When we say that the group coordinator will be responsible for
> > > storing
> > > > > > the configurations and that the configurations will be deleted
> when
> > > the
> > > > > > group is deleted. Will a transition to DEAD trigger deletion of
> > > > > > configurations?
> > > > >
> > > > > That's right. The configurations will be deleted when the group is
> > > > > deleted. They go together.
> > > > >
> > > > > > 7. Will the choice to store the configs in the group coordinator
> > > make it
> > > > > > harder to list all cluster configs and their values?
> > > > >
> > > > > I don't think so. The group configurations are overrides of cluster
> > > > > configs. If you want to know all the overrides though, you would
> have
> > > > > to ask all the group coordinators. You cannot rely on the metadata
> log
> > > > > for instance.
> > > > >
> > > > > > 8. How would someone configure a group before starting the
> consumers?
> > > > > Have
> > > > > > we considered allowing the explicit creation of groups?
> > > Alternatively,
> > > > > the
> > > > > > configs could be decoupled from the group lifecycle.
> > > > >
> > > > > Yes. The group will be automatically created in this case. However,
> > > > > the configs will be lost after the retention period of the group
> > > > > passes.
> > > > >
> > > > > > 9. Will the Consumer.subscribe method for the Java client still
> take
> > > a
> > > > > > `java.util.regex.Pattern` of do we have to introduce an overload?
> > > > >
> > > > > That's a very group question. I forgot about that one. As the
> > > > > `java.util.regex.Pattern` is not fully compatible with the engine
> that
> > > > > we plan to use, it might be better to deprecate it and use an
> overload
> > > > > which takes a string. We would rely on the server side validation.
> > > > > During the migration, I think that we could still try to toString
> the
> > > > > regex and use it. That should work, I think, in the majority of the
> > > > > cases.
> > > > >
> > > > > > 10. I agree with Justine that we should be clearer about the
> reason
> > > to
> > > > > > switch to IBP/metadata.version from the feature flag. Maybe we
> mean
> > > that
> > > > > we
> > > > > > can switch the default for the feature flag to true based on the
> > > > > > metadata.version once we want to make it the default.
> > > > >
> > > > > My plan was to use that feature flag mainly during the development
> > > > > phase. I should not have mentioned it, I think, because we could
> use
> > > > > an internal config for it.
> > > > >
> > > > > > 11. Some of the protocol APIs don't mention the required ACLs, it
> > > would
> > > > > be
> > > > > > good to add that for consistency.
> > > > >
> > > > > Noted.
> > > > >
> > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat requires "Read
> Group"
> > > > > even
> > > > > > though it seems to do more than reading.
> > > > >
> > > > > I agree. This is how the current protocol works though. We only
> > > > > require "Read Group" to join a group. We could consider changing
> this
> > > > > but I am not sure that it is worth it.
> > > > >
> > > > > > 13. How is topic recreation handled by the consumer with the new
> > > group
> > > > > > protocol? It would be good to have a section on this.
> > > > >
> > > > > Noted. From a protocol perspective, the new topic will have a new
> > > > > topic id so it will treat it like a topic with a different name.
> The
> > > > > only issue is that the fetch/commit offsets APIs do not support
> topic
> > > > > IDs so the consumer would reuse the offsets based on the same. I
> think
> > > > > that we should update those APIs as well in order to be consistent
> end
> > > > > to end. That would strengthen the semantics of the consumer.
> > > > >
> > > > > > 14. The KIP mentions we will write the new coordinator in Java.
> Even
> > > > > though
> > > > > > this is an implementation detail, do we plan to have a new gradle
> > > module
> > > > > > for it?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > 15. Do we have a scalability goal when it comes to how many
> members
> > > the
> > > > > new
> > > > > > group protocol can support?
> > > > >
> > > > > We don't have numbers at the moment. The protocol should support
> 1000s
> > > > > of members per group. We will measure this when we have a first
> > > > > implementation. Note that we might have other bottlenecks down the
> > > > > road (e.g. offset commits).
> > > > >
> > > > > > 16. Did we consider having SubscribedTopidIds instead of
> > > > > > SubscribedTopicNames in ConsumerGroupHeartbeatRequest? Is the
> idea
> > > that
> > > > > > since we have to resolve the regex on the server, we can do the
> same
> > > for
> > > > > > the topic name? The difference is that sending the regex is more
> > > > > efficient
> > > > > > whereas sending the topic names is less efficient. Furthermore,
> > > delete
> > > > > and
> > > > > > recreation is easier to handle if we have topic ids.
> > > > >
> > > > > The idea was to consolidate the metadata lookup on the server for
> both
> > > > > paths but I do agree with your point. As a second though, using
> topic
> > > > > ids may be better here for the delete and recreation case. Also, I
> > > > > suppose that we may allow users to subscribe with topic ids in the
> > > > > future because that is the only way to be really robust to topic
> > > > > re-creation.
> > > > >
> > > > > Best,
> > > > > David
> > > > >
> > > > > On Tue, Jul 12, 2022 at 1:38 PM David Jacot <dja...@confluent.io>
> > > wrote:
> > > > > >
> > > > > > Hi Justine,
> > > > > >
> > > > > > Thanks for your comments. Please find my answers below.
> > > > > >
> > > > > > - Yes, the new protocol relies on topic IDs with the exception
> of the
> > > > > > topic names based in the ConsumerGroupHeartbeatRequest. I am not
> sure
> > > > > > if using topic names is the right call here. I need to think
> about it
> > > > > > a little more. Obviously, the KIP does not change the
> fetch/commit
> > > > > > offsets RPCs to use topic IDs. This may be something that we
> should
> > > > > > include though as it would give better overall guarantee in the
> > > > > > producer.
> > > > > > - You're right. I think that I should not have mentioned this
> flag at
> > > > > > all. I will remove it. We can use an internal configuration while
> > > > > > developing the feature.
> > > > > > - Both cluster types will be supported. The change is
> orthogonal. The
> > > > > > only requirement is that the cluster uses topic IDs.
> > > > > >
> > > > > > Best,
> > > > > > David
> > > > > >
> > > > > > On Mon, Jul 11, 2022 at 9:53 PM Guozhang Wang <
> wangg...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > Hi Ismael,
> > > > > > >
> > > > > > > Thanks for the feedback. Here are some replies inlined below:
> > > > > > >
> > > > > > > On Sat, Jul 9, 2022 at 2:53 AM Ismael Juma <ism...@juma.me.uk>
> > > wrote:
> > > > > > >
> > > > > > > > Thanks for the KIP. This has the potential to be a great
> > > > > improvement. A few
> > > > > > > > initial questions/comments:
> > > > > > > >
> > > > > > > > 1. I think it's premature to talk about target versions for
> > > > > deprecation and
> > > > > > > > removal of the existing group protocol. Unlike KRaft, this
> > > affects a
> > > > > core
> > > > > > > > client protocol and hence deprecation/removal will be heavily
> > > > > dependent on
> > > > > > > > how quickly applications migrate to the new protocol.
> > > > > > > >
> > > > > > >
> > > > > > > Yeah I agree with you. I think we can remove the proposed
> timeline
> > > in
> > > > > the
> > > > > > > `Compatibility, Deprecation, and Migration Plan` and instead
> just
> > > state
> > > > > > > that we will decide in the future about when we would
> deprecate old
> > > > > > > protocol and behaviors.
> > > > > > >
> > > > > > >
> > > > > > > > 2. The KIP says we intend to release this in 4.x, but it
> wasn't
> > > made
> > > > > clear
> > > > > > > > why. If we added that as a way to estimate when we'd
> deprecate
> > > and
> > > > > remove
> > > > > > > > the group protocol, I also suggest removing this part.
> > > > > > > >
> > > > > > >
> > > > > > > I think that's not specifically related to the
> deprecation/removal
> > > > > timeline
> > > > > > > plan, but it's more for client upgrades. I.e. the broker-side
> > > > > > > implementation may be done first, and then the client side,
> and we
> > > > > would
> > > > > > > only mark it as "released" by the time clients implementations
> are
> > > > > done. At
> > > > > > > that time, to enable the feature the clients need to first
> swap-in
> > > the
> > > > > > > bytecode with a rolling bounce and then set the flag with a
> second
> > > > > rolling
> > > > > > > bounce, and hence we feel it's better to be released in a major
> > > > > version.
> > > > > > >
> > > > > > >
> > > > > > > > 3. We need to flesh out the details of the migration story.
> It
> > > > > sounds like
> > > > > > > > we're saying we will support online migrations. Is that
> correct?
> > > We
> > > > > should
> > > > > > > > explain this in detail. It could also be done as a separate
> KIP,
> > > if
> > > > > it's
> > > > > > > > easier.
> > > > > > > >
> > > > > > >
> > > > > > > Yes I think that's the part we can be more concrete about for
> sure
> > > (and
> > > > > > > this is related to your question 2) above). We will work on
> making
> > > it
> > > > > more
> > > > > > > explicit in parallel as we solicit more feedback.
> > > > > > >
> > > > > > >
> > > > > > > > 4. I am happy that we are pushing the pattern subscriptions
> to
> > > the
> > > > > server,
> > > > > > > > but it seems like there could be some tricky compatibility
> > > issues.
> > > > > Will we
> > > > > > > > have a mechanism for users to detect that they need to update
> > > their
> > > > > regex
> > > > > > > > before switching to the new protocol?
> > > > > > > >
> > > > > > >
> > > > > > > Yes I think we need some tooling for non-java client users to
> sort
> > > of
> > > > > > > "dry-run" the client before switching to the new protocol. I
> do not
> > > > > have a
> > > > > > > specific idea on top of my head though, maybe others like @Matt
> > > > > Howlett can
> > > > > > > chime-in here?
> > > > > > >
> > > > > > >
> > > > > > > > 5. Related to the last question, will the Java client allow
> the
> > > > > users to
> > > > > > > > stick with the current regex engine for compatibility
> reasons?
> > > For
> > > > > example,
> > > > > > > > it may be handy to keep using client based regex at first to
> keep
> > > > > > > > migrations simple and then migrate to server based regexes
> as a
> > > > > second
> > > > > > > > step.
> > > > > > > >
> > > > > > >
> > > > > > > Honestly I have not thought about that for java clients, and
> we can
> > > > > discuss
> > > > > > > that. What kind of compatibility issues do you have in mind?
> > > > > > >
> > > > > > >
> > > > > > > > 6. When we say that the group coordinator will be
> responsible for
> > > > > storing
> > > > > > > > the configurations and that the configurations will be
> deleted
> > > when
> > > > > the
> > > > > > > > group is deleted. Will a transition to DEAD trigger deletion
> of
> > > > > > > > configurations?
> > > > > > > >
> > > > > > >
> > > > > > > Yes, since the DEAD state is an ending state (we would only
> > > transit to
> > > > > that
> > > > > > > state when the group is EMPTY and also all of its metadata are
> > > gone),
> > > > > once
> > > > > > > it's transited to DEAD this group would never be revived.
> > > > > > >
> > > > > > >
> > > > > > > > 7. Will the choice to store the configs in the group
> coordinator
> > > > > make it
> > > > > > > > harder to list all cluster configs and their values?
> > > > > > > >
> > > > > > >
> > > > > > > That's a good question, and our thoughts are that the so-called
> > > "group
> > > > > > > configurations" are overrides of the cluster-level
> configurations
> > > > > > > customized per group so when an admin list cluster configs it's
> > > okay to
> > > > > > > list just the cluster-level "defaults", not showing any
> per-group
> > > > > > > customizations.
> > > > > > >
> > > > > > >
> > > > > > > > 8. How would someone configure a group before starting the
> > > > > consumers? Have
> > > > > > > > we considered allowing the explicit creation of groups?
> > > > > Alternatively, the
> > > > > > > > configs could be decoupled from the group lifecycle.
> > > > > > > >
> > > > > > >
> > > > > > > The configs can be created before the group itself as an
> > > independent
> > > > > entity
> > > > > > > --- of course, this requires the corresponding request to be
> > > routed to
> > > > > the
> > > > > > > right coordinator based on the group id --- the only thing that
> > > > > differs is,
> > > > > > > when the group itself is gone we also check if there are any
> > > > > configuration
> > > > > > > entities related to that group and delete as well.
> > > > > > >
> > > > > > > Admittedly this indeed introduces an asymmetry on the creation
> /
> > > > > deletion
> > > > > > > lifecycles of the config entities, and we would like to hear
> > > everyone's
> > > > > > > feelings whether we should aim for symmetry i.e. totally
> decouple
> > > group
> > > > > > > configs and hence not delete them at all when the group is
> gone,
> > > but
> > > > > always
> > > > > > > require explicit deletion operations by themselves.
> > > > > > >
> > > > > > >
> > > > > > > > 9. Will the Consumer.subscribe method for the Java client
> still
> > > take
> > > > > a
> > > > > > > > `java.util.regex.Pattern` of do we have to introduce an
> overload?
> > > > > > > >
> > > > > > >
> > > > > > > I think we do not need to introduce an overload, but I'm all
> ears
> > > if
> > > > > there
> > > > > > > may be some compatibility issues that we may overlook.
> > > > > > >
> > > > > > >
> > > > > > > > 10. I agree with Justine that we should be clearer about the
> > > reason
> > > > > to
> > > > > > > > switch to IBP/metadata.version from the feature flag. Maybe
> we
> > > mean
> > > > > that we
> > > > > > > > can switch the default for the feature flag to true based on
> the
> > > > > > > > metadata.version once we want to make it the default.
> > > > > > >
> > > > > > > 11. Some of the protocol APIs don't mention the required ACLs,
> it
> > > > > would be
> > > > > > > > good to add that for consistency.
> > > > > > > >
> > > > > > >
> > > > > > > Ack, we can certainly do that.
> > > > > > >
> > > > > > >
> > > > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat requires
> "Read
> > > > > Group" even
> > > > > > > > though it seems to do more than reading.
> > > > > > > >
> > > > > > >
> > > > > > > I had that thought myself as well, but in the end we could not
> > > find a
> > > > > > > better alternative: adding Write Group seems an overkill here
> > > since we
> > > > > do
> > > > > > > not have it elsewhere (we only have Read / Delete and Describe
> on
> > > > > groups so
> > > > > > > far). Would like to hear others thoughts.
> > > > > > >
> > > > > > >
> > > > > > > > 13. How is topic recreation handled by the consumer with the
> new
> > > > > group
> > > > > > > > protocol? It would be good to have a section on this.
> > > > > > > >
> > > > > > >
> > > > > > > You mean with regex subscription right? Yes we can add a
> section
> > > about
> > > > > > > that, but basically the idea is that consumer would be totally
> > > > > agnostic in
> > > > > > > the new protocol as it's handled all by the brokers.
> > > > > > >
> > > > > > >
> > > > > > > > 14. The KIP mentions we will write the new coordinator in
> Java.
> > > Even
> > > > > though
> > > > > > > > this is an implementation detail, do we plan to have a new
> gradle
> > > > > module
> > > > > > > > for it?
> > > > > > > >
> > > > > > >
> > > > > > > We have not thought about that. But I think the answer should
> be
> > > yes.
> > > > > > >
> > > > > > >
> > > > > > > > 15. Do we have a scalability goal when it comes to how many
> > > members
> > > > > the new
> > > > > > > > group protocol can support?
> > > > > > > >
> > > > > > >
> > > > > > > Within a group, I think we should shoot for 1000s of members.
> But
> > > that
> > > > > > > scalability goals also depend on the offset management (commit,
> > > fetch)
> > > > > > > capabilities of the coordinator which we did not cover in this
> > > KIP, so
> > > > > it's
> > > > > > > hard to give a number that applies universally.
> > > > > > >
> > > > > > >
> > > > > > > > 16. Did we consider having SubscribedTopidIds instead of
> > > > > > > > SubscribedTopicNames in ConsumerGroupHeartbeatRequest? Is the
> > > idea
> > > > > that
> > > > > > > > since we have to resolve the regex on the server, we can do
> the
> > > same
> > > > > for
> > > > > > > > the topic name? The difference is that sending the regex is
> more
> > > > > efficient
> > > > > > > > whereas sending the topic names is less efficient.
> Furthermore,
> > > > > delete and
> > > > > > > > recreation is easier to handle if we have topic ids.
> > > > > > > >
> > > > > > >
> > > > > > > The main reason to still let the clients send names is to keep
> the
> > > > > > > reasoning of names -> ids on the broker / admin client only.
> Note
> > > that
> > > > > > > although we added topic id in KIP-516, we never implemented the
> > > logic
> > > > > on
> > > > > > > consumer/producers leveraging the related newer versioned RPCs,
> > > > > instead we
> > > > > > > just set the topic id as empty UUID. We want to keep the
> > > > > consumer/producer
> > > > > > > to be thin and only delegate the reasoning on broker and
> > > potentially
> > > > > admin
> > > > > > > clients.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > > On Wed, Jul 6, 2022 at 10:45 AM David Jacot
> > > > > <dja...@confluent.io.invalid>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I would like to start a discussion thread on KIP-848: The
> Next
> > > > > > > > > Generation of the Consumer Rebalance Protocol. With this
> KIP,
> > > we
> > > > > aim
> > > > > > > > > to make the rebalance protocol (for consumers) more
> reliable,
> > > more
> > > > > > > > > scalable, easier to implement for clients, and easier to
> debug
> > > for
> > > > > > > > > operators.
> > > > > > > > >
> > > > > > > > > The KIP is here:
> https://cwiki.apache.org/confluence/x/HhD1D.
> > > > > > > > >
> > > > > > > > > Please take a look and let me know what you think.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > David
> > > > > > > > >
> > > > > > > > > PS: I will be away from July 18th to August 8th. That gives
> > > you a
> > > > > bit
> > > > > > > > > of time to read and digest this long KIP.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -- Guozhang
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > >
>

Reply via email to