Hi, David,

Thanks for the KIP. Overall, the main benefits of the KIP seem to be fewer
RPCs during rebalance and more efficient support of wildcard. A few
comments below.

30. ConsumerGroupHeartbeatRequest
30.1 ServerAssignor is a singleton. Do we plan to support rolling changing
of the partition assignor in the consumers?
30.2 For each field, could you explain whether it's required in every
request or the scenarios when it needs to be filled? For example, it's not
clear to me when TopicPartitions needs to be filled.

31. In the current consumer protocol, the rack affinity between the client
and the broker is only considered during fetching, but not during assigning
partitions to consumers. Sometimes, once the assignment is made, there is
no opportunity for read affinity because no replicas of assigned partitions
are close to the member. I am wondering if we should use this opportunity
to address this by including rack in GroupMember.

32. On the metric side, often, it's useful to know how busy a group
coordinator is. By moving the event loop model, it seems that we could add
a metric that tracks the fraction of the time the event loop is doing the
actual work.

33. Could we add a section on coordinator failover handling? For example,
does it need to trigger the check if any group with the wildcard
subscription now has a new matching topic?

34. ConsumerGroupMetadataValue, ConsumerGroupPartitionMetadataValue,
ConsumerGroupMemberMetadataValue: Could we document what the epoch field
reflects? For example, does the epoch in ConsumerGroupMetadataValue reflect
the latest group epoch? What about the one in
ConsumerGroupPartitionMetadataValue and ConsumerGroupMemberMetadataValue?

35. "the group coordinator will ensure that the following invariants are
met: ... All members exists." It's possible for a member not to get any
assigned partitions, right?

36. "He can rejoins the group with a member epoch equals to 0": When would
a consumer rejoin and what member id would be used?

37. "Instead, power users will have the ability to trigger a reassignment
by either providing a non-zero reason or by updating the assignor
metadata." Hmm, this seems to be conflicting with the deprecation of
Consumer#enforeRebalance.

38. The reassignment examples are nice. But the section seems to have
multiple typos.
38.1 When the group transitions to epoch 2, B immediately gets into
"epoch=1, partitions=[foo-2]", which seems incorrect.
38.2 When the group transitions to epoch 3, C seems to get into epoch=3,
partitions=[foo-1] too early.
38.3 After A transitions to epoch 3, C still has A - epoch=2,
partitions=[foo-0].

39. Rolling upgrade of consumers: Do we support the upgrade from any old
version to new one?

Thanks,

Jun

On Mon, Aug 29, 2022 at 9:20 AM David Jacot <dja...@confluent.io.invalid>
wrote:

> Hi all,
>
> The KIP states that we will re-implement the coordinator in Java. I
> discussed this offline with a few folks and folks are concerned that
> we could introduce many regressions in the old protocol if we do so.
> Therefore, I am going to remove this statement from the KIP. It is an
> implementation detail after all so it does not have to be decided at
> this stage. We will likely start by trying to refactor the current
> implementation as a first step.
>
> Cheers,
> David
>
> On Mon, Aug 29, 2022 at 3:52 PM David Jacot <dja...@confluent.io> wrote:
> >
> > Hi Luke,
> >
> > > 1.1. I think the state machine are: "Empty, assigning, reconciling,
> stable,
> > > dead" mentioned in Consumer Group States section, right?
> >
> > This sentence does not refer to those group states but rather to a
> > state machine replication (SMR). This refers to the entire state of
> > group coordinator which is replicated via the log layer. I will
> > clarify this in the KIP.
> >
> > > 1.2. What do you mean "each state machine is modelled as an event
> loop"?
> >
> > The idea is to follow a model similar to the new quorum controller. We
> > will have N threads to process events. Each __consumer_offsets
> > partition is assigned to a unique thread and all the events (e.g.
> > requests, callbacks, etc.) are processed by this thread. This simplify
> > concurrency and will enable us to do simulation testing for the group
> > coordinator.
> >
> > > 1.3. Why do we need a state machine per *__consumer_offsets*
> partitions?
> > > Not a state machine "per consumer group" owned by a group coordinator?
> For
> > > example, if one group coordinator owns 2 consumer groups, and both
> exist in
> > > *__consumer_offsets-0*, will we have 1 state machine for it, or 2?
> >
> > See 1.1. The confusion comes from there, I think.
> >
> > > 1.4. I know the "*group.coordinator.threads" *is the number of threads
> used
> > > to run the state machines. But I'm wondering if the purpose of the
> threads
> > > is only to keep the state of each consumer group (or
> *__consumer_offsets*
> > > partitions?), and no heavy computation, why should we need
> multi-threads
> > > here?
> >
> > See 1.2. The idea is to have an ability to shard the processing as the
> > computation could be heavy.
> >
> > > 2.1. The consumer session timeout, why does the default session
> timeout not
> > > locate between min (45s) and max(60s)? I thought the min/max session
> > > timeout is to define lower/upper bound of it, no?
> > >
> > > group.consumer.session.timeout.ms int 30s The timeout to detect client
> > > failures when using the consumer group protocol.
> > > group.consumer.min.session.timeout.ms int 45s The minimum session
> timeout.
> > > group.consumer.max.session.timeout.ms int 60s The maximum session
> timeout.
> >
> > This is indeed a mistake. The default session timeout should be 45s
> > (the current default).
> >
> > > 2.2. The default server side assignor are [range, uniform], which means
> > > we'll default to "range" assignor. I'd like to know why not uniform
> one? I
> > > thought usually users will choose uniform assignor (former sticky
> assinor)
> > > for better evenly distribution. Any other reason we choose range
> assignor
> > > as default?
> > > group.consumer.assignors List range, uniform The server side assignors.
> >
> > The order on the server side has no influence because the client must
> > chose the selector that he wants to use. There is no default in the
> > current proposal. If the assignor is not specified by the client, the
> > request is rejected. The default client value for
> > `group.remote.assignor` is `uniform` though.
> >
> > Thanks for your very good comments, Luke. I hope that my answers help
> > to clarify things. I will update the KIP as well based on your
> > feedback.
> >
> > Cheers,
> > David
> >
> > On Mon, Aug 22, 2022 at 9:29 AM Luke Chen <show...@gmail.com> wrote:
> > >
> > > Hi David,
> > >
> > > Thanks for the update.
> > >
> > > Some more questions:
> > > 1. In Group Coordinator section, you mentioned:
> > > > The new group coordinator will have a state machine per
> > > *__consumer_offsets* partitions, where each state machine is modelled
> as an
> > > event loop. Those state machines will be executed in
> > > *group.coordinator.threads* threads.
> > >
> > > 1.1. I think the state machine are: "Empty, assigning, reconciling,
> stable,
> > > dead" mentioned in Consumer Group States section, right?
> > > 1.2. What do you mean "each state machine is modelled as an event
> loop"?
> > > 1.3. Why do we need a state machine per *__consumer_offsets*
> partitions?
> > > Not a state machine "per consumer group" owned by a group coordinator?
> For
> > > example, if one group coordinator owns 2 consumer groups, and both
> exist in
> > > *__consumer_offsets-0*, will we have 1 state machine for it, or 2?
> > > 1.4. I know the "*group.coordinator.threads" *is the number of threads
> used
> > > to run the state machines. But I'm wondering if the purpose of the
> threads
> > > is only to keep the state of each consumer group (or
> *__consumer_offsets*
> > > partitions?), and no heavy computation, why should we need
> multi-threads
> > > here?
> > >
> > > 2. For the default value in the new configs:
> > > 2.1. The consumer session timeout, why does the default session
> timeout not
> > > locate between min (45s) and max(60s)? I thought the min/max session
> > > timeout is to define lower/upper bound of it, no?
> > >
> > > group.consumer.session.timeout.ms int 30s The timeout to detect client
> > > failures when using the consumer group protocol.
> > > group.consumer.min.session.timeout.ms int 45s The minimum session
> timeout.
> > > group.consumer.max.session.timeout.ms int 60s The maximum session
> timeout.
> > >
> > >
> > >
> > > 2.2. The default server side assignor are [range, uniform], which means
> > > we'll default to "range" assignor. I'd like to know why not uniform
> one? I
> > > thought usually users will choose uniform assignor (former sticky
> assinor)
> > > for better evenly distribution. Any other reason we choose range
> assignor
> > > as default?
> > > group.consumer.assignors List range, uniform The server side assignors.
> > >
> > >
> > >
> > >
> > >
> > >
> > > Thank you.
> > > Luke
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Aug 22, 2022 at 2:10 PM Luke Chen <show...@gmail.com> wrote:
> > >
> > > > Hi Sagar,
> > > >
> > > > I have some thoughts about Kafka Connect integrating with KIP-848,
> but I
> > > > think we should have a separate discussion thread for the Kafka
> Connect
> > > > KIP: Integrating Kafka Connect With New Consumer Rebalance Protocol
> [1],
> > > > and let this discussion thread focus on consumer rebalance protocol,
> WDYT?
> > > >
> > > > [1]
> > > >
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > >
> > > >
> > > > On Fri, Aug 12, 2022 at 9:31 PM Sagar <sagarmeansoc...@gmail.com>
> wrote:
> > > >
> > > >> Thank you Guozhang/David for the feedback. Looks like there's
> agreement on
> > > >> using separate APIs for Connect. I would revisit the doc and see
> what
> > > >> changes are to be made.
> > > >>
> > > >> Thanks!
> > > >> Sagar.
> > > >>
> > > >> On Tue, Aug 9, 2022 at 7:11 PM David Jacot
> <dja...@confluent.io.invalid>
> > > >> wrote:
> > > >>
> > > >> > Hi Sagar,
> > > >> >
> > > >> > Thanks for the feedback and the document. That's really helpful. I
> > > >> > will take a look at it.
> > > >> >
> > > >> > Overall, it seems to me that both Connect and the Consumer could
> share
> > > >> > the same underlying "engine". The main difference is that the
> Consumer
> > > >> > assigns topic-partitions to members whereas Connect assigns tasks
> to
> > > >> > workers. I see two ways to move forward:
> > > >> > 1) We extend the new proposed APIs to support different resource
> types
> > > >> > (e.g. partitions, tasks, etc.); or
> > > >> > 2) We use new dedicated APIs for Connect. The dedicated APIs
> would be
> > > >> > similar to the new ones but different on the content/resources and
> > > >> > they would rely on the same engine on the coordinator side.
> > > >> >
> > > >> > I personally lean towards 2) because I am not a fan of
> overcharging
> > > >> > APIs to serve different purposes. That being said, I am not
> opposed to
> > > >> > 1) if we can find an elegant way to do it.
> > > >> >
> > > >> > I think that we can continue to discuss it here for now in order
> to
> > > >> > ensure that this KIP is compatible with what we will do for
> Connect in
> > > >> > the future.
> > > >> >
> > > >> > Best,
> > > >> > David
> > > >> >
> > > >> > On Mon, Aug 8, 2022 at 2:41 PM David Jacot <dja...@confluent.io>
> wrote:
> > > >> > >
> > > >> > > Hi all,
> > > >> > >
> > > >> > > I am back from vacation. I will go through and address your
> comments
> > > >> > > in the coming days. Thanks for your feedback.
> > > >> > >
> > > >> > > Cheers,
> > > >> > > David
> > > >> > >
> > > >> > > On Wed, Aug 3, 2022 at 10:05 PM Gregory Harris <
> gharris1...@gmail.com
> > > >> >
> > > >> > wrote:
> > > >> > > >
> > > >> > > > Hey All!
> > > >> > > >
> > > >> > > > Thanks for the KIP, it's wonderful to see cooperative
> rebalancing
> > > >> > making it
> > > >> > > > down the stack!
> > > >> > > >
> > > >> > > > I had a few questions:
> > > >> > > >
> > > >> > > > 1. The 'Rejected Alternatives' section describes how member
> epoch
> > > >> > should
> > > >> > > > advance in step with the group epoch and assignment epoch
> values. I
> > > >> > think
> > > >> > > > that this is a good idea for the reasons described in the
> KIP. When
> > > >> the
> > > >> > > > protocol is incrementally assigning partitions to a worker,
> what
> > > >> member
> > > >> > > > epoch does each incremental assignment use? Are member epochs
> > > >> re-used,
> > > >> > and
> > > >> > > > a single member epoch can correspond to multiple different
> > > >> > (monotonically
> > > >> > > > larger) assignments?
> > > >> > > >
> > > >> > > > 2. Is the Assignor's 'Reason' field opaque to the group
> > > >> coordinator? If
> > > >> > > > not, should custom client-side assignor implementations
> interact
> > > >> with
> > > >> > the
> > > >> > > > Reason field, and how is its common meaning agreed upon? If
> so, what
> > > >> > is the
> > > >> > > > benefit of a distinct Reason field over including such
> functionality
> > > >> > in the
> > > >> > > > opaque metadata?
> > > >> > > >
> > > >> > > > 3. The following is included in the KIP: "Thanks to this, the
> input
> > > >> of
> > > >> > the
> > > >> > > > client side assignor is entirely driven by the group
> coordinator.
> > > >> The
> > > >> > > > consumer is no longer responsible for maintaining any state
> besides
> > > >> its
> > > >> > > > assigned partitions." Does this mean that the client-side
> assignor
> > > >> MAY
> > > >> > > > incorporate additional non-Metadata state (such as partition
> > > >> > throughput,
> > > >> > > > cpu/memory metrics, config topics, etc), or that additional
> > > >> > non-Metadata
> > > >> > > > state SHOULD NOT be used?
> > > >> > > >
> > > >> > > > 4. I see that there are separate classes
> > > >> > > > for org.apache.kafka.server.group.consumer.PartitionAssignor
> > > >> > > > and org.apache.kafka.clients.consumer.PartitionAssignor that
> seem to
> > > >> > > > overlap significantly. Is it possible for these two
> implementations
> > > >> to
> > > >> > be
> > > >> > > > unified? This would serve to promote feature parity of
> server-side
> > > >> and
> > > >> > > > client-side assignors, and would also facilitate operational
> > > >> > flexibility in
> > > >> > > > certain situations. For example, if a server-side assignor
> has some
> > > >> > poor
> > > >> > > > behavior and needs a patch, deploying the patched assignor to
> the
> > > >> > client
> > > >> > > > and switching one consumer group to a client-side assignor
> may be
> > > >> > faster
> > > >> > > > and less risky than patching all of the brokers. With the
> currently
> > > >> > > > proposed distinct APIs, a non-trivial reimplementation would
> have
> > > >> to be
> > > >> > > > assembled, and if the two APIs have diverged significantly,
> then it
> > > >> is
> > > >> > > > possible that a reimplementation would not be possible.
> > > >> > > >
> > > >> > > > --
> > > >> > > > Greg Harris
> > > >> > > > gharris1...@gmail.com
> > > >> > > > github.com/gharris1727
> > > >> > > >
> > > >> > > > On Wed, Aug 3, 2022 at 8:39 AM Sagar <
> sagarmeansoc...@gmail.com>
> > > >> > wrote:
> > > >> > > >
> > > >> > > > > Hi Guozhang/David,
> > > >> > > > >
> > > >> > > > > I created a confluence page to discuss how Connect would
> need to
> > > >> > change
> > > >> > > > > based on the new rebalance protocol. Here's the page:
> > > >> > > > >
> > > >> > > > >
> > > >> >
> > > >>
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> > > >> > > > >
> > > >> > > > > It's also pretty longish and I have tried to keep a format
> > > >> similar to
> > > >> > > > > KIP-848. Let me know what you think. Also, do you think this
> > > >> should
> > > >> > be
> > > >> > > > > moved to a separate discussion thread or is this one fine?
> > > >> > > > >
> > > >> > > > > Thanks!
> > > >> > > > > Sagar.
> > > >> > > > >
> > > >> > > > > On Tue, Jul 26, 2022 at 7:37 AM Sagar <
> sagarmeansoc...@gmail.com>
> > > >> > wrote:
> > > >> > > > >
> > > >> > > > > > Hello Guozhang,
> > > >> > > > > >
> > > >> > > > > > Thank you so much for the doc on Kafka Streams. Sure, I
> would do
> > > >> > the
> > > >> > > > > > analysis and come up with such a document.
> > > >> > > > > >
> > > >> > > > > > Thanks!
> > > >> > > > > > Sagar.
> > > >> > > > > >
> > > >> > > > > > On Tue, Jul 26, 2022 at 4:47 AM Guozhang Wang <
> > > >> wangg...@gmail.com>
> > > >> > > > > wrote:
> > > >> > > > > >
> > > >> > > > > >> Hello Sagar,
> > > >> > > > > >>
> > > >> > > > > >> It would be great if you could come back with some
> analysis on
> > > >> > how to
> > > >> > > > > >> implement the Connect side integration with the new
> protocol;
> > > >> so
> > > >> > far
> > > >> > > > > >> besides leveraging on the new "protocol type" we did not
> yet
> > > >> think
> > > >> > > > > through
> > > >> > > > > >> the Connect side implementations. For Streams here's a
> draft of
> > > >> > > > > >> integration
> > > >> > > > > >> plan:
> > > >> > > > > >>
> > > >> > > > > >>
> > > >> > > > >
> > > >> >
> > > >>
> https://docs.google.com/document/d/17PNz2sGoIvGyIzz8vLyJTJTU2rqnD_D9uHJnH9XARjU/edit#heading=h.pdgirmi57dvn
> > > >> > > > > >> just FYI for your analysis on Connect.
> > > >> > > > > >>
> > > >> > > > > >> On Tue, Jul 19, 2022 at 10:48 PM Sagar <
> > > >> sagarmeansoc...@gmail.com
> > > >> > >
> > > >> > > > > wrote:
> > > >> > > > > >>
> > > >> > > > > >> > Hi David,
> > > >> > > > > >> >
> > > >> > > > > >> > Thank you for your response. The reason I thought
> connect can
> > > >> > also fit
> > > >> > > > > >> into
> > > >> > > > > >> > this new scheme is that even today the connect uses a
> > > >> > > > > WorkerCoordinator
> > > >> > > > > >> > extending from AbstractCoordinator to empower
> rebalances of
> > > >> > > > > >> > tasks/connectors. The WorkerCoordinator sets the
> > > >> protocolType()
> > > >> > to
> > > >> > > > > >> connect
> > > >> > > > > >> > and uses the metadata() method by plumbing into
> > > >> > > > > >> JoinGroupRequestProtocol.
> > > >> > > > > >> >
> > > >> > > > > >> > I think the changes to support connect would be
> similar at a
> > > >> > high
> > > >> > > > > level
> > > >> > > > > >> to
> > > >> > > > > >> > the changes in streams mainly because of the Client
> side
> > > >> > assignors
> > > >> > > > > being
> > > >> > > > > >> > used in both. At an implementation level, we might
> need to
> > > >> make
> > > >> > a lot
> > > >> > > > > of
> > > >> > > > > >> > changes to get onto this new assignment protocol like
> > > >> enhancing
> > > >> > the
> > > >> > > > > >> > JoinGroup request/response and SyncGroup and using
> > > >> > > > > >> ConsumerGroupHeartbeat
> > > >> > > > > >> > API etc again on similar lines to streams (or there
> might be
> > > >> > > > > >> deviations). I
> > > >> > > > > >> > would try to perform a detailed analysis of the same
> and we
> > > >> > can have
> > > >> > > > > a
> > > >> > > > > >> > separate discussion thread for that as that would
> derail this
> > > >> > > > > discussion
> > > >> > > > > >> > thread. Let me know if that sounds good to you.
> > > >> > > > > >> >
> > > >> > > > > >> > Thanks!
> > > >> > > > > >> > Sagar.
> > > >> > > > > >> >
> > > >> > > > > >> >
> > > >> > > > > >> >
> > > >> > > > > >> > On Fri, Jul 15, 2022 at 5:47 PM David Jacot
> > > >> > > > > <dja...@confluent.io.invalid
> > > >> > > > > >> >
> > > >> > > > > >> > wrote:
> > > >> > > > > >> >
> > > >> > > > > >> > > Hi Sagar,
> > > >> > > > > >> > >
> > > >> > > > > >> > > Thanks for your comments.
> > > >> > > > > >> > >
> > > >> > > > > >> > > 1) Yes. That refers to `Assignment#error`. Sure, I
> can
> > > >> > mention it.
> > > >> > > > > >> > >
> > > >> > > > > >> > > 2) The idea is to transition C from his current
> assignment
> > > >> to
> > > >> > his
> > > >> > > > > >> > > target assignment when he can move to epoch 3. When
> that
> > > >> > happens,
> > > >> > > > > the
> > > >> > > > > >> > > member assignment is updated and persisted with all
> its
> > > >> > assigned
> > > >> > > > > >> > > partitions even if they are not all revoked yet. In
> other
> > > >> > words, the
> > > >> > > > > >> > > member assignment becomes the target assignment.
> This is
> > > >> > basically
> > > >> > > > > an
> > > >> > > > > >> > > optimization to avoid having to write all the
> changes to
> > > >> the
> > > >> > log.
> > > >> > > > > The
> > > >> > > > > >> > > examples are based on the persisted state so I
> understand
> > > >> the
> > > >> > > > > >> > > confusion. Let me see if I can improve this in the
> > > >> > description.
> > > >> > > > > >> > >
> > > >> > > > > >> > > 3) Regarding Connect, it could reuse the protocol
> with a
> > > >> > client side
> > > >> > > > > >> > > assignor if it fits in the protocol. The assignment
> is
> > > >> about
> > > >> > > > > >> > > topicid-partitions + metadata, could Connect fit
> into this?
> > > >> > > > > >> > >
> > > >> > > > > >> > > Best,
> > > >> > > > > >> > > David
> > > >> > > > > >> > >
> > > >> > > > > >> > > On Fri, Jul 15, 2022 at 1:55 PM Sagar <
> > > >> > sagarmeansoc...@gmail.com>
> > > >> > > > > >> wrote:
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Hi David,
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Thanks for the KIP. I just had minor observations:
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > 1) In the Assignment Error section in Client Side
> mode
> > > >> > Assignment
> > > >> > > > > >> > > process,
> > > >> > > > > >> > > > you mentioned => `In this case, the client side
> assignor
> > > >> can
> > > >> > > > > return
> > > >> > > > > >> an
> > > >> > > > > >> > > > error to the group coordinator`. In this case are
> you
> > > >> > referring to
> > > >> > > > > >> the
> > > >> > > > > >> > > > Assignor returning an AssignmentError that's
> listed down
> > > >> > towards
> > > >> > > > > the
> > > >> > > > > >> > end?
> > > >> > > > > >> > > > If yes, do you think it would make sense to
> mention this
> > > >> > > > > explicitly
> > > >> > > > > >> > here?
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > 2) In the Case Studies section, I have a slight
> > > >> confusion,
> > > >> > not
> > > >> > > > > sure
> > > >> > > > > >> if
> > > >> > > > > >> > > > others have the same. Consider this step:
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > When B heartbeats, the group coordinator
> transitions him
> > > >> to
> > > >> > epoch
> > > >> > > > > 3
> > > >> > > > > >> > > because
> > > >> > > > > >> > > > B has no partitions to revoke. It persists the
> change and
> > > >> > reply.
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >    - Group (epoch=3)
> > > >> > > > > >> > > >       - A
> > > >> > > > > >> > > >       - B
> > > >> > > > > >> > > >       - C
> > > >> > > > > >> > > >    - Target Assignment (epoch=3)
> > > >> > > > > >> > > >       - A - partitions=[foo-0]
> > > >> > > > > >> > > >       - B - partitions=[foo-2]
> > > >> > > > > >> > > >       - C - partitions=[foo-1]
> > > >> > > > > >> > > >    - Member Assignment
> > > >> > > > > >> > > >       - A - epoch=2, partitions=[foo-0, foo-1]
> > > >> > > > > >> > > >       - B - epoch=3, partitions=[foo-2]
> > > >> > > > > >> > > >       - C - epoch=3, partitions=[foo-1]
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > When C heartbeats, it transitions to epoch 3 but
> cannot
> > > >> get
> > > >> > foo-1
> > > >> > > > > >> yet.
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Here,it's mentioned that member C can't get the
> foo-1
> > > >> > partition
> > > >> > > > > yet,
> > > >> > > > > >> > but
> > > >> > > > > >> > > > based on the description above, it seems it
> already has
> > > >> it.
> > > >> > Do you
> > > >> > > > > >> > think
> > > >> > > > > >> > > it
> > > >> > > > > >> > > > would be better to remove it and populate it only
> when it
> > > >> > actually
> > > >> > > > > >> gets
> > > >> > > > > >> > > it?
> > > >> > > > > >> > > > I see this in a lot of other places, so have I
> > > >> understood it
> > > >> > > > > >> > incorrectly
> > > >> > > > > >> > > ?
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Regarding connect , it might be out of scope of
> this
> > > >> > discussion,
> > > >> > > > > but
> > > >> > > > > >> > from
> > > >> > > > > >> > > > what I understood it would probably be running in
> client
> > > >> > side
> > > >> > > > > >> assignor
> > > >> > > > > >> > > mode
> > > >> > > > > >> > > > even on the new rebalance protocol as it has its
> own
> > > >> Custom
> > > >> > > > > >> > > Assignors(Eager
> > > >> > > > > >> > > > and IncrementalCooperative).
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Thanks!
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > Sagar.
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > On Fri, Jul 15, 2022 at 5:00 PM David Jacot
> > > >> > > > > >> > <dja...@confluent.io.invalid
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > wrote:
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > > Thanks Hector! Our goal is to move forward with
> > > >> > specialized API
> > > >> > > > > >> > > > > instead of relying on one generic API. For
> Connect, we
> > > >> > can apply
> > > >> > > > > >> the
> > > >> > > > > >> > > > > exact same pattern and reuse/share the core
> > > >> > implementation on
> > > >> > > > > the
> > > >> > > > > >> > > > > server side. For the schema registry, I think
> that we
> > > >> > should
> > > >> > > > > >> consider
> > > >> > > > > >> > > > > having a tailored API to do simple
> membership/leader
> > > >> > election.
> > > >> > > > > >> > > > >
> > > >> > > > > >> > > > > Best,
> > > >> > > > > >> > > > > David
> > > >> > > > > >> > > > >
> > > >> > > > > >> > > > > On Fri, Jul 15, 2022 at 10:22 AM Ismael Juma <
> > > >> > ism...@juma.me.uk
> > > >> > > > > >
> > > >> > > > > >> > > wrote:
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > Three quick comments:
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > 1. Regarding java.util.regex.Pattern vs
> > > >> > > > > >> com.google.re2j.Pattern, we
> > > >> > > > > >> > > > > should
> > > >> > > > > >> > > > > > document the differences in more detail before
> > > >> deciding
> > > >> > one
> > > >> > > > > way
> > > >> > > > > >> or
> > > >> > > > > >> > > > > another.
> > > >> > > > > >> > > > > > That said, if people pass
> java.util.regex.Pattern,
> > > >> they
> > > >> > expect
> > > >> > > > > >> > their
> > > >> > > > > >> > > > > > semantics to be honored. If we are doing
> something
> > > >> > different,
> > > >> > > > > >> then
> > > >> > > > > >> > we
> > > >> > > > > >> > > > > > should consider adding an overload with our own
> > > >> Pattern
> > > >> > class
> > > >> > > > > (I
> > > >> > > > > >> > > don't
> > > >> > > > > >> > > > > > think we'd want to expose re2j's at this
> point).
> > > >> > > > > >> > > > > > 2. Regarding topic ids, any major new protocol
> should
> > > >> > > > > integrate
> > > >> > > > > >> > fully
> > > >> > > > > >> > > > > with
> > > >> > > > > >> > > > > > it and should handle the topic recreation case
> > > >> > correctly.
> > > >> > > > > That's
> > > >> > > > > >> > the
> > > >> > > > > >> > > main
> > > >> > > > > >> > > > > > part we need to handle. I agree with David
> that we'd
> > > >> > want to
> > > >> > > > > add
> > > >> > > > > >> > > topic
> > > >> > > > > >> > > > > ids
> > > >> > > > > >> > > > > > to the relevant protocols that don't have it
> yet and
> > > >> > that we
> > > >> > > > > can
> > > >> > > > > >> > > probably
> > > >> > > > > >> > > > > > focus on the internals versus adding new APIs
> to the
> > > >> > Java
> > > >> > > > > >> Consumer
> > > >> > > > > >> > > > > (unless
> > > >> > > > > >> > > > > > we find that adding new APIs is required for
> > > >> reasonable
> > > >> > > > > >> semantics).
> > > >> > > > > >> > > > > > 3. I am still not sure about the coordinator
> storing
> > > >> the
> > > >> > > > > >> configs.
> > > >> > > > > >> > > It's
> > > >> > > > > >> > > > > > powerful for configs to be centralized in the
> > > >> metadata
> > > >> > log for
> > > >> > > > > >> > > various
> > > >> > > > > >> > > > > > reasons (auditability, visibility, consistency,
> > > >> etc.).
> > > >> > > > > >> Similarly, I
> > > >> > > > > >> > > am
> > > >> > > > > >> > > > > not
> > > >> > > > > >> > > > > > sure about automatically deleting configs in a
> way
> > > >> that
> > > >> > they
> > > >> > > > > >> cannot
> > > >> > > > > >> > > be
> > > >> > > > > >> > > > > > recovered. A good property for modern systems
> is to
> > > >> > minimize
> > > >> > > > > the
> > > >> > > > > >> > > number
> > > >> > > > > >> > > > > of
> > > >> > > > > >> > > > > > unrecoverable data loss scenarios.
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > Ismael
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > On Wed, Jul 13, 2022 at 3:47 PM David Jacot
> > > >> > > > > >> > > <dja...@confluent.io.invalid
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > wrote:
> > > >> > > > > >> > > > > >
> > > >> > > > > >> > > > > > > Thanks Guozhang. My answers are below:
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > > 1) the migration path, especially the last
> step
> > > >> when
> > > >> > > > > clients
> > > >> > > > > >> > > flip the
> > > >> > > > > >> > > > > > > flag
> > > >> > > > > >> > > > > > > > to enable the new protocol, in which we
> would
> > > >> have a
> > > >> > > > > window
> > > >> > > > > >> > where
> > > >> > > > > >> > > > > both
> > > >> > > > > >> > > > > > > new
> > > >> > > > > >> > > > > > > > protocols / rpcs and old protocols / rpcs
> are
> > > >> used
> > > >> > by
> > > >> > > > > >> members
> > > >> > > > > >> > of
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > same
> > > >> > > > > >> > > > > > > > group. How the coordinator could "mimic"
> the old
> > > >> > behavior
> > > >> > > > > >> while
> > > >> > > > > >> > > > > using the
> > > >> > > > > >> > > > > > > > new protocol is something we need to
> present
> > > >> about.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > Noted. I just published a new version of KIP
> which
> > > >> > includes
> > > >> > > > > >> more
> > > >> > > > > >> > > > > > > details about this. See the "Supporting
> Online
> > > >> > Consumer
> > > >> > > > > Group
> > > >> > > > > >> > > Upgrade"
> > > >> > > > > >> > > > > > > and the "Compatibility, Deprecation, and
> Migration
> > > >> > Plan". I
> > > >> > > > > >> think
> > > >> > > > > >> > > that
> > > >> > > > > >> > > > > > > I have to think through a few cases now but
> the
> > > >> > overall idea
> > > >> > > > > >> and
> > > >> > > > > >> > > > > > > mechanism should be understandable.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > > 2) the usage of topic ids. So far as
> KIP-516 the
> > > >> > topic ids
> > > >> > > > > >> are
> > > >> > > > > >> > > only
> > > >> > > > > >> > > > > used
> > > >> > > > > >> > > > > > > as
> > > >> > > > > >> > > > > > > > part of RPCs and admin client, but they
> are not
> > > >> > exposed
> > > >> > > > > via
> > > >> > > > > >> any
> > > >> > > > > >> > > > > public
> > > >> > > > > >> > > > > > > APIs
> > > >> > > > > >> > > > > > > > to consumers yet. I think the question is,
> first
> > > >> > should we
> > > >> > > > > >> let
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > consumer
> > > >> > > > > >> > > > > > > > client to be maintaining the names -> ids
> mapping
> > > >> > itself
> > > >> > > > > to
> > > >> > > > > >> > fully
> > > >> > > > > >> > > > > > > leverage
> > > >> > > > > >> > > > > > > > on all the augmented existing RPCs and the
> new
> > > >> RPCs
> > > >> > with
> > > >> > > > > the
> > > >> > > > > >> > > topic
> > > >> > > > > >> > > > > ids;
> > > >> > > > > >> > > > > > > and
> > > >> > > > > >> > > > > > > > secondly, should we ever consider exposing
> the
> > > >> > topic ids
> > > >> > > > > in
> > > >> > > > > >> the
> > > >> > > > > >> > > > > consumer
> > > >> > > > > >> > > > > > > > public APIs as well (both
> subscribe/assign, as
> > > >> well
> > > >> > as in
> > > >> > > > > >> the
> > > >> > > > > >> > > > > rebalance
> > > >> > > > > >> > > > > > > > listener for cases like topic
> > > >> > deletion-and-recreation).
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > a) Assuming that we would include converting
> all
> > > >> the
> > > >> > offsets
> > > >> > > > > >> > > related
> > > >> > > > > >> > > > > > > RPCs to using topic ids in this KIP, the
> consumer
> > > >> > would be
> > > >> > > > > >> able
> > > >> > > > > >> > to
> > > >> > > > > >> > > > > > > fully operate with topic ids. That being
> said, it
> > > >> > still has
> > > >> > > > > to
> > > >> > > > > >> > > provide
> > > >> > > > > >> > > > > > > the topics names in various APIs so having a
> > > >> mapping
> > > >> > in the
> > > >> > > > > >> > > consumer
> > > >> > > > > >> > > > > > > seems inevitable to me.
> > > >> > > > > >> > > > > > > b) I don't have a strong opinion on this.
> Here I
> > > >> > wonder if
> > > >> > > > > >> this
> > > >> > > > > >> > > goes
> > > >> > > > > >> > > > > > > beyond the scope of this KIP. I would rather
> focus
> > > >> on
> > > >> > the
> > > >> > > > > >> > internals
> > > >> > > > > >> > > > > > > here and we can consider this separately if
> we see
> > > >> > value in
> > > >> > > > > >> doing
> > > >> > > > > >> > > it.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > Coming back to Ismael's point about using
> topic ids
> > > >> > in the
> > > >> > > > > >> > > > > > > ConsumerGroupHeartbeatRequest, I think that
> there
> > > >> is
> > > >> > one
> > > >> > > > > >> > advantage
> > > >> > > > > >> > > in
> > > >> > > > > >> > > > > > > favour of it. The consumer will have the
> > > >> opportunity
> > > >> > to
> > > >> > > > > >> validate
> > > >> > > > > >> > > that
> > > >> > > > > >> > > > > > > the topics exists before passing them into
> the
> > > >> group
> > > >> > > > > rebalance
> > > >> > > > > >> > > > > > > protocol. Obviously, the coordinator will
> also
> > > >> notice
> > > >> > it but
> > > >> > > > > >> it
> > > >> > > > > >> > > does
> > > >> > > > > >> > > > > > > not really have a way to reject an invalid
> topic in
> > > >> > the
> > > >> > > > > >> response.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > > I'm agreeing with David on all other minor
> > > >> questions
> > > >> > > > > except
> > > >> > > > > >> for
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > `subscribe(Pattern)` question: personally
> I think
> > > >> > it's not
> > > >> > > > > >> > > necessary
> > > >> > > > > >> > > > > to
> > > >> > > > > >> > > > > > > > deprecate the subscribe API with Pattern,
> but
> > > >> > instead we
> > > >> > > > > >> still
> > > >> > > > > >> > > use
> > > >> > > > > >> > > > > > > Pattern
> > > >> > > > > >> > > > > > > > while just documenting that our
> subscription may
> > > >> be
> > > >> > > > > >> rejected by
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > server.
> > > >> > > > > >> > > > > > > > Since the incompatible case is a very rare
> > > >> scenario
> > > >> > I felt
> > > >> > > > > >> > using
> > > >> > > > > >> > > an
> > > >> > > > > >> > > > > > > > overloaded `String` based subscription may
> be
> > > >> more
> > > >> > > > > >> vulnerable
> > > >> > > > > >> > to
> > > >> > > > > >> > > > > various
> > > >> > > > > >> > > > > > > > invalid regexes.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > That could work. I have to look at the
> differences
> > > >> > between
> > > >> > > > > the
> > > >> > > > > >> > two
> > > >> > > > > >> > > > > > > engines to better understand the potential
> issues.
> > > >> My
> > > >> > > > > >> > > understanding is
> > > >> > > > > >> > > > > > > that would work for all the basic regular
> > > >> > expressions. The
> > > >> > > > > >> > > differences
> > > >> > > > > >> > > > > > > between the two are mainly about the various
> > > >> character
> > > >> > > > > >> classes. I
> > > >> > > > > >> > > > > > > wonder what other people think about this.
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > Best,
> > > >> > > > > >> > > > > > > David
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > > > > On Tue, Jul 12, 2022 at 11:28 PM Guozhang
> Wang <
> > > >> > > > > >> > wangg...@gmail.com
> > > >> > > > > >> > > >
> > > >> > > > > >> > > > > wrote:
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > Thanks David! I think on the high level
> there are
> > > >> > two meta
> > > >> > > > > >> > > points we
> > > >> > > > > >> > > > > need
> > > >> > > > > >> > > > > > > > to concretize a bit more:
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > 1) the migration path, especially the last
> step
> > > >> when
> > > >> > > > > clients
> > > >> > > > > >> > > flip the
> > > >> > > > > >> > > > > > > flag
> > > >> > > > > >> > > > > > > > to enable the new protocol, in which we
> would
> > > >> have a
> > > >> > > > > window
> > > >> > > > > >> > where
> > > >> > > > > >> > > > > both
> > > >> > > > > >> > > > > > > new
> > > >> > > > > >> > > > > > > > protocols / rpcs and old protocols / rpcs
> are
> > > >> used
> > > >> > by
> > > >> > > > > >> members
> > > >> > > > > >> > of
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > same
> > > >> > > > > >> > > > > > > > group. How the coordinator could "mimic"
> the old
> > > >> > behavior
> > > >> > > > > >> while
> > > >> > > > > >> > > > > using the
> > > >> > > > > >> > > > > > > > new protocol is something we need to
> present
> > > >> about.
> > > >> > > > > >> > > > > > > > 2) the usage of topic ids. So far as
> KIP-516 the
> > > >> > topic ids
> > > >> > > > > >> are
> > > >> > > > > >> > > only
> > > >> > > > > >> > > > > used
> > > >> > > > > >> > > > > > > as
> > > >> > > > > >> > > > > > > > part of RPCs and admin client, but they
> are not
> > > >> > exposed
> > > >> > > > > via
> > > >> > > > > >> any
> > > >> > > > > >> > > > > public
> > > >> > > > > >> > > > > > > APIs
> > > >> > > > > >> > > > > > > > to consumers yet. I think the question is,
> first
> > > >> > should we
> > > >> > > > > >> let
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > consumer
> > > >> > > > > >> > > > > > > > client to be maintaining the names -> ids
> mapping
> > > >> > itself
> > > >> > > > > to
> > > >> > > > > >> > fully
> > > >> > > > > >> > > > > > > leverage
> > > >> > > > > >> > > > > > > > on all the augmented existing RPCs and the
> new
> > > >> RPCs
> > > >> > with
> > > >> > > > > the
> > > >> > > > > >> > > topic
> > > >> > > > > >> > > > > ids;
> > > >> > > > > >> > > > > > > and
> > > >> > > > > >> > > > > > > > secondly, should we ever consider exposing
> the
> > > >> > topic ids
> > > >> > > > > in
> > > >> > > > > >> the
> > > >> > > > > >> > > > > consumer
> > > >> > > > > >> > > > > > > > public APIs as well (both
> subscribe/assign, as
> > > >> well
> > > >> > as in
> > > >> > > > > >> the
> > > >> > > > > >> > > > > rebalance
> > > >> > > > > >> > > > > > > > listener for cases like topic
> > > >> > deletion-and-recreation).
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > I'm agreeing with David on all other minor
> > > >> questions
> > > >> > > > > except
> > > >> > > > > >> for
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > `subscribe(Pattern)` question: personally
> I think
> > > >> > it's not
> > > >> > > > > >> > > necessary
> > > >> > > > > >> > > > > to
> > > >> > > > > >> > > > > > > > deprecate the subscribe API with Pattern,
> but
> > > >> > instead we
> > > >> > > > > >> still
> > > >> > > > > >> > > use
> > > >> > > > > >> > > > > > > Pattern
> > > >> > > > > >> > > > > > > > while just documenting that our
> subscription may
> > > >> be
> > > >> > > > > >> rejected by
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > server.
> > > >> > > > > >> > > > > > > > Since the incompatible case is a very rare
> > > >> scenario
> > > >> > I felt
> > > >> > > > > >> > using
> > > >> > > > > >> > > an
> > > >> > > > > >> > > > > > > > overloaded `String` based subscription may
> be
> > > >> more
> > > >> > > > > >> vulnerable
> > > >> > > > > >> > to
> > > >> > > > > >> > > > > various
> > > >> > > > > >> > > > > > > > invalid regexes.
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > Guozhang
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > On Tue, Jul 12, 2022 at 5:23 AM David Jacot
> > > >> > > > > >> > > > > <dja...@confluent.io.invalid
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > wrote:
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > > Hi Ismael,
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Thanks for your feedback. Let me answer
> your
> > > >> > questions
> > > >> > > > > >> > inline.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 1. I think it's premature to talk about
> > > >> target
> > > >> > > > > versions
> > > >> > > > > >> for
> > > >> > > > > >> > > > > > > deprecation
> > > >> > > > > >> > > > > > > > > and
> > > >> > > > > >> > > > > > > > > > removal of the existing group protocol.
> > > >> Unlike
> > > >> > KRaft,
> > > >> > > > > >> this
> > > >> > > > > >> > > > > affects a
> > > >> > > > > >> > > > > > > core
> > > >> > > > > >> > > > > > > > > > client protocol and hence
> deprecation/removal
> > > >> > will be
> > > >> > > > > >> > heavily
> > > >> > > > > >> > > > > > > dependent
> > > >> > > > > >> > > > > > > > > on
> > > >> > > > > >> > > > > > > > > > how quickly applications migrate to
> the new
> > > >> > protocol.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > That makes sense. I will remove it.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 2. The KIP says we intend to release
> this in
> > > >> > 4.x, but
> > > >> > > > > it
> > > >> > > > > >> > > wasn't
> > > >> > > > > >> > > > > made
> > > >> > > > > >> > > > > > > > > clear
> > > >> > > > > >> > > > > > > > > > why. If we added that as a way to
> estimate
> > > >> when
> > > >> > we'd
> > > >> > > > > >> > > deprecate
> > > >> > > > > >> > > > > and
> > > >> > > > > >> > > > > > > remove
> > > >> > > > > >> > > > > > > > > > the group protocol, I also suggest
> removing
> > > >> > this part.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Let me explain my reasoning. As
> explained, I
> > > >> plan
> > > >> > to
> > > >> > > > > >> rewrite
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > group
> > > >> > > > > >> > > > > > > > > coordinator in Java while we implement
> the new
> > > >> > protocol.
> > > >> > > > > >> This
> > > >> > > > > >> > > means
> > > >> > > > > >> > > > > > > > > that the internals will be slightly
> different
> > > >> > (e.g.
> > > >> > > > > >> threading
> > > >> > > > > >> > > > > model).
> > > >> > > > > >> > > > > > > > > Therefore, I wanted to tighten the
> switch from
> > > >> > the old
> > > >> > > > > >> group
> > > >> > > > > >> > > > > > > > > coordinator to the new group coordinator
> to a
> > > >> > major
> > > >> > > > > >> release.
> > > >> > > > > >> > > The
> > > >> > > > > >> > > > > > > > > alternative would be to use a flag to do
> the
> > > >> > switch
> > > >> > > > > >> instead
> > > >> > > > > >> > of
> > > >> > > > > >> > > > > relying
> > > >> > > > > >> > > > > > > > > on the software upgrade.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 3. We need to flesh out the details of
> the
> > > >> > migration
> > > >> > > > > >> story.
> > > >> > > > > >> > > It
> > > >> > > > > >> > > > > sounds
> > > >> > > > > >> > > > > > > > > like
> > > >> > > > > >> > > > > > > > > > we're saying we will support online
> > > >> migrations.
> > > >> > Is
> > > >> > > > > that
> > > >> > > > > >> > > correct?
> > > >> > > > > >> > > > > We
> > > >> > > > > >> > > > > > > > > should
> > > >> > > > > >> > > > > > > > > > explain this in detail. It could also
> be done
> > > >> > as a
> > > >> > > > > >> separate
> > > >> > > > > >> > > KIP,
> > > >> > > > > >> > > > > if
> > > >> > > > > >> > > > > > > it's
> > > >> > > > > >> > > > > > > > > > easier.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Yes, we will support online migrations
> for the
> > > >> > group.
> > > >> > > > > That
> > > >> > > > > >> > > means
> > > >> > > > > >> > > > > that
> > > >> > > > > >> > > > > > > > > a group using the old protocol will be
> able to
> > > >> > switch to
> > > >> > > > > >> the
> > > >> > > > > >> > > new
> > > >> > > > > >> > > > > > > > > protocol.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Let me briefly explain how that will work
> > > >> though.
> > > >> > It is
> > > >> > > > > >> > > basically a
> > > >> > > > > >> > > > > > > > > four step process:
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > 1. The cluster must be upgraded or
> rolled to a
> > > >> > software
> > > >> > > > > >> > > supporting
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > new group coordinator. Both the old and
> the new
> > > >> > > > > >> coordinator
> > > >> > > > > >> > > will
> > > >> > > > > >> > > > > > > > > support the old protocol and rely on the
> same
> > > >> > persisted
> > > >> > > > > >> > > metadata so
> > > >> > > > > >> > > > > > > > > they can work together. This point is an
> > > >> offline
> > > >> > > > > >> migration.
> > > >> > > > > >> > We
> > > >> > > > > >> > > > > cannot
> > > >> > > > > >> > > > > > > > > do this one live because it would require
> > > >> > shutting down
> > > >> > > > > >> the
> > > >> > > > > >> > > current
> > > >> > > > > >> > > > > > > > > coordinator and starting up the new one
> and
> > > >> that
> > > >> > would
> > > >> > > > > >> cause
> > > >> > > > > >> > > > > > > > > unavailabilities.
> > > >> > > > > >> > > > > > > > > 2. The cluster's metadata version/IBP
> must be
> > > >> > upgraded
> > > >> > > > > to
> > > >> > > > > >> X
> > > >> > > > > >> > in
> > > >> > > > > >> > > > > order
> > > >> > > > > >> > > > > > > > > to enable the new protocol. This cannot
> be done
> > > >> > before
> > > >> > > > > 1)
> > > >> > > > > >> is
> > > >> > > > > >> > > > > > > > > terminated because the old coordinator
> doesn't
> > > >> > support
> > > >> > > > > the
> > > >> > > > > >> > new
> > > >> > > > > >> > > > > > > > > protocol.
> > > >> > > > > >> > > > > > > > > 3. The consumers must be upgraded to a
> version
> > > >> > > > > supporting
> > > >> > > > > >> the
> > > >> > > > > >> > > > > online
> > > >> > > > > >> > > > > > > > > migration (must have KIP-792). If the
> consumer
> > > >> is
> > > >> > > > > already
> > > >> > > > > >> > > there.
> > > >> > > > > >> > > > > > > > > Nothing must be done at this point.
> > > >> > > > > >> > > > > > > > > 4. The consumers must be rolled with the
> > > >> feature
> > > >> > flag
> > > >> > > > > >> turned
> > > >> > > > > >> > > on.
> > > >> > > > > >> > > > > The
> > > >> > > > > >> > > > > > > > > consumer group is automatically
> converted when
> > > >> > the first
> > > >> > > > > >> > > consumer
> > > >> > > > > >> > > > > > > > > using the new protocol joins the group.
> While
> > > >> the
> > > >> > > > > members
> > > >> > > > > >> > > using the
> > > >> > > > > >> > > > > > > > > old protocol are being upgraded, the old
> > > >> protocol
> > > >> > is
> > > >> > > > > >> proxied
> > > >> > > > > >> > > into
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > new one.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Let me clarify all of this in the KIP.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 4. I am happy that we are pushing the
> pattern
> > > >> > > > > >> subscriptions
> > > >> > > > > >> > > to
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > server,
> > > >> > > > > >> > > > > > > > > > but it seems like there could be some
> tricky
> > > >> > > > > >> compatibility
> > > >> > > > > >> > > > > issues.
> > > >> > > > > >> > > > > > > Will
> > > >> > > > > >> > > > > > > > > we
> > > >> > > > > >> > > > > > > > > > have a mechanism for users to detect
> that
> > > >> they
> > > >> > need to
> > > >> > > > > >> > update
> > > >> > > > > >> > > > > their
> > > >> > > > > >> > > > > > > regex
> > > >> > > > > >> > > > > > > > > > before switching to the new protocol?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > I think that I am a bit more optimistic
> than
> > > >> you
> > > >> > on this
> > > >> > > > > >> > > point. I
> > > >> > > > > >> > > > > > > > > believe that the majority of the cases
> are
> > > >> simple
> > > >> > > > > regexes
> > > >> > > > > >> > which
> > > >> > > > > >> > > > > should
> > > >> > > > > >> > > > > > > > > work with the new engine. The
> coordinator will
> > > >> > verify
> > > >> > > > > the
> > > >> > > > > >> > regex
> > > >> > > > > >> > > > > anyway
> > > >> > > > > >> > > > > > > > > and reject the consumer if the regex is
> not
> > > >> valid.
> > > >> > > > > Coming
> > > >> > > > > >> > back
> > > >> > > > > >> > > to
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > migration path, in the worst case, the
> first
> > > >> > upgraded
> > > >> > > > > >> > consumer
> > > >> > > > > >> > > > > joining
> > > >> > > > > >> > > > > > > > > the group will be rejected. This should
> be used
> > > >> > as the
> > > >> > > > > >> last
> > > >> > > > > >> > > > > defence, I
> > > >> > > > > >> > > > > > > > > would say.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > One way for customers to validate their
> regex
> > > >> > before
> > > >> > > > > >> > upgrading
> > > >> > > > > >> > > > > their
> > > >> > > > > >> > > > > > > > > prod would be to test them with another
> group.
> > > >> For
> > > >> > > > > >> instance,
> > > >> > > > > >> > > that
> > > >> > > > > >> > > > > > > > > could be done in a pre-prod environment.
> > > >> Another
> > > >> > way
> > > >> > > > > >> would be
> > > >> > > > > >> > > to
> > > >> > > > > >> > > > > > > > > extend the consumer-group tool to
> provide a
> > > >> regex
> > > >> > > > > >> validation
> > > >> > > > > >> > > > > > > > > mechanism. Would this be enough in your
> > > >> opinion?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 5. Related to the last question, will
> the
> > > >> Java
> > > >> > client
> > > >> > > > > >> allow
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > users to
> > > >> > > > > >> > > > > > > > > > stick with the current regex engine for
> > > >> > compatibility
> > > >> > > > > >> > > reasons?
> > > >> > > > > >> > > > > For
> > > >> > > > > >> > > > > > > > > example,
> > > >> > > > > >> > > > > > > > > > it may be handy to keep using client
> based
> > > >> > regex at
> > > >> > > > > >> first
> > > >> > > > > >> > to
> > > >> > > > > >> > > keep
> > > >> > > > > >> > > > > > > > > > migrations simple and then migrate to
> server
> > > >> > based
> > > >> > > > > >> regexes
> > > >> > > > > >> > > as a
> > > >> > > > > >> > > > > > > second
> > > >> > > > > >> > > > > > > > > step.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > I understand your point but I am
> concerned that
> > > >> > this
> > > >> > > > > would
> > > >> > > > > >> > > allow
> > > >> > > > > >> > > > > users
> > > >> > > > > >> > > > > > > > > to actually stay in this mode. That
> would go
> > > >> > against our
> > > >> > > > > >> goal
> > > >> > > > > >> > > of
> > > >> > > > > >> > > > > > > > > simplifying the client because we would
> have to
> > > >> > continue
> > > >> > > > > >> > > monitoring
> > > >> > > > > >> > > > > > > > > the metadata on the client side. I would
> rather
> > > >> > not do
> > > >> > > > > >> this.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 6. When we say that the group
> coordinator
> > > >> will
> > > >> > be
> > > >> > > > > >> > > responsible for
> > > >> > > > > >> > > > > > > storing
> > > >> > > > > >> > > > > > > > > > the configurations and that the
> > > >> configurations
> > > >> > will be
> > > >> > > > > >> > > deleted
> > > >> > > > > >> > > > > when
> > > >> > > > > >> > > > > > > the
> > > >> > > > > >> > > > > > > > > > group is deleted. Will a transition to
> DEAD
> > > >> > trigger
> > > >> > > > > >> > deletion
> > > >> > > > > >> > > of
> > > >> > > > > >> > > > > > > > > > configurations?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > That's right. The configurations will be
> > > >> deleted
> > > >> > when
> > > >> > > > > the
> > > >> > > > > >> > > group is
> > > >> > > > > >> > > > > > > > > deleted. They go together.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 7. Will the choice to store the
> configs in
> > > >> the
> > > >> > group
> > > >> > > > > >> > > coordinator
> > > >> > > > > >> > > > > > > make it
> > > >> > > > > >> > > > > > > > > > harder to list all cluster configs and
> their
> > > >> > values?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > I don't think so. The group
> configurations are
> > > >> > overrides
> > > >> > > > > >> of
> > > >> > > > > >> > > cluster
> > > >> > > > > >> > > > > > > > > configs. If you want to know all the
> overrides
> > > >> > though,
> > > >> > > > > you
> > > >> > > > > >> > > would
> > > >> > > > > >> > > > > have
> > > >> > > > > >> > > > > > > > > to ask all the group coordinators. You
> cannot
> > > >> > rely on
> > > >> > > > > the
> > > >> > > > > >> > > metadata
> > > >> > > > > >> > > > > log
> > > >> > > > > >> > > > > > > > > for instance.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 8. How would someone configure a group
> before
> > > >> > starting
> > > >> > > > > >> the
> > > >> > > > > >> > > > > consumers?
> > > >> > > > > >> > > > > > > > > Have
> > > >> > > > > >> > > > > > > > > > we considered allowing the explicit
> creation
> > > >> of
> > > >> > > > > groups?
> > > >> > > > > >> > > > > > > Alternatively,
> > > >> > > > > >> > > > > > > > > the
> > > >> > > > > >> > > > > > > > > > configs could be decoupled from the
> group
> > > >> > lifecycle.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Yes. The group will be automatically
> created in
> > > >> > this
> > > >> > > > > case.
> > > >> > > > > >> > > However,
> > > >> > > > > >> > > > > > > > > the configs will be lost after the
> retention
> > > >> > period of
> > > >> > > > > the
> > > >> > > > > >> > > group
> > > >> > > > > >> > > > > > > > > passes.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 9. Will the Consumer.subscribe method
> for the
> > > >> > Java
> > > >> > > > > >> client
> > > >> > > > > >> > > still
> > > >> > > > > >> > > > > take
> > > >> > > > > >> > > > > > > a
> > > >> > > > > >> > > > > > > > > > `java.util.regex.Pattern` of do we
> have to
> > > >> > introduce
> > > >> > > > > an
> > > >> > > > > >> > > overload?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > That's a very group question. I forgot
> about
> > > >> that
> > > >> > one.
> > > >> > > > > As
> > > >> > > > > >> the
> > > >> > > > > >> > > > > > > > > `java.util.regex.Pattern` is not fully
> > > >> compatible
> > > >> > with
> > > >> > > > > the
> > > >> > > > > >> > > engine
> > > >> > > > > >> > > > > that
> > > >> > > > > >> > > > > > > > > we plan to use, it might be better to
> deprecate
> > > >> > it and
> > > >> > > > > >> use an
> > > >> > > > > >> > > > > overload
> > > >> > > > > >> > > > > > > > > which takes a string. We would rely on
> the
> > > >> server
> > > >> > side
> > > >> > > > > >> > > validation.
> > > >> > > > > >> > > > > > > > > During the migration, I think that we
> could
> > > >> still
> > > >> > try to
> > > >> > > > > >> > > toString
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > regex and use it. That should work, I
> think, in
> > > >> > the
> > > >> > > > > >> majority
> > > >> > > > > >> > > of the
> > > >> > > > > >> > > > > > > > > cases.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 10. I agree with Justine that we
> should be
> > > >> > clearer
> > > >> > > > > about
> > > >> > > > > >> > the
> > > >> > > > > >> > > > > reason
> > > >> > > > > >> > > > > > > to
> > > >> > > > > >> > > > > > > > > > switch to IBP/metadata.version from the
> > > >> feature
> > > >> > flag.
> > > >> > > > > >> Maybe
> > > >> > > > > >> > > we
> > > >> > > > > >> > > > > mean
> > > >> > > > > >> > > > > > > that
> > > >> > > > > >> > > > > > > > > we
> > > >> > > > > >> > > > > > > > > > can switch the default for the feature
> flag
> > > >> to
> > > >> > true
> > > >> > > > > >> based
> > > >> > > > > >> > on
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > > > metadata.version once we want to make
> it the
> > > >> > default.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > My plan was to use that feature flag
> mainly
> > > >> > during the
> > > >> > > > > >> > > development
> > > >> > > > > >> > > > > > > > > phase. I should not have mentioned it, I
> think,
> > > >> > because
> > > >> > > > > we
> > > >> > > > > >> > > could
> > > >> > > > > >> > > > > use
> > > >> > > > > >> > > > > > > > > an internal config for it.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 11. Some of the protocol APIs don't
> mention
> > > >> the
> > > >> > > > > required
> > > >> > > > > >> > > ACLs, it
> > > >> > > > > >> > > > > > > would
> > > >> > > > > >> > > > > > > > > be
> > > >> > > > > >> > > > > > > > > > good to add that for consistency.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Noted.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 12. It is a bit odd that
> > > >> ConsumerGroupHeartbeat
> > > >> > > > > requires
> > > >> > > > > >> > > "Read
> > > >> > > > > >> > > > > Group"
> > > >> > > > > >> > > > > > > > > even
> > > >> > > > > >> > > > > > > > > > though it seems to do more than
> reading.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > I agree. This is how the current
> protocol works
> > > >> > though.
> > > >> > > > > We
> > > >> > > > > >> > only
> > > >> > > > > >> > > > > > > > > require "Read Group" to join a group. We
> could
> > > >> > consider
> > > >> > > > > >> > > changing
> > > >> > > > > >> > > > > this
> > > >> > > > > >> > > > > > > > > but I am not sure that it is worth it.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 13. How is topic recreation handled by
> the
> > > >> > consumer
> > > >> > > > > with
> > > >> > > > > >> > the
> > > >> > > > > >> > > new
> > > >> > > > > >> > > > > > > group
> > > >> > > > > >> > > > > > > > > > protocol? It would be good to have a
> section
> > > >> on
> > > >> > this.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Noted. From a protocol perspective, the
> new
> > > >> topic
> > > >> > will
> > > >> > > > > >> have a
> > > >> > > > > >> > > new
> > > >> > > > > >> > > > > > > > > topic id so it will treat it like a
> topic with
> > > >> a
> > > >> > > > > different
> > > >> > > > > >> > > name.
> > > >> > > > > >> > > > > The
> > > >> > > > > >> > > > > > > > > only issue is that the fetch/commit
> offsets
> > > >> APIs
> > > >> > do not
> > > >> > > > > >> > support
> > > >> > > > > >> > > > > topic
> > > >> > > > > >> > > > > > > > > IDs so the consumer would reuse the
> offsets
> > > >> based
> > > >> > on the
> > > >> > > > > >> > same.
> > > >> > > > > >> > > I
> > > >> > > > > >> > > > > think
> > > >> > > > > >> > > > > > > > > that we should update those APIs as well
> in
> > > >> order
> > > >> > to be
> > > >> > > > > >> > > consistent
> > > >> > > > > >> > > > > end
> > > >> > > > > >> > > > > > > > > to end. That would strengthen the
> semantics of
> > > >> the
> > > >> > > > > >> consumer.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 14. The KIP mentions we will write the
> new
> > > >> > coordinator
> > > >> > > > > >> in
> > > >> > > > > >> > > Java.
> > > >> > > > > >> > > > > Even
> > > >> > > > > >> > > > > > > > > though
> > > >> > > > > >> > > > > > > > > > this is an implementation detail, do
> we plan
> > > >> to
> > > >> > have a
> > > >> > > > > >> new
> > > >> > > > > >> > > gradle
> > > >> > > > > >> > > > > > > module
> > > >> > > > > >> > > > > > > > > > for it?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Yes.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 15. Do we have a scalability goal when
> it
> > > >> comes
> > > >> > to how
> > > >> > > > > >> many
> > > >> > > > > >> > > > > members
> > > >> > > > > >> > > > > > > the
> > > >> > > > > >> > > > > > > > > new
> > > >> > > > > >> > > > > > > > > > group protocol can support?
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > We don't have numbers at the moment. The
> > > >> protocol
> > > >> > should
> > > >> > > > > >> > > support
> > > >> > > > > >> > > > > 1000s
> > > >> > > > > >> > > > > > > > > of members per group. We will measure
> this when
> > > >> > we have
> > > >> > > > > a
> > > >> > > > > >> > first
> > > >> > > > > >> > > > > > > > > implementation. Note that we might have
> other
> > > >> > > > > bottlenecks
> > > >> > > > > >> > down
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > > road (e.g. offset commits).
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > > 16. Did we consider having
> SubscribedTopidIds
> > > >> > instead
> > > >> > > > > of
> > > >> > > > > >> > > > > > > > > > SubscribedTopicNames in
> > > >> > ConsumerGroupHeartbeatRequest?
> > > >> > > > > >> Is
> > > >> > > > > >> > the
> > > >> > > > > >> > > > > idea
> > > >> > > > > >> > > > > > > that
> > > >> > > > > >> > > > > > > > > > since we have to resolve the regex on
> the
> > > >> > server, we
> > > >> > > > > >> can do
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > same
> > > >> > > > > >> > > > > > > for
> > > >> > > > > >> > > > > > > > > > the topic name? The difference is that
> > > >> sending
> > > >> > the
> > > >> > > > > >> regex is
> > > >> > > > > >> > > more
> > > >> > > > > >> > > > > > > > > efficient
> > > >> > > > > >> > > > > > > > > > whereas sending the topic names is less
> > > >> > efficient.
> > > >> > > > > >> > > Furthermore,
> > > >> > > > > >> > > > > > > delete
> > > >> > > > > >> > > > > > > > > and
> > > >> > > > > >> > > > > > > > > > recreation is easier to handle if we
> have
> > > >> topic
> > > >> > ids.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > The idea was to consolidate the metadata
> lookup
> > > >> > on the
> > > >> > > > > >> server
> > > >> > > > > >> > > for
> > > >> > > > > >> > > > > both
> > > >> > > > > >> > > > > > > > > paths but I do agree with your point. As
> a
> > > >> second
> > > >> > > > > though,
> > > >> > > > > >> > using
> > > >> > > > > >> > > > > topic
> > > >> > > > > >> > > > > > > > > ids may be better here for the delete and
> > > >> > recreation
> > > >> > > > > case.
> > > >> > > > > >> > > Also, I
> > > >> > > > > >> > > > > > > > > suppose that we may allow users to
> subscribe
> > > >> with
> > > >> > topic
> > > >> > > > > >> ids
> > > >> > > > > >> > in
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > > future because that is the only way to be
> > > >> really
> > > >> > robust
> > > >> > > > > to
> > > >> > > > > >> > > topic
> > > >> > > > > >> > > > > > > > > re-creation.
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > Best,
> > > >> > > > > >> > > > > > > > > David
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > > > On Tue, Jul 12, 2022 at 1:38 PM David
> Jacot <
> > > >> > > > > >> > > dja...@confluent.io>
> > > >> > > > > >> > > > > > > wrote:
> > > >> > > > > >> > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > Hi Justine,
> > > >> > > > > >> > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > Thanks for your comments. Please find
> my
> > > >> answers
> > > >> > > > > below.
> > > >> > > > > >> > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > - Yes, the new protocol relies on
> topic IDs
> > > >> > with the
> > > >> > > > > >> > > exception
> > > >> > > > > >> > > > > of the
> > > >> > > > > >> > > > > > > > > > topic names based in the
> > > >> > > > > ConsumerGroupHeartbeatRequest.
> > > >> > > > > >> I
> > > >> > > > > >> > am
> > > >> > > > > >> > > not
> > > >> > > > > >> > > > > sure
> > > >> > > > > >> > > > > > > > > > if using topic names is the right call
> here.
> > > >> I
> > > >> > need to
> > > >> > > > > >> > think
> > > >> > > > > >> > > > > about it
> > > >> > > > > >> > > > > > > > > > a little more. Obviously, the KIP does
> not
> > > >> > change the
> > > >> > > > > >> > > > > fetch/commit
> > > >> > > > > >> > > > > > > > > > offsets RPCs to use topic IDs. This
> may be
> > > >> > something
> > > >> > > > > >> that
> > > >> > > > > >> > we
> > > >> > > > > >> > > > > should
> > > >> > > > > >> > > > > > > > > > include though as it would give better
> > > >> overall
> > > >> > > > > >> guarantee in
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > > > producer.
> > > >> > > > > >> > > > > > > > > > - You're right. I think that I should
> not
> > > >> have
> > > >> > > > > mentioned
> > > >> > > > > >> > this
> > > >> > > > > >> > > > > flag at
> > > >> > > > > >> > > > > > > > > > all. I will remove it. We can use an
> internal
> > > >> > > > > >> configuration
> > > >> > > > > >> > > while
> > > >> > > > > >> > > > > > > > > > developing the feature.
> > > >> > > > > >> > > > > > > > > > - Both cluster types will be
> supported. The
> > > >> > change is
> > > >> > > > > >> > > > > orthogonal. The
> > > >> > > > > >> > > > > > > > > > only requirement is that the cluster
> uses
> > > >> topic
> > > >> > IDs.
> > > >> > > > > >> > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > Best,
> > > >> > > > > >> > > > > > > > > > David
> > > >> > > > > >> > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > On Mon, Jul 11, 2022 at 9:53 PM
> Guozhang
> > > >> Wang <
> > > >> > > > > >> > > > > wangg...@gmail.com>
> > > >> > > > > >> > > > > > > > > wrote:
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Hi Ismael,
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Thanks for the feedback. Here are
> some
> > > >> replies
> > > >> > > > > inlined
> > > >> > > > > >> > > below:
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > On Sat, Jul 9, 2022 at 2:53 AM
> Ismael Juma
> > > >> <
> > > >> > > > > >> > > ism...@juma.me.uk>
> > > >> > > > > >> > > > > > > wrote:
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > Thanks for the KIP. This has the
> > > >> potential
> > > >> > to be a
> > > >> > > > > >> > great
> > > >> > > > > >> > > > > > > > > improvement. A few
> > > >> > > > > >> > > > > > > > > > > > initial questions/comments:
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 1. I think it's premature to talk
> about
> > > >> > target
> > > >> > > > > >> versions
> > > >> > > > > >> > > for
> > > >> > > > > >> > > > > > > > > deprecation and
> > > >> > > > > >> > > > > > > > > > > > removal of the existing group
> protocol.
> > > >> > Unlike
> > > >> > > > > >> KRaft,
> > > >> > > > > >> > > this
> > > >> > > > > >> > > > > > > affects a
> > > >> > > > > >> > > > > > > > > core
> > > >> > > > > >> > > > > > > > > > > > client protocol and hence
> > > >> > deprecation/removal will
> > > >> > > > > >> be
> > > >> > > > > >> > > heavily
> > > >> > > > > >> > > > > > > > > dependent on
> > > >> > > > > >> > > > > > > > > > > > how quickly applications migrate
> to the
> > > >> new
> > > >> > > > > >> protocol.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Yeah I agree with you. I think we can
> > > >> remove
> > > >> > the
> > > >> > > > > >> proposed
> > > >> > > > > >> > > > > timeline
> > > >> > > > > >> > > > > > > in
> > > >> > > > > >> > > > > > > > > the
> > > >> > > > > >> > > > > > > > > > > `Compatibility, Deprecation, and
> Migration
> > > >> > Plan` and
> > > >> > > > > >> > > instead
> > > >> > > > > >> > > > > just
> > > >> > > > > >> > > > > > > state
> > > >> > > > > >> > > > > > > > > > > that we will decide in the future
> about
> > > >> when
> > > >> > we
> > > >> > > > > would
> > > >> > > > > >> > > > > deprecate old
> > > >> > > > > >> > > > > > > > > > > protocol and behaviors.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 2. The KIP says we intend to
> release this
> > > >> > in 4.x,
> > > >> > > > > >> but
> > > >> > > > > >> > it
> > > >> > > > > >> > > > > wasn't
> > > >> > > > > >> > > > > > > made
> > > >> > > > > >> > > > > > > > > clear
> > > >> > > > > >> > > > > > > > > > > > why. If we added that as a way to
> > > >> estimate
> > > >> > when
> > > >> > > > > we'd
> > > >> > > > > >> > > > > deprecate
> > > >> > > > > >> > > > > > > and
> > > >> > > > > >> > > > > > > > > remove
> > > >> > > > > >> > > > > > > > > > > > the group protocol, I also suggest
> > > >> removing
> > > >> > this
> > > >> > > > > >> part.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > I think that's not specifically
> related to
> > > >> the
> > > >> > > > > >> > > > > deprecation/removal
> > > >> > > > > >> > > > > > > > > timeline
> > > >> > > > > >> > > > > > > > > > > plan, but it's more for client
> upgrades.
> > > >> I.e.
> > > >> > the
> > > >> > > > > >> > > broker-side
> > > >> > > > > >> > > > > > > > > > > implementation may be done first,
> and then
> > > >> the
> > > >> > > > > client
> > > >> > > > > >> > side,
> > > >> > > > > >> > > > > and we
> > > >> > > > > >> > > > > > > > > would
> > > >> > > > > >> > > > > > > > > > > only mark it as "released" by the
> time
> > > >> clients
> > > >> > > > > >> > > implementations
> > > >> > > > > >> > > > > are
> > > >> > > > > >> > > > > > > > > done. At
> > > >> > > > > >> > > > > > > > > > > that time, to enable the feature the
> > > >> clients
> > > >> > need to
> > > >> > > > > >> > first
> > > >> > > > > >> > > > > swap-in
> > > >> > > > > >> > > > > > > the
> > > >> > > > > >> > > > > > > > > > > bytecode with a rolling bounce and
> then set
> > > >> > the flag
> > > >> > > > > >> > with a
> > > >> > > > > >> > > > > second
> > > >> > > > > >> > > > > > > > > rolling
> > > >> > > > > >> > > > > > > > > > > bounce, and hence we feel it's
> better to be
> > > >> > released
> > > >> > > > > >> in a
> > > >> > > > > >> > > major
> > > >> > > > > >> > > > > > > > > version.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 3. We need to flesh out the
> details of
> > > >> the
> > > >> > > > > migration
> > > >> > > > > >> > > story.
> > > >> > > > > >> > > > > It
> > > >> > > > > >> > > > > > > > > sounds like
> > > >> > > > > >> > > > > > > > > > > > we're saying we will support online
> > > >> > migrations. Is
> > > >> > > > > >> that
> > > >> > > > > >> > > > > correct?
> > > >> > > > > >> > > > > > > We
> > > >> > > > > >> > > > > > > > > should
> > > >> > > > > >> > > > > > > > > > > > explain this in detail. It could
> also be
> > > >> > done as a
> > > >> > > > > >> > > separate
> > > >> > > > > >> > > > > KIP,
> > > >> > > > > >> > > > > > > if
> > > >> > > > > >> > > > > > > > > it's
> > > >> > > > > >> > > > > > > > > > > > easier.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Yes I think that's the part we can
> be more
> > > >> > concrete
> > > >> > > > > >> about
> > > >> > > > > >> > > for
> > > >> > > > > >> > > > > sure
> > > >> > > > > >> > > > > > > (and
> > > >> > > > > >> > > > > > > > > > > this is related to your question 2)
> above).
> > > >> > We will
> > > >> > > > > >> work
> > > >> > > > > >> > on
> > > >> > > > > >> > > > > making
> > > >> > > > > >> > > > > > > it
> > > >> > > > > >> > > > > > > > > more
> > > >> > > > > >> > > > > > > > > > > explicit in parallel as we solicit
> more
> > > >> > feedback.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 4. I am happy that we are pushing
> the
> > > >> > pattern
> > > >> > > > > >> > > subscriptions
> > > >> > > > > >> > > > > to
> > > >> > > > > >> > > > > > > the
> > > >> > > > > >> > > > > > > > > server,
> > > >> > > > > >> > > > > > > > > > > > but it seems like there could be
> some
> > > >> tricky
> > > >> > > > > >> > > compatibility
> > > >> > > > > >> > > > > > > issues.
> > > >> > > > > >> > > > > > > > > Will we
> > > >> > > > > >> > > > > > > > > > > > have a mechanism for users to
> detect that
> > > >> > they
> > > >> > > > > need
> > > >> > > > > >> to
> > > >> > > > > >> > > update
> > > >> > > > > >> > > > > > > their
> > > >> > > > > >> > > > > > > > > regex
> > > >> > > > > >> > > > > > > > > > > > before switching to the new
> protocol?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Yes I think we need some tooling for
> > > >> non-java
> > > >> > client
> > > >> > > > > >> > users
> > > >> > > > > >> > > to
> > > >> > > > > >> > > > > sort
> > > >> > > > > >> > > > > > > of
> > > >> > > > > >> > > > > > > > > > > "dry-run" the client before
> switching to
> > > >> the
> > > >> > new
> > > >> > > > > >> > protocol.
> > > >> > > > > >> > > I
> > > >> > > > > >> > > > > do not
> > > >> > > > > >> > > > > > > > > have a
> > > >> > > > > >> > > > > > > > > > > specific idea on top of my head
> though,
> > > >> maybe
> > > >> > others
> > > >> > > > > >> like
> > > >> > > > > >> > > @Matt
> > > >> > > > > >> > > > > > > > > Howlett can
> > > >> > > > > >> > > > > > > > > > > chime-in here?
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 5. Related to the last question,
> will the
> > > >> > Java
> > > >> > > > > >> client
> > > >> > > > > >> > > allow
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > users to
> > > >> > > > > >> > > > > > > > > > > > stick with the current regex
> engine for
> > > >> > > > > >> compatibility
> > > >> > > > > >> > > > > reasons?
> > > >> > > > > >> > > > > > > For
> > > >> > > > > >> > > > > > > > > example,
> > > >> > > > > >> > > > > > > > > > > > it may be handy to keep using
> client
> > > >> based
> > > >> > regex
> > > >> > > > > at
> > > >> > > > > >> > > first to
> > > >> > > > > >> > > > > keep
> > > >> > > > > >> > > > > > > > > > > > migrations simple and then migrate
> to
> > > >> > server based
> > > >> > > > > >> > > regexes
> > > >> > > > > >> > > > > as a
> > > >> > > > > >> > > > > > > > > second
> > > >> > > > > >> > > > > > > > > > > > step.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Honestly I have not thought about
> that for
> > > >> > java
> > > >> > > > > >> clients,
> > > >> > > > > >> > > and
> > > >> > > > > >> > > > > we can
> > > >> > > > > >> > > > > > > > > discuss
> > > >> > > > > >> > > > > > > > > > > that. What kind of compatibility
> issues do
> > > >> > you have
> > > >> > > > > in
> > > >> > > > > >> > > mind?
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 6. When we say that the group
> coordinator
> > > >> > will be
> > > >> > > > > >> > > > > responsible for
> > > >> > > > > >> > > > > > > > > storing
> > > >> > > > > >> > > > > > > > > > > > the configurations and that the
> > > >> > configurations
> > > >> > > > > will
> > > >> > > > > >> be
> > > >> > > > > >> > > > > deleted
> > > >> > > > > >> > > > > > > when
> > > >> > > > > >> > > > > > > > > the
> > > >> > > > > >> > > > > > > > > > > > group is deleted. Will a
> transition to
> > > >> DEAD
> > > >> > > > > trigger
> > > >> > > > > >> > > deletion
> > > >> > > > > >> > > > > of
> > > >> > > > > >> > > > > > > > > > > > configurations?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Yes, since the DEAD state is an
> ending
> > > >> state
> > > >> > (we
> > > >> > > > > would
> > > >> > > > > >> > only
> > > >> > > > > >> > > > > > > transit to
> > > >> > > > > >> > > > > > > > > that
> > > >> > > > > >> > > > > > > > > > > state when the group is EMPTY and
> also all
> > > >> of
> > > >> > its
> > > >> > > > > >> > metadata
> > > >> > > > > >> > > are
> > > >> > > > > >> > > > > > > gone),
> > > >> > > > > >> > > > > > > > > once
> > > >> > > > > >> > > > > > > > > > > it's transited to DEAD this group
> would
> > > >> never
> > > >> > be
> > > >> > > > > >> revived.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 7. Will the choice to store the
> configs
> > > >> in
> > > >> > the
> > > >> > > > > group
> > > >> > > > > >> > > > > coordinator
> > > >> > > > > >> > > > > > > > > make it
> > > >> > > > > >> > > > > > > > > > > > harder to list all cluster configs
> and
> > > >> their
> > > >> > > > > values?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > That's a good question, and our
> thoughts
> > > >> are
> > > >> > that
> > > >> > > > > the
> > > >> > > > > >> > > so-called
> > > >> > > > > >> > > > > > > "group
> > > >> > > > > >> > > > > > > > > > > configurations" are overrides of the
> > > >> > cluster-level
> > > >> > > > > >> > > > > configurations
> > > >> > > > > >> > > > > > > > > > > customized per group so when an
> admin list
> > > >> > cluster
> > > >> > > > > >> > configs
> > > >> > > > > >> > > it's
> > > >> > > > > >> > > > > > > okay to
> > > >> > > > > >> > > > > > > > > > > list just the cluster-level
> "defaults", not
> > > >> > showing
> > > >> > > > > >> any
> > > >> > > > > >> > > > > per-group
> > > >> > > > > >> > > > > > > > > > > customizations.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 8. How would someone configure a
> group
> > > >> > before
> > > >> > > > > >> starting
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > > > consumers? Have
> > > >> > > > > >> > > > > > > > > > > > we considered allowing the explicit
> > > >> > creation of
> > > >> > > > > >> groups?
> > > >> > > > > >> > > > > > > > > Alternatively, the
> > > >> > > > > >> > > > > > > > > > > > configs could be decoupled from
> the group
> > > >> > > > > lifecycle.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > The configs can be created before
> the group
> > > >> > itself
> > > >> > > > > as
> > > >> > > > > >> an
> > > >> > > > > >> > > > > > > independent
> > > >> > > > > >> > > > > > > > > entity
> > > >> > > > > >> > > > > > > > > > > --- of course, this requires the
> > > >> corresponding
> > > >> > > > > >> request to
> > > >> > > > > >> > > be
> > > >> > > > > >> > > > > > > routed to
> > > >> > > > > >> > > > > > > > > the
> > > >> > > > > >> > > > > > > > > > > right coordinator based on the group
> id ---
> > > >> > the only
> > > >> > > > > >> > thing
> > > >> > > > > >> > > that
> > > >> > > > > >> > > > > > > > > differs is,
> > > >> > > > > >> > > > > > > > > > > when the group itself is gone we
> also check
> > > >> > if there
> > > >> > > > > >> are
> > > >> > > > > >> > > any
> > > >> > > > > >> > > > > > > > > configuration
> > > >> > > > > >> > > > > > > > > > > entities related to that group and
> delete
> > > >> as
> > > >> > well.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Admittedly this indeed introduces an
> > > >> > asymmetry on
> > > >> > > > > the
> > > >> > > > > >> > > creation
> > > >> > > > > >> > > > > /
> > > >> > > > > >> > > > > > > > > deletion
> > > >> > > > > >> > > > > > > > > > > lifecycles of the config entities,
> and we
> > > >> > would like
> > > >> > > > > >> to
> > > >> > > > > >> > > hear
> > > >> > > > > >> > > > > > > everyone's
> > > >> > > > > >> > > > > > > > > > > feelings whether we should aim for
> symmetry
> > > >> > i.e.
> > > >> > > > > >> totally
> > > >> > > > > >> > > > > decouple
> > > >> > > > > >> > > > > > > group
> > > >> > > > > >> > > > > > > > > > > configs and hence not delete them at
> all
> > > >> when
> > > >> > the
> > > >> > > > > >> group
> > > >> > > > > >> > is
> > > >> > > > > >> > > > > gone,
> > > >> > > > > >> > > > > > > but
> > > >> > > > > >> > > > > > > > > always
> > > >> > > > > >> > > > > > > > > > > require explicit deletion operations
> by
> > > >> > themselves.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 9. Will the Consumer.subscribe
> method for
> > > >> > the Java
> > > >> > > > > >> > client
> > > >> > > > > >> > > > > still
> > > >> > > > > >> > > > > > > take
> > > >> > > > > >> > > > > > > > > a
> > > >> > > > > >> > > > > > > > > > > > `java.util.regex.Pattern` of do we
> have
> > > >> to
> > > >> > > > > >> introduce an
> > > >> > > > > >> > > > > overload?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > I think we do not need to introduce
> an
> > > >> > overload, but
> > > >> > > > > >> I'm
> > > >> > > > > >> > > all
> > > >> > > > > >> > > > > ears
> > > >> > > > > >> > > > > > > if
> > > >> > > > > >> > > > > > > > > there
> > > >> > > > > >> > > > > > > > > > > may be some compatibility issues
> that we
> > > >> may
> > > >> > > > > overlook.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 10. I agree with Justine that we
> should
> > > >> be
> > > >> > clearer
> > > >> > > > > >> > about
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > > > reason
> > > >> > > > > >> > > > > > > > > to
> > > >> > > > > >> > > > > > > > > > > > switch to IBP/metadata.version
> from the
> > > >> > feature
> > > >> > > > > >> flag.
> > > >> > > > > >> > > Maybe
> > > >> > > > > >> > > > > we
> > > >> > > > > >> > > > > > > mean
> > > >> > > > > >> > > > > > > > > that we
> > > >> > > > > >> > > > > > > > > > > > can switch the default for the
> feature
> > > >> flag
> > > >> > to
> > > >> > > > > true
> > > >> > > > > >> > > based on
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > > > > metadata.version once we want to
> make it
> > > >> the
> > > >> > > > > >> default.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > 11. Some of the protocol APIs don't
> mention
> > > >> > the
> > > >> > > > > >> required
> > > >> > > > > >> > > ACLs,
> > > >> > > > > >> > > > > it
> > > >> > > > > >> > > > > > > > > would be
> > > >> > > > > >> > > > > > > > > > > > good to add that for consistency.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Ack, we can certainly do that.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 12. It is a bit odd that
> > > >> > ConsumerGroupHeartbeat
> > > >> > > > > >> > requires
> > > >> > > > > >> > > > > "Read
> > > >> > > > > >> > > > > > > > > Group" even
> > > >> > > > > >> > > > > > > > > > > > though it seems to do more than
> reading.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > I had that thought myself as well,
> but in
> > > >> the
> > > >> > end we
> > > >> > > > > >> > could
> > > >> > > > > >> > > not
> > > >> > > > > >> > > > > > > find a
> > > >> > > > > >> > > > > > > > > > > better alternative: adding Write
> Group
> > > >> seems
> > > >> > an
> > > >> > > > > >> overkill
> > > >> > > > > >> > > here
> > > >> > > > > >> > > > > > > since we
> > > >> > > > > >> > > > > > > > > do
> > > >> > > > > >> > > > > > > > > > > not have it elsewhere (we only have
> Read /
> > > >> > Delete
> > > >> > > > > and
> > > >> > > > > >> > > Describe
> > > >> > > > > >> > > > > on
> > > >> > > > > >> > > > > > > > > groups so
> > > >> > > > > >> > > > > > > > > > > far). Would like to hear others
> thoughts.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 13. How is topic recreation
> handled by
> > > >> the
> > > >> > > > > consumer
> > > >> > > > > >> > with
> > > >> > > > > >> > > the
> > > >> > > > > >> > > > > new
> > > >> > > > > >> > > > > > > > > group
> > > >> > > > > >> > > > > > > > > > > > protocol? It would be good to have
> a
> > > >> > section on
> > > >> > > > > >> this.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > You mean with regex subscription
> right? Yes
> > > >> > we can
> > > >> > > > > >> add a
> > > >> > > > > >> > > > > section
> > > >> > > > > >> > > > > > > about
> > > >> > > > > >> > > > > > > > > > > that, but basically the idea is that
> > > >> consumer
> > > >> > would
> > > >> > > > > be
> > > >> > > > > >> > > totally
> > > >> > > > > >> > > > > > > > > agnostic in
> > > >> > > > > >> > > > > > > > > > > the new protocol as it's handled all
> by the
> > > >> > brokers.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 14. The KIP mentions we will write
> the
> > > >> new
> > > >> > > > > >> coordinator
> > > >> > > > > >> > in
> > > >> > > > > >> > > > > Java.
> > > >> > > > > >> > > > > > > Even
> > > >> > > > > >> > > > > > > > > though
> > > >> > > > > >> > > > > > > > > > > > this is an implementation detail,
> do we
> > > >> > plan to
> > > >> > > > > >> have a
> > > >> > > > > >> > > new
> > > >> > > > > >> > > > > gradle
> > > >> > > > > >> > > > > > > > > module
> > > >> > > > > >> > > > > > > > > > > > for it?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > We have not thought about that. But
> I think
> > > >> > the
> > > >> > > > > answer
> > > >> > > > > >> > > should
> > > >> > > > > >> > > > > be
> > > >> > > > > >> > > > > > > yes.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 15. Do we have a scalability goal
> when it
> > > >> > comes to
> > > >> > > > > >> how
> > > >> > > > > >> > > many
> > > >> > > > > >> > > > > > > members
> > > >> > > > > >> > > > > > > > > the new
> > > >> > > > > >> > > > > > > > > > > > group protocol can support?
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > Within a group, I think we should
> shoot for
> > > >> > 1000s of
> > > >> > > > > >> > > members.
> > > >> > > > > >> > > > > But
> > > >> > > > > >> > > > > > > that
> > > >> > > > > >> > > > > > > > > > > scalability goals also depend on the
> offset
> > > >> > > > > management
> > > >> > > > > >> > > (commit,
> > > >> > > > > >> > > > > > > fetch)
> > > >> > > > > >> > > > > > > > > > > capabilities of the coordinator
> which we
> > > >> did
> > > >> > not
> > > >> > > > > >> cover in
> > > >> > > > > >> > > this
> > > >> > > > > >> > > > > > > KIP, so
> > > >> > > > > >> > > > > > > > > it's
> > > >> > > > > >> > > > > > > > > > > hard to give a number that applies
> > > >> > universally.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > 16. Did we consider having
> > > >> > SubscribedTopidIds
> > > >> > > > > >> instead
> > > >> > > > > >> > of
> > > >> > > > > >> > > > > > > > > > > > SubscribedTopicNames in
> > > >> > > > > >> ConsumerGroupHeartbeatRequest?
> > > >> > > > > >> > > Is the
> > > >> > > > > >> > > > > > > idea
> > > >> > > > > >> > > > > > > > > that
> > > >> > > > > >> > > > > > > > > > > > since we have to resolve the regex
> on the
> > > >> > server,
> > > >> > > > > we
> > > >> > > > > >> > can
> > > >> > > > > >> > > do
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > same
> > > >> > > > > >> > > > > > > > > for
> > > >> > > > > >> > > > > > > > > > > > the topic name? The difference is
> that
> > > >> > sending the
> > > >> > > > > >> > regex
> > > >> > > > > >> > > is
> > > >> > > > > >> > > > > more
> > > >> > > > > >> > > > > > > > > efficient
> > > >> > > > > >> > > > > > > > > > > > whereas sending the topic names is
> less
> > > >> > efficient.
> > > >> > > > > >> > > > > Furthermore,
> > > >> > > > > >> > > > > > > > > delete and
> > > >> > > > > >> > > > > > > > > > > > recreation is easier to handle if
> we have
> > > >> > topic
> > > >> > > > > ids.
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > The main reason to still let the
> clients
> > > >> send
> > > >> > names
> > > >> > > > > >> is to
> > > >> > > > > >> > > keep
> > > >> > > > > >> > > > > the
> > > >> > > > > >> > > > > > > > > > > reasoning of names -> ids on the
> broker /
> > > >> > admin
> > > >> > > > > client
> > > >> > > > > >> > > only.
> > > >> > > > > >> > > > > Note
> > > >> > > > > >> > > > > > > that
> > > >> > > > > >> > > > > > > > > > > although we added topic id in
> KIP-516, we
> > > >> > never
> > > >> > > > > >> > > implemented the
> > > >> > > > > >> > > > > > > logic
> > > >> > > > > >> > > > > > > > > on
> > > >> > > > > >> > > > > > > > > > > consumer/producers leveraging the
> related
> > > >> > newer
> > > >> > > > > >> versioned
> > > >> > > > > >> > > RPCs,
> > > >> > > > > >> > > > > > > > > instead we
> > > >> > > > > >> > > > > > > > > > > just set the topic id as empty UUID.
> We
> > > >> want
> > > >> > to keep
> > > >> > > > > >> the
> > > >> > > > > >> > > > > > > > > consumer/producer
> > > >> > > > > >> > > > > > > > > > > to be thin and only delegate the
> reasoning
> > > >> on
> > > >> > broker
> > > >> > > > > >> and
> > > >> > > > > >> > > > > > > potentially
> > > >> > > > > >> > > > > > > > > admin
> > > >> > > > > >> > > > > > > > > > > clients.
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > Thanks,
> > > >> > > > > >> > > > > > > > > > > > Ismael
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > On Wed, Jul 6, 2022 at 10:45 AM
> David
> > > >> Jacot
> > > >> > > > > >> > > > > > > > > <dja...@confluent.io.invalid>
> > > >> > > > > >> > > > > > > > > > > > wrote:
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > Hi all,
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > I would like to start a
> discussion
> > > >> thread
> > > >> > on
> > > >> > > > > >> KIP-848:
> > > >> > > > > >> > > The
> > > >> > > > > >> > > > > Next
> > > >> > > > > >> > > > > > > > > > > > > Generation of the Consumer
> Rebalance
> > > >> > Protocol.
> > > >> > > > > >> With
> > > >> > > > > >> > > this
> > > >> > > > > >> > > > > KIP,
> > > >> > > > > >> > > > > > > we
> > > >> > > > > >> > > > > > > > > aim
> > > >> > > > > >> > > > > > > > > > > > > to make the rebalance protocol
> (for
> > > >> > consumers)
> > > >> > > > > >> more
> > > >> > > > > >> > > > > reliable,
> > > >> > > > > >> > > > > > > more
> > > >> > > > > >> > > > > > > > > > > > > scalable, easier to implement for
> > > >> > clients, and
> > > >> > > > > >> easier
> > > >> > > > > >> > > to
> > > >> > > > > >> > > > > debug
> > > >> > > > > >> > > > > > > for
> > > >> > > > > >> > > > > > > > > > > > > operators.
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > The KIP is here:
> > > >> > > > > >> > > > > https://cwiki.apache.org/confluence/x/HhD1D.
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > Please take a look and let me
> know what
> > > >> > you
> > > >> > > > > think.
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > Best,
> > > >> > > > > >> > > > > > > > > > > > > David
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > > > PS: I will be away from July
> 18th to
> > > >> > August 8th.
> > > >> > > > > >> That
> > > >> > > > > >> > > gives
> > > >> > > > > >> > > > > > > you a
> > > >> > > > > >> > > > > > > > > bit
> > > >> > > > > >> > > > > > > > > > > > > of time to read and digest this
> long
> > > >> KIP.
> > > >> > > > > >> > > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > >
> > > >> > > > > >> > > > > > > > > > > --
> > > >> > > > > >> > > > > > > > > > > -- Guozhang
> > > >> > > > > >> > > > > > > > >
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > >
> > > >> > > > > >> > > > > > > > --
> > > >> > > > > >> > > > > > > > -- Guozhang
> > > >> > > > > >> > > > > > >
> > > >> > > > > >> > > > >
> > > >> > > > > >> > >
> > > >> > > > > >> > >
> > > >> > > > > >> >
> > > >> > > > > >>
> > > >> > > > > >>
> > > >> > > > > >> --
> > > >> > > > > >> -- Guozhang
> > > >> > > > > >>
> > > >> > > > > >
> > > >> > > > >
> > > >> >
> > > >>
> > > >
>

Reply via email to