Hi All,

As per the suggestions from David/Guozhang above, I updated the page for
Connect to have it's own set of APIs and not extend the ones from consumer.
Plz review again.

@Luke,

Thank you. I am actually looking for review comments/thoughts/suggestions
as Connect changes are still in draft stage but since this is a wholesome
change, I am sure there might be many cases missed by me. Also, on this
discussion thread David suggested to continue discussing on this very
thread to ensure connect is compatible with whatever is done in KIP-848. I
am ok either way.

@Guozhang,

Regarding the migration plan and the link to KIP-415 that you had shared, I
had taken a look at it. From what I understood, that migration/downgrading
is from switching the assignment protocol from eager -> incremental or vice
versa. In this case, the migration is to the new rebalance protocol which
would also depend on the group coordinator and whether they support the new
rebalance protocol or not. I have tried to incorporate this aspect in the
draft page.

Thanks!
Sagar.




On Mon, Aug 22, 2022 at 1:00 PM Luke Chen <show...@gmail.com> wrote:

> Hi David,
>
> Thanks for the update.
>
> Some more questions:
> 1. In Group Coordinator section, you mentioned:
> > The new group coordinator will have a state machine per
> *__consumer_offsets* partitions, where each state machine is modelled as an
> event loop. Those state machines will be executed in
> *group.coordinator.threads* threads.
>
> 1.1. I think the state machine are: "Empty, assigning, reconciling, stable,
> dead" mentioned in Consumer Group States section, right?
> 1.2. What do you mean "each state machine is modelled as an event loop"?
> 1.3. Why do we need a state machine per *__consumer_offsets* partitions?
> Not a state machine "per consumer group" owned by a group coordinator? For
> example, if one group coordinator owns 2 consumer groups, and both exist in
> *__consumer_offsets-0*, will we have 1 state machine for it, or 2?
> 1.4. I know the "*group.coordinator.threads" *is the number of threads used
> to run the state machines. But I'm wondering if the purpose of the threads
> is only to keep the state of each consumer group (or *__consumer_offsets*
> partitions?), and no heavy computation, why should we need multi-threads
> here?
>
> 2. For the default value in the new configs:
> 2.1. The consumer session timeout, why does the default session timeout not
> locate between min (45s) and max(60s)? I thought the min/max session
> timeout is to define lower/upper bound of it, no?
>
> group.consumer.session.timeout.ms int 30s The timeout to detect client
> failures when using the consumer group protocol.
> group.consumer.min.session.timeout.ms int 45s The minimum session timeout.
> group.consumer.max.session.timeout.ms int 60s The maximum session timeout.
>
>
>
> 2.2. The default server side assignor are [range, uniform], which means
> we'll default to "range" assignor. I'd like to know why not uniform one? I
> thought usually users will choose uniform assignor (former sticky assinor)
> for better evenly distribution. Any other reason we choose range assignor
> as default?
> group.consumer.assignors List range, uniform The server side assignors.
>
>
>
>
>
>
> Thank you.
> Luke
>
>
>
>
>
>
> On Mon, Aug 22, 2022 at 2:10 PM Luke Chen <show...@gmail.com> wrote:
>
> > Hi Sagar,
> >
> > I have some thoughts about Kafka Connect integrating with KIP-848, but I
> > think we should have a separate discussion thread for the Kafka Connect
> > KIP: Integrating Kafka Connect With New Consumer Rebalance Protocol [1],
> > and let this discussion thread focus on consumer rebalance protocol,
> WDYT?
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> >
> > Thank you.
> > Luke
> >
> >
> >
> > On Fri, Aug 12, 2022 at 9:31 PM Sagar <sagarmeansoc...@gmail.com> wrote:
> >
> >> Thank you Guozhang/David for the feedback. Looks like there's agreement
> on
> >> using separate APIs for Connect. I would revisit the doc and see what
> >> changes are to be made.
> >>
> >> Thanks!
> >> Sagar.
> >>
> >> On Tue, Aug 9, 2022 at 7:11 PM David Jacot <dja...@confluent.io.invalid
> >
> >> wrote:
> >>
> >> > Hi Sagar,
> >> >
> >> > Thanks for the feedback and the document. That's really helpful. I
> >> > will take a look at it.
> >> >
> >> > Overall, it seems to me that both Connect and the Consumer could share
> >> > the same underlying "engine". The main difference is that the Consumer
> >> > assigns topic-partitions to members whereas Connect assigns tasks to
> >> > workers. I see two ways to move forward:
> >> > 1) We extend the new proposed APIs to support different resource types
> >> > (e.g. partitions, tasks, etc.); or
> >> > 2) We use new dedicated APIs for Connect. The dedicated APIs would be
> >> > similar to the new ones but different on the content/resources and
> >> > they would rely on the same engine on the coordinator side.
> >> >
> >> > I personally lean towards 2) because I am not a fan of overcharging
> >> > APIs to serve different purposes. That being said, I am not opposed to
> >> > 1) if we can find an elegant way to do it.
> >> >
> >> > I think that we can continue to discuss it here for now in order to
> >> > ensure that this KIP is compatible with what we will do for Connect in
> >> > the future.
> >> >
> >> > Best,
> >> > David
> >> >
> >> > On Mon, Aug 8, 2022 at 2:41 PM David Jacot <dja...@confluent.io>
> wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > I am back from vacation. I will go through and address your comments
> >> > > in the coming days. Thanks for your feedback.
> >> > >
> >> > > Cheers,
> >> > > David
> >> > >
> >> > > On Wed, Aug 3, 2022 at 10:05 PM Gregory Harris <
> gharris1...@gmail.com
> >> >
> >> > wrote:
> >> > > >
> >> > > > Hey All!
> >> > > >
> >> > > > Thanks for the KIP, it's wonderful to see cooperative rebalancing
> >> > making it
> >> > > > down the stack!
> >> > > >
> >> > > > I had a few questions:
> >> > > >
> >> > > > 1. The 'Rejected Alternatives' section describes how member epoch
> >> > should
> >> > > > advance in step with the group epoch and assignment epoch values.
> I
> >> > think
> >> > > > that this is a good idea for the reasons described in the KIP.
> When
> >> the
> >> > > > protocol is incrementally assigning partitions to a worker, what
> >> member
> >> > > > epoch does each incremental assignment use? Are member epochs
> >> re-used,
> >> > and
> >> > > > a single member epoch can correspond to multiple different
> >> > (monotonically
> >> > > > larger) assignments?
> >> > > >
> >> > > > 2. Is the Assignor's 'Reason' field opaque to the group
> >> coordinator? If
> >> > > > not, should custom client-side assignor implementations interact
> >> with
> >> > the
> >> > > > Reason field, and how is its common meaning agreed upon? If so,
> what
> >> > is the
> >> > > > benefit of a distinct Reason field over including such
> functionality
> >> > in the
> >> > > > opaque metadata?
> >> > > >
> >> > > > 3. The following is included in the KIP: "Thanks to this, the
> input
> >> of
> >> > the
> >> > > > client side assignor is entirely driven by the group coordinator.
> >> The
> >> > > > consumer is no longer responsible for maintaining any state
> besides
> >> its
> >> > > > assigned partitions." Does this mean that the client-side assignor
> >> MAY
> >> > > > incorporate additional non-Metadata state (such as partition
> >> > throughput,
> >> > > > cpu/memory metrics, config topics, etc), or that additional
> >> > non-Metadata
> >> > > > state SHOULD NOT be used?
> >> > > >
> >> > > > 4. I see that there are separate classes
> >> > > > for org.apache.kafka.server.group.consumer.PartitionAssignor
> >> > > > and org.apache.kafka.clients.consumer.PartitionAssignor that seem
> to
> >> > > > overlap significantly. Is it possible for these two
> implementations
> >> to
> >> > be
> >> > > > unified? This would serve to promote feature parity of server-side
> >> and
> >> > > > client-side assignors, and would also facilitate operational
> >> > flexibility in
> >> > > > certain situations. For example, if a server-side assignor has
> some
> >> > poor
> >> > > > behavior and needs a patch, deploying the patched assignor to the
> >> > client
> >> > > > and switching one consumer group to a client-side assignor may be
> >> > faster
> >> > > > and less risky than patching all of the brokers. With the
> currently
> >> > > > proposed distinct APIs, a non-trivial reimplementation would have
> >> to be
> >> > > > assembled, and if the two APIs have diverged significantly, then
> it
> >> is
> >> > > > possible that a reimplementation would not be possible.
> >> > > >
> >> > > > --
> >> > > > Greg Harris
> >> > > > gharris1...@gmail.com
> >> > > > github.com/gharris1727
> >> > > >
> >> > > > On Wed, Aug 3, 2022 at 8:39 AM Sagar <sagarmeansoc...@gmail.com>
> >> > wrote:
> >> > > >
> >> > > > > Hi Guozhang/David,
> >> > > > >
> >> > > > > I created a confluence page to discuss how Connect would need to
> >> > change
> >> > > > > based on the new rebalance protocol. Here's the page:
> >> > > > >
> >> > > > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> >> > > > >
> >> > > > > It's also pretty longish and I have tried to keep a format
> >> similar to
> >> > > > > KIP-848. Let me know what you think. Also, do you think this
> >> should
> >> > be
> >> > > > > moved to a separate discussion thread or is this one fine?
> >> > > > >
> >> > > > > Thanks!
> >> > > > > Sagar.
> >> > > > >
> >> > > > > On Tue, Jul 26, 2022 at 7:37 AM Sagar <
> sagarmeansoc...@gmail.com>
> >> > wrote:
> >> > > > >
> >> > > > > > Hello Guozhang,
> >> > > > > >
> >> > > > > > Thank you so much for the doc on Kafka Streams. Sure, I would
> do
> >> > the
> >> > > > > > analysis and come up with such a document.
> >> > > > > >
> >> > > > > > Thanks!
> >> > > > > > Sagar.
> >> > > > > >
> >> > > > > > On Tue, Jul 26, 2022 at 4:47 AM Guozhang Wang <
> >> wangg...@gmail.com>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > >> Hello Sagar,
> >> > > > > >>
> >> > > > > >> It would be great if you could come back with some analysis
> on
> >> > how to
> >> > > > > >> implement the Connect side integration with the new protocol;
> >> so
> >> > far
> >> > > > > >> besides leveraging on the new "protocol type" we did not yet
> >> think
> >> > > > > through
> >> > > > > >> the Connect side implementations. For Streams here's a draft
> of
> >> > > > > >> integration
> >> > > > > >> plan:
> >> > > > > >>
> >> > > > > >>
> >> > > > >
> >> >
> >>
> https://docs.google.com/document/d/17PNz2sGoIvGyIzz8vLyJTJTU2rqnD_D9uHJnH9XARjU/edit#heading=h.pdgirmi57dvn
> >> > > > > >> just FYI for your analysis on Connect.
> >> > > > > >>
> >> > > > > >> On Tue, Jul 19, 2022 at 10:48 PM Sagar <
> >> sagarmeansoc...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > >>
> >> > > > > >> > Hi David,
> >> > > > > >> >
> >> > > > > >> > Thank you for your response. The reason I thought connect
> can
> >> > also fit
> >> > > > > >> into
> >> > > > > >> > this new scheme is that even today the connect uses a
> >> > > > > WorkerCoordinator
> >> > > > > >> > extending from AbstractCoordinator to empower rebalances of
> >> > > > > >> > tasks/connectors. The WorkerCoordinator sets the
> >> protocolType()
> >> > to
> >> > > > > >> connect
> >> > > > > >> > and uses the metadata() method by plumbing into
> >> > > > > >> JoinGroupRequestProtocol.
> >> > > > > >> >
> >> > > > > >> > I think the changes to support connect would be similar at
> a
> >> > high
> >> > > > > level
> >> > > > > >> to
> >> > > > > >> > the changes in streams mainly because of the Client side
> >> > assignors
> >> > > > > being
> >> > > > > >> > used in both. At an implementation level, we might need to
> >> make
> >> > a lot
> >> > > > > of
> >> > > > > >> > changes to get onto this new assignment protocol like
> >> enhancing
> >> > the
> >> > > > > >> > JoinGroup request/response and SyncGroup and using
> >> > > > > >> ConsumerGroupHeartbeat
> >> > > > > >> > API etc again on similar lines to streams (or there might
> be
> >> > > > > >> deviations). I
> >> > > > > >> > would try to perform a detailed analysis of the same  and
> we
> >> > can have
> >> > > > > a
> >> > > > > >> > separate discussion thread for that as that would derail
> this
> >> > > > > discussion
> >> > > > > >> > thread. Let me know if that sounds good to you.
> >> > > > > >> >
> >> > > > > >> > Thanks!
> >> > > > > >> > Sagar.
> >> > > > > >> >
> >> > > > > >> >
> >> > > > > >> >
> >> > > > > >> > On Fri, Jul 15, 2022 at 5:47 PM David Jacot
> >> > > > > <dja...@confluent.io.invalid
> >> > > > > >> >
> >> > > > > >> > wrote:
> >> > > > > >> >
> >> > > > > >> > > Hi Sagar,
> >> > > > > >> > >
> >> > > > > >> > > Thanks for your comments.
> >> > > > > >> > >
> >> > > > > >> > > 1) Yes. That refers to `Assignment#error`. Sure, I can
> >> > mention it.
> >> > > > > >> > >
> >> > > > > >> > > 2) The idea is to transition C from his current
> assignment
> >> to
> >> > his
> >> > > > > >> > > target assignment when he can move to epoch 3. When that
> >> > happens,
> >> > > > > the
> >> > > > > >> > > member assignment is updated and persisted with all its
> >> > assigned
> >> > > > > >> > > partitions even if they are not all revoked yet. In other
> >> > words, the
> >> > > > > >> > > member assignment becomes the target assignment. This is
> >> > basically
> >> > > > > an
> >> > > > > >> > > optimization to avoid having to write all the changes to
> >> the
> >> > log.
> >> > > > > The
> >> > > > > >> > > examples are based on the persisted state so I understand
> >> the
> >> > > > > >> > > confusion. Let me see if I can improve this in the
> >> > description.
> >> > > > > >> > >
> >> > > > > >> > > 3) Regarding Connect, it could reuse the protocol with a
> >> > client side
> >> > > > > >> > > assignor if it fits in the protocol. The assignment is
> >> about
> >> > > > > >> > > topicid-partitions + metadata, could Connect fit into
> this?
> >> > > > > >> > >
> >> > > > > >> > > Best,
> >> > > > > >> > > David
> >> > > > > >> > >
> >> > > > > >> > > On Fri, Jul 15, 2022 at 1:55 PM Sagar <
> >> > sagarmeansoc...@gmail.com>
> >> > > > > >> wrote:
> >> > > > > >> > > >
> >> > > > > >> > > > Hi David,
> >> > > > > >> > > >
> >> > > > > >> > > > Thanks for the KIP. I just had minor observations:
> >> > > > > >> > > >
> >> > > > > >> > > > 1) In the Assignment Error section in Client Side mode
> >> > Assignment
> >> > > > > >> > > process,
> >> > > > > >> > > > you mentioned => `In this case, the client side
> assignor
> >> can
> >> > > > > return
> >> > > > > >> an
> >> > > > > >> > > > error to the group coordinator`. In this case are you
> >> > referring to
> >> > > > > >> the
> >> > > > > >> > > > Assignor returning an AssignmentError that's listed
> down
> >> > towards
> >> > > > > the
> >> > > > > >> > end?
> >> > > > > >> > > > If yes, do you think it would make sense to mention
> this
> >> > > > > explicitly
> >> > > > > >> > here?
> >> > > > > >> > > >
> >> > > > > >> > > > 2) In the Case Studies section, I have a slight
> >> confusion,
> >> > not
> >> > > > > sure
> >> > > > > >> if
> >> > > > > >> > > > others have the same. Consider this step:
> >> > > > > >> > > >
> >> > > > > >> > > > When B heartbeats, the group coordinator transitions
> him
> >> to
> >> > epoch
> >> > > > > 3
> >> > > > > >> > > because
> >> > > > > >> > > > B has no partitions to revoke. It persists the change
> and
> >> > reply.
> >> > > > > >> > > >
> >> > > > > >> > > >    - Group (epoch=3)
> >> > > > > >> > > >       - A
> >> > > > > >> > > >       - B
> >> > > > > >> > > >       - C
> >> > > > > >> > > >    - Target Assignment (epoch=3)
> >> > > > > >> > > >       - A - partitions=[foo-0]
> >> > > > > >> > > >       - B - partitions=[foo-2]
> >> > > > > >> > > >       - C - partitions=[foo-1]
> >> > > > > >> > > >    - Member Assignment
> >> > > > > >> > > >       - A - epoch=2, partitions=[foo-0, foo-1]
> >> > > > > >> > > >       - B - epoch=3, partitions=[foo-2]
> >> > > > > >> > > >       - C - epoch=3, partitions=[foo-1]
> >> > > > > >> > > >
> >> > > > > >> > > > When C heartbeats, it transitions to epoch 3 but cannot
> >> get
> >> > foo-1
> >> > > > > >> yet.
> >> > > > > >> > > >
> >> > > > > >> > > > Here,it's mentioned that member C can't get the foo-1
> >> > partition
> >> > > > > yet,
> >> > > > > >> > but
> >> > > > > >> > > > based on the description above, it seems it already has
> >> it.
> >> > Do you
> >> > > > > >> > think
> >> > > > > >> > > it
> >> > > > > >> > > > would be better to remove it and populate it only when
> it
> >> > actually
> >> > > > > >> gets
> >> > > > > >> > > it?
> >> > > > > >> > > > I see this in a lot of other places, so have I
> >> understood it
> >> > > > > >> > incorrectly
> >> > > > > >> > > ?
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > > Regarding connect , it might be out of scope of this
> >> > discussion,
> >> > > > > but
> >> > > > > >> > from
> >> > > > > >> > > > what I understood it would probably be running in
> client
> >> > side
> >> > > > > >> assignor
> >> > > > > >> > > mode
> >> > > > > >> > > > even on the new rebalance protocol as it has its own
> >> Custom
> >> > > > > >> > > Assignors(Eager
> >> > > > > >> > > > and IncrementalCooperative).
> >> > > > > >> > > >
> >> > > > > >> > > > Thanks!
> >> > > > > >> > > >
> >> > > > > >> > > > Sagar.
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > > On Fri, Jul 15, 2022 at 5:00 PM David Jacot
> >> > > > > >> > <dja...@confluent.io.invalid
> >> > > > > >> > > >
> >> > > > > >> > > > wrote:
> >> > > > > >> > > >
> >> > > > > >> > > > > Thanks Hector! Our goal is to move forward with
> >> > specialized API
> >> > > > > >> > > > > instead of relying on one generic API. For Connect,
> we
> >> > can apply
> >> > > > > >> the
> >> > > > > >> > > > > exact same pattern and reuse/share the core
> >> > implementation on
> >> > > > > the
> >> > > > > >> > > > > server side. For the schema registry, I think that we
> >> > should
> >> > > > > >> consider
> >> > > > > >> > > > > having a tailored API to do simple membership/leader
> >> > election.
> >> > > > > >> > > > >
> >> > > > > >> > > > > Best,
> >> > > > > >> > > > > David
> >> > > > > >> > > > >
> >> > > > > >> > > > > On Fri, Jul 15, 2022 at 10:22 AM Ismael Juma <
> >> > ism...@juma.me.uk
> >> > > > > >
> >> > > > > >> > > wrote:
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > Three quick comments:
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > 1. Regarding java.util.regex.Pattern vs
> >> > > > > >> com.google.re2j.Pattern, we
> >> > > > > >> > > > > should
> >> > > > > >> > > > > > document the differences in more detail before
> >> deciding
> >> > one
> >> > > > > way
> >> > > > > >> or
> >> > > > > >> > > > > another.
> >> > > > > >> > > > > > That said, if people pass java.util.regex.Pattern,
> >> they
> >> > expect
> >> > > > > >> > their
> >> > > > > >> > > > > > semantics to be honored. If we are doing something
> >> > different,
> >> > > > > >> then
> >> > > > > >> > we
> >> > > > > >> > > > > > should consider adding an overload with our own
> >> Pattern
> >> > class
> >> > > > > (I
> >> > > > > >> > > don't
> >> > > > > >> > > > > > think we'd want to expose re2j's at this point).
> >> > > > > >> > > > > > 2. Regarding topic ids, any major new protocol
> should
> >> > > > > integrate
> >> > > > > >> > fully
> >> > > > > >> > > > > with
> >> > > > > >> > > > > > it and should handle the topic recreation case
> >> > correctly.
> >> > > > > That's
> >> > > > > >> > the
> >> > > > > >> > > main
> >> > > > > >> > > > > > part we need to handle. I agree with David that
> we'd
> >> > want to
> >> > > > > add
> >> > > > > >> > > topic
> >> > > > > >> > > > > ids
> >> > > > > >> > > > > > to the relevant protocols that don't have it yet
> and
> >> > that we
> >> > > > > can
> >> > > > > >> > > probably
> >> > > > > >> > > > > > focus on the internals versus adding new APIs to
> the
> >> > Java
> >> > > > > >> Consumer
> >> > > > > >> > > > > (unless
> >> > > > > >> > > > > > we find that adding new APIs is required for
> >> reasonable
> >> > > > > >> semantics).
> >> > > > > >> > > > > > 3. I am still not sure about the coordinator
> storing
> >> the
> >> > > > > >> configs.
> >> > > > > >> > > It's
> >> > > > > >> > > > > > powerful for configs to be centralized in the
> >> metadata
> >> > log for
> >> > > > > >> > > various
> >> > > > > >> > > > > > reasons (auditability, visibility, consistency,
> >> etc.).
> >> > > > > >> Similarly, I
> >> > > > > >> > > am
> >> > > > > >> > > > > not
> >> > > > > >> > > > > > sure about automatically deleting configs in a way
> >> that
> >> > they
> >> > > > > >> cannot
> >> > > > > >> > > be
> >> > > > > >> > > > > > recovered. A good property for modern systems is to
> >> > minimize
> >> > > > > the
> >> > > > > >> > > number
> >> > > > > >> > > > > of
> >> > > > > >> > > > > > unrecoverable data loss scenarios.
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > Ismael
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > On Wed, Jul 13, 2022 at 3:47 PM David Jacot
> >> > > > > >> > > <dja...@confluent.io.invalid
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > wrote:
> >> > > > > >> > > > > >
> >> > > > > >> > > > > > > Thanks Guozhang. My answers are below:
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > > 1) the migration path, especially the last step
> >> when
> >> > > > > clients
> >> > > > > >> > > flip the
> >> > > > > >> > > > > > > flag
> >> > > > > >> > > > > > > > to enable the new protocol, in which we would
> >> have a
> >> > > > > window
> >> > > > > >> > where
> >> > > > > >> > > > > both
> >> > > > > >> > > > > > > new
> >> > > > > >> > > > > > > > protocols / rpcs and old protocols / rpcs are
> >> used
> >> > by
> >> > > > > >> members
> >> > > > > >> > of
> >> > > > > >> > > the
> >> > > > > >> > > > > same
> >> > > > > >> > > > > > > > group. How the coordinator could "mimic" the
> old
> >> > behavior
> >> > > > > >> while
> >> > > > > >> > > > > using the
> >> > > > > >> > > > > > > > new protocol is something we need to present
> >> about.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > Noted. I just published a new version of KIP
> which
> >> > includes
> >> > > > > >> more
> >> > > > > >> > > > > > > details about this. See the "Supporting Online
> >> > Consumer
> >> > > > > Group
> >> > > > > >> > > Upgrade"
> >> > > > > >> > > > > > > and the "Compatibility, Deprecation, and
> Migration
> >> > Plan". I
> >> > > > > >> think
> >> > > > > >> > > that
> >> > > > > >> > > > > > > I have to think through a few cases now but the
> >> > overall idea
> >> > > > > >> and
> >> > > > > >> > > > > > > mechanism should be understandable.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > > 2) the usage of topic ids. So far as KIP-516
> the
> >> > topic ids
> >> > > > > >> are
> >> > > > > >> > > only
> >> > > > > >> > > > > used
> >> > > > > >> > > > > > > as
> >> > > > > >> > > > > > > > part of RPCs and admin client, but they are not
> >> > exposed
> >> > > > > via
> >> > > > > >> any
> >> > > > > >> > > > > public
> >> > > > > >> > > > > > > APIs
> >> > > > > >> > > > > > > > to consumers yet. I think the question is,
> first
> >> > should we
> >> > > > > >> let
> >> > > > > >> > > the
> >> > > > > >> > > > > > > consumer
> >> > > > > >> > > > > > > > client to be maintaining the names -> ids
> mapping
> >> > itself
> >> > > > > to
> >> > > > > >> > fully
> >> > > > > >> > > > > > > leverage
> >> > > > > >> > > > > > > > on all the augmented existing RPCs and the new
> >> RPCs
> >> > with
> >> > > > > the
> >> > > > > >> > > topic
> >> > > > > >> > > > > ids;
> >> > > > > >> > > > > > > and
> >> > > > > >> > > > > > > > secondly, should we ever consider exposing the
> >> > topic ids
> >> > > > > in
> >> > > > > >> the
> >> > > > > >> > > > > consumer
> >> > > > > >> > > > > > > > public APIs as well (both subscribe/assign, as
> >> well
> >> > as in
> >> > > > > >> the
> >> > > > > >> > > > > rebalance
> >> > > > > >> > > > > > > > listener for cases like topic
> >> > deletion-and-recreation).
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > a) Assuming that we would include converting all
> >> the
> >> > offsets
> >> > > > > >> > > related
> >> > > > > >> > > > > > > RPCs to using topic ids in this KIP, the consumer
> >> > would be
> >> > > > > >> able
> >> > > > > >> > to
> >> > > > > >> > > > > > > fully operate with topic ids. That being said, it
> >> > still has
> >> > > > > to
> >> > > > > >> > > provide
> >> > > > > >> > > > > > > the topics names in various APIs so having a
> >> mapping
> >> > in the
> >> > > > > >> > > consumer
> >> > > > > >> > > > > > > seems inevitable to me.
> >> > > > > >> > > > > > > b) I don't have a strong opinion on this. Here I
> >> > wonder if
> >> > > > > >> this
> >> > > > > >> > > goes
> >> > > > > >> > > > > > > beyond the scope of this KIP. I would rather
> focus
> >> on
> >> > the
> >> > > > > >> > internals
> >> > > > > >> > > > > > > here and we can consider this separately if we
> see
> >> > value in
> >> > > > > >> doing
> >> > > > > >> > > it.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > Coming back to Ismael's point about using topic
> ids
> >> > in the
> >> > > > > >> > > > > > > ConsumerGroupHeartbeatRequest, I think that there
> >> is
> >> > one
> >> > > > > >> > advantage
> >> > > > > >> > > in
> >> > > > > >> > > > > > > favour of it. The consumer will have the
> >> opportunity
> >> > to
> >> > > > > >> validate
> >> > > > > >> > > that
> >> > > > > >> > > > > > > the topics exists before passing them into the
> >> group
> >> > > > > rebalance
> >> > > > > >> > > > > > > protocol. Obviously, the coordinator will also
> >> notice
> >> > it but
> >> > > > > >> it
> >> > > > > >> > > does
> >> > > > > >> > > > > > > not really have a way to reject an invalid topic
> in
> >> > the
> >> > > > > >> response.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > > I'm agreeing with David on all other minor
> >> questions
> >> > > > > except
> >> > > > > >> for
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > `subscribe(Pattern)` question: personally I
> think
> >> > it's not
> >> > > > > >> > > necessary
> >> > > > > >> > > > > to
> >> > > > > >> > > > > > > > deprecate the subscribe API with Pattern, but
> >> > instead we
> >> > > > > >> still
> >> > > > > >> > > use
> >> > > > > >> > > > > > > Pattern
> >> > > > > >> > > > > > > > while just documenting that our subscription
> may
> >> be
> >> > > > > >> rejected by
> >> > > > > >> > > the
> >> > > > > >> > > > > > > server.
> >> > > > > >> > > > > > > > Since the incompatible case is a very rare
> >> scenario
> >> > I felt
> >> > > > > >> > using
> >> > > > > >> > > an
> >> > > > > >> > > > > > > > overloaded `String` based subscription may be
> >> more
> >> > > > > >> vulnerable
> >> > > > > >> > to
> >> > > > > >> > > > > various
> >> > > > > >> > > > > > > > invalid regexes.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > That could work. I have to look at the
> differences
> >> > between
> >> > > > > the
> >> > > > > >> > two
> >> > > > > >> > > > > > > engines to better understand the potential
> issues.
> >> My
> >> > > > > >> > > understanding is
> >> > > > > >> > > > > > > that would work for all the basic regular
> >> > expressions. The
> >> > > > > >> > > differences
> >> > > > > >> > > > > > > between the two are mainly about the various
> >> character
> >> > > > > >> classes. I
> >> > > > > >> > > > > > > wonder what other people think about this.
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > Best,
> >> > > > > >> > > > > > > David
> >> > > > > >> > > > > > >
> >> > > > > >> > > > > > > On Tue, Jul 12, 2022 at 11:28 PM Guozhang Wang <
> >> > > > > >> > wangg...@gmail.com
> >> > > > > >> > > >
> >> > > > > >> > > > > wrote:
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > Thanks David! I think on the high level there
> are
> >> > two meta
> >> > > > > >> > > points we
> >> > > > > >> > > > > need
> >> > > > > >> > > > > > > > to concretize a bit more:
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > 1) the migration path, especially the last step
> >> when
> >> > > > > clients
> >> > > > > >> > > flip the
> >> > > > > >> > > > > > > flag
> >> > > > > >> > > > > > > > to enable the new protocol, in which we would
> >> have a
> >> > > > > window
> >> > > > > >> > where
> >> > > > > >> > > > > both
> >> > > > > >> > > > > > > new
> >> > > > > >> > > > > > > > protocols / rpcs and old protocols / rpcs are
> >> used
> >> > by
> >> > > > > >> members
> >> > > > > >> > of
> >> > > > > >> > > the
> >> > > > > >> > > > > same
> >> > > > > >> > > > > > > > group. How the coordinator could "mimic" the
> old
> >> > behavior
> >> > > > > >> while
> >> > > > > >> > > > > using the
> >> > > > > >> > > > > > > > new protocol is something we need to present
> >> about.
> >> > > > > >> > > > > > > > 2) the usage of topic ids. So far as KIP-516
> the
> >> > topic ids
> >> > > > > >> are
> >> > > > > >> > > only
> >> > > > > >> > > > > used
> >> > > > > >> > > > > > > as
> >> > > > > >> > > > > > > > part of RPCs and admin client, but they are not
> >> > exposed
> >> > > > > via
> >> > > > > >> any
> >> > > > > >> > > > > public
> >> > > > > >> > > > > > > APIs
> >> > > > > >> > > > > > > > to consumers yet. I think the question is,
> first
> >> > should we
> >> > > > > >> let
> >> > > > > >> > > the
> >> > > > > >> > > > > > > consumer
> >> > > > > >> > > > > > > > client to be maintaining the names -> ids
> mapping
> >> > itself
> >> > > > > to
> >> > > > > >> > fully
> >> > > > > >> > > > > > > leverage
> >> > > > > >> > > > > > > > on all the augmented existing RPCs and the new
> >> RPCs
> >> > with
> >> > > > > the
> >> > > > > >> > > topic
> >> > > > > >> > > > > ids;
> >> > > > > >> > > > > > > and
> >> > > > > >> > > > > > > > secondly, should we ever consider exposing the
> >> > topic ids
> >> > > > > in
> >> > > > > >> the
> >> > > > > >> > > > > consumer
> >> > > > > >> > > > > > > > public APIs as well (both subscribe/assign, as
> >> well
> >> > as in
> >> > > > > >> the
> >> > > > > >> > > > > rebalance
> >> > > > > >> > > > > > > > listener for cases like topic
> >> > deletion-and-recreation).
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > I'm agreeing with David on all other minor
> >> questions
> >> > > > > except
> >> > > > > >> for
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > `subscribe(Pattern)` question: personally I
> think
> >> > it's not
> >> > > > > >> > > necessary
> >> > > > > >> > > > > to
> >> > > > > >> > > > > > > > deprecate the subscribe API with Pattern, but
> >> > instead we
> >> > > > > >> still
> >> > > > > >> > > use
> >> > > > > >> > > > > > > Pattern
> >> > > > > >> > > > > > > > while just documenting that our subscription
> may
> >> be
> >> > > > > >> rejected by
> >> > > > > >> > > the
> >> > > > > >> > > > > > > server.
> >> > > > > >> > > > > > > > Since the incompatible case is a very rare
> >> scenario
> >> > I felt
> >> > > > > >> > using
> >> > > > > >> > > an
> >> > > > > >> > > > > > > > overloaded `String` based subscription may be
> >> more
> >> > > > > >> vulnerable
> >> > > > > >> > to
> >> > > > > >> > > > > various
> >> > > > > >> > > > > > > > invalid regexes.
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > Guozhang
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > On Tue, Jul 12, 2022 at 5:23 AM David Jacot
> >> > > > > >> > > > > <dja...@confluent.io.invalid
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > wrote:
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > > Hi Ismael,
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Thanks for your feedback. Let me answer your
> >> > questions
> >> > > > > >> > inline.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 1. I think it's premature to talk about
> >> target
> >> > > > > versions
> >> > > > > >> for
> >> > > > > >> > > > > > > deprecation
> >> > > > > >> > > > > > > > > and
> >> > > > > >> > > > > > > > > > removal of the existing group protocol.
> >> Unlike
> >> > KRaft,
> >> > > > > >> this
> >> > > > > >> > > > > affects a
> >> > > > > >> > > > > > > core
> >> > > > > >> > > > > > > > > > client protocol and hence
> deprecation/removal
> >> > will be
> >> > > > > >> > heavily
> >> > > > > >> > > > > > > dependent
> >> > > > > >> > > > > > > > > on
> >> > > > > >> > > > > > > > > > how quickly applications migrate to the new
> >> > protocol.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > That makes sense. I will remove it.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 2. The KIP says we intend to release this
> in
> >> > 4.x, but
> >> > > > > it
> >> > > > > >> > > wasn't
> >> > > > > >> > > > > made
> >> > > > > >> > > > > > > > > clear
> >> > > > > >> > > > > > > > > > why. If we added that as a way to estimate
> >> when
> >> > we'd
> >> > > > > >> > > deprecate
> >> > > > > >> > > > > and
> >> > > > > >> > > > > > > remove
> >> > > > > >> > > > > > > > > > the group protocol, I also suggest removing
> >> > this part.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Let me explain my reasoning. As explained, I
> >> plan
> >> > to
> >> > > > > >> rewrite
> >> > > > > >> > > the
> >> > > > > >> > > > > group
> >> > > > > >> > > > > > > > > coordinator in Java while we implement the
> new
> >> > protocol.
> >> > > > > >> This
> >> > > > > >> > > means
> >> > > > > >> > > > > > > > > that the internals will be slightly different
> >> > (e.g.
> >> > > > > >> threading
> >> > > > > >> > > > > model).
> >> > > > > >> > > > > > > > > Therefore, I wanted to tighten the switch
> from
> >> > the old
> >> > > > > >> group
> >> > > > > >> > > > > > > > > coordinator to the new group coordinator to a
> >> > major
> >> > > > > >> release.
> >> > > > > >> > > The
> >> > > > > >> > > > > > > > > alternative would be to use a flag to do the
> >> > switch
> >> > > > > >> instead
> >> > > > > >> > of
> >> > > > > >> > > > > relying
> >> > > > > >> > > > > > > > > on the software upgrade.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 3. We need to flesh out the details of the
> >> > migration
> >> > > > > >> story.
> >> > > > > >> > > It
> >> > > > > >> > > > > sounds
> >> > > > > >> > > > > > > > > like
> >> > > > > >> > > > > > > > > > we're saying we will support online
> >> migrations.
> >> > Is
> >> > > > > that
> >> > > > > >> > > correct?
> >> > > > > >> > > > > We
> >> > > > > >> > > > > > > > > should
> >> > > > > >> > > > > > > > > > explain this in detail. It could also be
> done
> >> > as a
> >> > > > > >> separate
> >> > > > > >> > > KIP,
> >> > > > > >> > > > > if
> >> > > > > >> > > > > > > it's
> >> > > > > >> > > > > > > > > > easier.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Yes, we will support online migrations for
> the
> >> > group.
> >> > > > > That
> >> > > > > >> > > means
> >> > > > > >> > > > > that
> >> > > > > >> > > > > > > > > a group using the old protocol will be able
> to
> >> > switch to
> >> > > > > >> the
> >> > > > > >> > > new
> >> > > > > >> > > > > > > > > protocol.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Let me briefly explain how that will work
> >> though.
> >> > It is
> >> > > > > >> > > basically a
> >> > > > > >> > > > > > > > > four step process:
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > 1. The cluster must be upgraded or rolled to
> a
> >> > software
> >> > > > > >> > > supporting
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > new group coordinator. Both the old and the
> new
> >> > > > > >> coordinator
> >> > > > > >> > > will
> >> > > > > >> > > > > > > > > support the old protocol and rely on the same
> >> > persisted
> >> > > > > >> > > metadata so
> >> > > > > >> > > > > > > > > they can work together. This point is an
> >> offline
> >> > > > > >> migration.
> >> > > > > >> > We
> >> > > > > >> > > > > cannot
> >> > > > > >> > > > > > > > > do this one live because it would require
> >> > shutting down
> >> > > > > >> the
> >> > > > > >> > > current
> >> > > > > >> > > > > > > > > coordinator and starting up the new one and
> >> that
> >> > would
> >> > > > > >> cause
> >> > > > > >> > > > > > > > > unavailabilities.
> >> > > > > >> > > > > > > > > 2. The cluster's metadata version/IBP must be
> >> > upgraded
> >> > > > > to
> >> > > > > >> X
> >> > > > > >> > in
> >> > > > > >> > > > > order
> >> > > > > >> > > > > > > > > to enable the new protocol. This cannot be
> done
> >> > before
> >> > > > > 1)
> >> > > > > >> is
> >> > > > > >> > > > > > > > > terminated because the old coordinator
> doesn't
> >> > support
> >> > > > > the
> >> > > > > >> > new
> >> > > > > >> > > > > > > > > protocol.
> >> > > > > >> > > > > > > > > 3. The consumers must be upgraded to a
> version
> >> > > > > supporting
> >> > > > > >> the
> >> > > > > >> > > > > online
> >> > > > > >> > > > > > > > > migration (must have KIP-792). If the
> consumer
> >> is
> >> > > > > already
> >> > > > > >> > > there.
> >> > > > > >> > > > > > > > > Nothing must be done at this point.
> >> > > > > >> > > > > > > > > 4. The consumers must be rolled with the
> >> feature
> >> > flag
> >> > > > > >> turned
> >> > > > > >> > > on.
> >> > > > > >> > > > > The
> >> > > > > >> > > > > > > > > consumer group is automatically converted
> when
> >> > the first
> >> > > > > >> > > consumer
> >> > > > > >> > > > > > > > > using the new protocol joins the group. While
> >> the
> >> > > > > members
> >> > > > > >> > > using the
> >> > > > > >> > > > > > > > > old protocol are being upgraded, the old
> >> protocol
> >> > is
> >> > > > > >> proxied
> >> > > > > >> > > into
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > new one.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Let me clarify all of this in the KIP.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 4. I am happy that we are pushing the
> pattern
> >> > > > > >> subscriptions
> >> > > > > >> > > to
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > server,
> >> > > > > >> > > > > > > > > > but it seems like there could be some
> tricky
> >> > > > > >> compatibility
> >> > > > > >> > > > > issues.
> >> > > > > >> > > > > > > Will
> >> > > > > >> > > > > > > > > we
> >> > > > > >> > > > > > > > > > have a mechanism for users to detect that
> >> they
> >> > need to
> >> > > > > >> > update
> >> > > > > >> > > > > their
> >> > > > > >> > > > > > > regex
> >> > > > > >> > > > > > > > > > before switching to the new protocol?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > I think that I am a bit more optimistic than
> >> you
> >> > on this
> >> > > > > >> > > point. I
> >> > > > > >> > > > > > > > > believe that the majority of the cases are
> >> simple
> >> > > > > regexes
> >> > > > > >> > which
> >> > > > > >> > > > > should
> >> > > > > >> > > > > > > > > work with the new engine. The coordinator
> will
> >> > verify
> >> > > > > the
> >> > > > > >> > regex
> >> > > > > >> > > > > anyway
> >> > > > > >> > > > > > > > > and reject the consumer if the regex is not
> >> valid.
> >> > > > > Coming
> >> > > > > >> > back
> >> > > > > >> > > to
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > migration path, in the worst case, the first
> >> > upgraded
> >> > > > > >> > consumer
> >> > > > > >> > > > > joining
> >> > > > > >> > > > > > > > > the group will be rejected. This should be
> used
> >> > as the
> >> > > > > >> last
> >> > > > > >> > > > > defence, I
> >> > > > > >> > > > > > > > > would say.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > One way for customers to validate their regex
> >> > before
> >> > > > > >> > upgrading
> >> > > > > >> > > > > their
> >> > > > > >> > > > > > > > > prod would be to test them with another
> group.
> >> For
> >> > > > > >> instance,
> >> > > > > >> > > that
> >> > > > > >> > > > > > > > > could be done in a pre-prod environment.
> >> Another
> >> > way
> >> > > > > >> would be
> >> > > > > >> > > to
> >> > > > > >> > > > > > > > > extend the consumer-group tool to provide a
> >> regex
> >> > > > > >> validation
> >> > > > > >> > > > > > > > > mechanism. Would this be enough in your
> >> opinion?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 5. Related to the last question, will the
> >> Java
> >> > client
> >> > > > > >> allow
> >> > > > > >> > > the
> >> > > > > >> > > > > > > users to
> >> > > > > >> > > > > > > > > > stick with the current regex engine for
> >> > compatibility
> >> > > > > >> > > reasons?
> >> > > > > >> > > > > For
> >> > > > > >> > > > > > > > > example,
> >> > > > > >> > > > > > > > > > it may be handy to keep using client based
> >> > regex at
> >> > > > > >> first
> >> > > > > >> > to
> >> > > > > >> > > keep
> >> > > > > >> > > > > > > > > > migrations simple and then migrate to
> server
> >> > based
> >> > > > > >> regexes
> >> > > > > >> > > as a
> >> > > > > >> > > > > > > second
> >> > > > > >> > > > > > > > > step.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > I understand your point but I am concerned
> that
> >> > this
> >> > > > > would
> >> > > > > >> > > allow
> >> > > > > >> > > > > users
> >> > > > > >> > > > > > > > > to actually stay in this mode. That would go
> >> > against our
> >> > > > > >> goal
> >> > > > > >> > > of
> >> > > > > >> > > > > > > > > simplifying the client because we would have
> to
> >> > continue
> >> > > > > >> > > monitoring
> >> > > > > >> > > > > > > > > the metadata on the client side. I would
> rather
> >> > not do
> >> > > > > >> this.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 6. When we say that the group coordinator
> >> will
> >> > be
> >> > > > > >> > > responsible for
> >> > > > > >> > > > > > > storing
> >> > > > > >> > > > > > > > > > the configurations and that the
> >> configurations
> >> > will be
> >> > > > > >> > > deleted
> >> > > > > >> > > > > when
> >> > > > > >> > > > > > > the
> >> > > > > >> > > > > > > > > > group is deleted. Will a transition to DEAD
> >> > trigger
> >> > > > > >> > deletion
> >> > > > > >> > > of
> >> > > > > >> > > > > > > > > > configurations?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > That's right. The configurations will be
> >> deleted
> >> > when
> >> > > > > the
> >> > > > > >> > > group is
> >> > > > > >> > > > > > > > > deleted. They go together.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 7. Will the choice to store the configs in
> >> the
> >> > group
> >> > > > > >> > > coordinator
> >> > > > > >> > > > > > > make it
> >> > > > > >> > > > > > > > > > harder to list all cluster configs and
> their
> >> > values?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > I don't think so. The group configurations
> are
> >> > overrides
> >> > > > > >> of
> >> > > > > >> > > cluster
> >> > > > > >> > > > > > > > > configs. If you want to know all the
> overrides
> >> > though,
> >> > > > > you
> >> > > > > >> > > would
> >> > > > > >> > > > > have
> >> > > > > >> > > > > > > > > to ask all the group coordinators. You cannot
> >> > rely on
> >> > > > > the
> >> > > > > >> > > metadata
> >> > > > > >> > > > > log
> >> > > > > >> > > > > > > > > for instance.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 8. How would someone configure a group
> before
> >> > starting
> >> > > > > >> the
> >> > > > > >> > > > > consumers?
> >> > > > > >> > > > > > > > > Have
> >> > > > > >> > > > > > > > > > we considered allowing the explicit
> creation
> >> of
> >> > > > > groups?
> >> > > > > >> > > > > > > Alternatively,
> >> > > > > >> > > > > > > > > the
> >> > > > > >> > > > > > > > > > configs could be decoupled from the group
> >> > lifecycle.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Yes. The group will be automatically created
> in
> >> > this
> >> > > > > case.
> >> > > > > >> > > However,
> >> > > > > >> > > > > > > > > the configs will be lost after the retention
> >> > period of
> >> > > > > the
> >> > > > > >> > > group
> >> > > > > >> > > > > > > > > passes.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 9. Will the Consumer.subscribe method for
> the
> >> > Java
> >> > > > > >> client
> >> > > > > >> > > still
> >> > > > > >> > > > > take
> >> > > > > >> > > > > > > a
> >> > > > > >> > > > > > > > > > `java.util.regex.Pattern` of do we have to
> >> > introduce
> >> > > > > an
> >> > > > > >> > > overload?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > That's a very group question. I forgot about
> >> that
> >> > one.
> >> > > > > As
> >> > > > > >> the
> >> > > > > >> > > > > > > > > `java.util.regex.Pattern` is not fully
> >> compatible
> >> > with
> >> > > > > the
> >> > > > > >> > > engine
> >> > > > > >> > > > > that
> >> > > > > >> > > > > > > > > we plan to use, it might be better to
> deprecate
> >> > it and
> >> > > > > >> use an
> >> > > > > >> > > > > overload
> >> > > > > >> > > > > > > > > which takes a string. We would rely on the
> >> server
> >> > side
> >> > > > > >> > > validation.
> >> > > > > >> > > > > > > > > During the migration, I think that we could
> >> still
> >> > try to
> >> > > > > >> > > toString
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > regex and use it. That should work, I think,
> in
> >> > the
> >> > > > > >> majority
> >> > > > > >> > > of the
> >> > > > > >> > > > > > > > > cases.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 10. I agree with Justine that we should be
> >> > clearer
> >> > > > > about
> >> > > > > >> > the
> >> > > > > >> > > > > reason
> >> > > > > >> > > > > > > to
> >> > > > > >> > > > > > > > > > switch to IBP/metadata.version from the
> >> feature
> >> > flag.
> >> > > > > >> Maybe
> >> > > > > >> > > we
> >> > > > > >> > > > > mean
> >> > > > > >> > > > > > > that
> >> > > > > >> > > > > > > > > we
> >> > > > > >> > > > > > > > > > can switch the default for the feature flag
> >> to
> >> > true
> >> > > > > >> based
> >> > > > > >> > on
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > > > metadata.version once we want to make it
> the
> >> > default.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > My plan was to use that feature flag mainly
> >> > during the
> >> > > > > >> > > development
> >> > > > > >> > > > > > > > > phase. I should not have mentioned it, I
> think,
> >> > because
> >> > > > > we
> >> > > > > >> > > could
> >> > > > > >> > > > > use
> >> > > > > >> > > > > > > > > an internal config for it.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 11. Some of the protocol APIs don't mention
> >> the
> >> > > > > required
> >> > > > > >> > > ACLs, it
> >> > > > > >> > > > > > > would
> >> > > > > >> > > > > > > > > be
> >> > > > > >> > > > > > > > > > good to add that for consistency.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Noted.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 12. It is a bit odd that
> >> ConsumerGroupHeartbeat
> >> > > > > requires
> >> > > > > >> > > "Read
> >> > > > > >> > > > > Group"
> >> > > > > >> > > > > > > > > even
> >> > > > > >> > > > > > > > > > though it seems to do more than reading.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > I agree. This is how the current protocol
> works
> >> > though.
> >> > > > > We
> >> > > > > >> > only
> >> > > > > >> > > > > > > > > require "Read Group" to join a group. We
> could
> >> > consider
> >> > > > > >> > > changing
> >> > > > > >> > > > > this
> >> > > > > >> > > > > > > > > but I am not sure that it is worth it.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 13. How is topic recreation handled by the
> >> > consumer
> >> > > > > with
> >> > > > > >> > the
> >> > > > > >> > > new
> >> > > > > >> > > > > > > group
> >> > > > > >> > > > > > > > > > protocol? It would be good to have a
> section
> >> on
> >> > this.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Noted. From a protocol perspective, the new
> >> topic
> >> > will
> >> > > > > >> have a
> >> > > > > >> > > new
> >> > > > > >> > > > > > > > > topic id so it will treat it like a topic
> with
> >> a
> >> > > > > different
> >> > > > > >> > > name.
> >> > > > > >> > > > > The
> >> > > > > >> > > > > > > > > only issue is that the fetch/commit offsets
> >> APIs
> >> > do not
> >> > > > > >> > support
> >> > > > > >> > > > > topic
> >> > > > > >> > > > > > > > > IDs so the consumer would reuse the offsets
> >> based
> >> > on the
> >> > > > > >> > same.
> >> > > > > >> > > I
> >> > > > > >> > > > > think
> >> > > > > >> > > > > > > > > that we should update those APIs as well in
> >> order
> >> > to be
> >> > > > > >> > > consistent
> >> > > > > >> > > > > end
> >> > > > > >> > > > > > > > > to end. That would strengthen the semantics
> of
> >> the
> >> > > > > >> consumer.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 14. The KIP mentions we will write the new
> >> > coordinator
> >> > > > > >> in
> >> > > > > >> > > Java.
> >> > > > > >> > > > > Even
> >> > > > > >> > > > > > > > > though
> >> > > > > >> > > > > > > > > > this is an implementation detail, do we
> plan
> >> to
> >> > have a
> >> > > > > >> new
> >> > > > > >> > > gradle
> >> > > > > >> > > > > > > module
> >> > > > > >> > > > > > > > > > for it?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Yes.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 15. Do we have a scalability goal when it
> >> comes
> >> > to how
> >> > > > > >> many
> >> > > > > >> > > > > members
> >> > > > > >> > > > > > > the
> >> > > > > >> > > > > > > > > new
> >> > > > > >> > > > > > > > > > group protocol can support?
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > We don't have numbers at the moment. The
> >> protocol
> >> > should
> >> > > > > >> > > support
> >> > > > > >> > > > > 1000s
> >> > > > > >> > > > > > > > > of members per group. We will measure this
> when
> >> > we have
> >> > > > > a
> >> > > > > >> > first
> >> > > > > >> > > > > > > > > implementation. Note that we might have other
> >> > > > > bottlenecks
> >> > > > > >> > down
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > > road (e.g. offset commits).
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > > 16. Did we consider having
> SubscribedTopidIds
> >> > instead
> >> > > > > of
> >> > > > > >> > > > > > > > > > SubscribedTopicNames in
> >> > ConsumerGroupHeartbeatRequest?
> >> > > > > >> Is
> >> > > > > >> > the
> >> > > > > >> > > > > idea
> >> > > > > >> > > > > > > that
> >> > > > > >> > > > > > > > > > since we have to resolve the regex on the
> >> > server, we
> >> > > > > >> can do
> >> > > > > >> > > the
> >> > > > > >> > > > > same
> >> > > > > >> > > > > > > for
> >> > > > > >> > > > > > > > > > the topic name? The difference is that
> >> sending
> >> > the
> >> > > > > >> regex is
> >> > > > > >> > > more
> >> > > > > >> > > > > > > > > efficient
> >> > > > > >> > > > > > > > > > whereas sending the topic names is less
> >> > efficient.
> >> > > > > >> > > Furthermore,
> >> > > > > >> > > > > > > delete
> >> > > > > >> > > > > > > > > and
> >> > > > > >> > > > > > > > > > recreation is easier to handle if we have
> >> topic
> >> > ids.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > The idea was to consolidate the metadata
> lookup
> >> > on the
> >> > > > > >> server
> >> > > > > >> > > for
> >> > > > > >> > > > > both
> >> > > > > >> > > > > > > > > paths but I do agree with your point. As a
> >> second
> >> > > > > though,
> >> > > > > >> > using
> >> > > > > >> > > > > topic
> >> > > > > >> > > > > > > > > ids may be better here for the delete and
> >> > recreation
> >> > > > > case.
> >> > > > > >> > > Also, I
> >> > > > > >> > > > > > > > > suppose that we may allow users to subscribe
> >> with
> >> > topic
> >> > > > > >> ids
> >> > > > > >> > in
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > > future because that is the only way to be
> >> really
> >> > robust
> >> > > > > to
> >> > > > > >> > > topic
> >> > > > > >> > > > > > > > > re-creation.
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > Best,
> >> > > > > >> > > > > > > > > David
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > > > On Tue, Jul 12, 2022 at 1:38 PM David Jacot <
> >> > > > > >> > > dja...@confluent.io>
> >> > > > > >> > > > > > > wrote:
> >> > > > > >> > > > > > > > > >
> >> > > > > >> > > > > > > > > > Hi Justine,
> >> > > > > >> > > > > > > > > >
> >> > > > > >> > > > > > > > > > Thanks for your comments. Please find my
> >> answers
> >> > > > > below.
> >> > > > > >> > > > > > > > > >
> >> > > > > >> > > > > > > > > > - Yes, the new protocol relies on topic IDs
> >> > with the
> >> > > > > >> > > exception
> >> > > > > >> > > > > of the
> >> > > > > >> > > > > > > > > > topic names based in the
> >> > > > > ConsumerGroupHeartbeatRequest.
> >> > > > > >> I
> >> > > > > >> > am
> >> > > > > >> > > not
> >> > > > > >> > > > > sure
> >> > > > > >> > > > > > > > > > if using topic names is the right call
> here.
> >> I
> >> > need to
> >> > > > > >> > think
> >> > > > > >> > > > > about it
> >> > > > > >> > > > > > > > > > a little more. Obviously, the KIP does not
> >> > change the
> >> > > > > >> > > > > fetch/commit
> >> > > > > >> > > > > > > > > > offsets RPCs to use topic IDs. This may be
> >> > something
> >> > > > > >> that
> >> > > > > >> > we
> >> > > > > >> > > > > should
> >> > > > > >> > > > > > > > > > include though as it would give better
> >> overall
> >> > > > > >> guarantee in
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > > > producer.
> >> > > > > >> > > > > > > > > > - You're right. I think that I should not
> >> have
> >> > > > > mentioned
> >> > > > > >> > this
> >> > > > > >> > > > > flag at
> >> > > > > >> > > > > > > > > > all. I will remove it. We can use an
> internal
> >> > > > > >> configuration
> >> > > > > >> > > while
> >> > > > > >> > > > > > > > > > developing the feature.
> >> > > > > >> > > > > > > > > > - Both cluster types will be supported. The
> >> > change is
> >> > > > > >> > > > > orthogonal. The
> >> > > > > >> > > > > > > > > > only requirement is that the cluster uses
> >> topic
> >> > IDs.
> >> > > > > >> > > > > > > > > >
> >> > > > > >> > > > > > > > > > Best,
> >> > > > > >> > > > > > > > > > David
> >> > > > > >> > > > > > > > > >
> >> > > > > >> > > > > > > > > > On Mon, Jul 11, 2022 at 9:53 PM Guozhang
> >> Wang <
> >> > > > > >> > > > > wangg...@gmail.com>
> >> > > > > >> > > > > > > > > wrote:
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Hi Ismael,
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Thanks for the feedback. Here are some
> >> replies
> >> > > > > inlined
> >> > > > > >> > > below:
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > On Sat, Jul 9, 2022 at 2:53 AM Ismael
> Juma
> >> <
> >> > > > > >> > > ism...@juma.me.uk>
> >> > > > > >> > > > > > > wrote:
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > Thanks for the KIP. This has the
> >> potential
> >> > to be a
> >> > > > > >> > great
> >> > > > > >> > > > > > > > > improvement. A few
> >> > > > > >> > > > > > > > > > > > initial questions/comments:
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 1. I think it's premature to talk about
> >> > target
> >> > > > > >> versions
> >> > > > > >> > > for
> >> > > > > >> > > > > > > > > deprecation and
> >> > > > > >> > > > > > > > > > > > removal of the existing group protocol.
> >> > Unlike
> >> > > > > >> KRaft,
> >> > > > > >> > > this
> >> > > > > >> > > > > > > affects a
> >> > > > > >> > > > > > > > > core
> >> > > > > >> > > > > > > > > > > > client protocol and hence
> >> > deprecation/removal will
> >> > > > > >> be
> >> > > > > >> > > heavily
> >> > > > > >> > > > > > > > > dependent on
> >> > > > > >> > > > > > > > > > > > how quickly applications migrate to the
> >> new
> >> > > > > >> protocol.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Yeah I agree with you. I think we can
> >> remove
> >> > the
> >> > > > > >> proposed
> >> > > > > >> > > > > timeline
> >> > > > > >> > > > > > > in
> >> > > > > >> > > > > > > > > the
> >> > > > > >> > > > > > > > > > > `Compatibility, Deprecation, and
> Migration
> >> > Plan` and
> >> > > > > >> > > instead
> >> > > > > >> > > > > just
> >> > > > > >> > > > > > > state
> >> > > > > >> > > > > > > > > > > that we will decide in the future about
> >> when
> >> > we
> >> > > > > would
> >> > > > > >> > > > > deprecate old
> >> > > > > >> > > > > > > > > > > protocol and behaviors.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 2. The KIP says we intend to release
> this
> >> > in 4.x,
> >> > > > > >> but
> >> > > > > >> > it
> >> > > > > >> > > > > wasn't
> >> > > > > >> > > > > > > made
> >> > > > > >> > > > > > > > > clear
> >> > > > > >> > > > > > > > > > > > why. If we added that as a way to
> >> estimate
> >> > when
> >> > > > > we'd
> >> > > > > >> > > > > deprecate
> >> > > > > >> > > > > > > and
> >> > > > > >> > > > > > > > > remove
> >> > > > > >> > > > > > > > > > > > the group protocol, I also suggest
> >> removing
> >> > this
> >> > > > > >> part.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > I think that's not specifically related
> to
> >> the
> >> > > > > >> > > > > deprecation/removal
> >> > > > > >> > > > > > > > > timeline
> >> > > > > >> > > > > > > > > > > plan, but it's more for client upgrades.
> >> I.e.
> >> > the
> >> > > > > >> > > broker-side
> >> > > > > >> > > > > > > > > > > implementation may be done first, and
> then
> >> the
> >> > > > > client
> >> > > > > >> > side,
> >> > > > > >> > > > > and we
> >> > > > > >> > > > > > > > > would
> >> > > > > >> > > > > > > > > > > only mark it as "released" by the time
> >> clients
> >> > > > > >> > > implementations
> >> > > > > >> > > > > are
> >> > > > > >> > > > > > > > > done. At
> >> > > > > >> > > > > > > > > > > that time, to enable the feature the
> >> clients
> >> > need to
> >> > > > > >> > first
> >> > > > > >> > > > > swap-in
> >> > > > > >> > > > > > > the
> >> > > > > >> > > > > > > > > > > bytecode with a rolling bounce and then
> set
> >> > the flag
> >> > > > > >> > with a
> >> > > > > >> > > > > second
> >> > > > > >> > > > > > > > > rolling
> >> > > > > >> > > > > > > > > > > bounce, and hence we feel it's better to
> be
> >> > released
> >> > > > > >> in a
> >> > > > > >> > > major
> >> > > > > >> > > > > > > > > version.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 3. We need to flesh out the details of
> >> the
> >> > > > > migration
> >> > > > > >> > > story.
> >> > > > > >> > > > > It
> >> > > > > >> > > > > > > > > sounds like
> >> > > > > >> > > > > > > > > > > > we're saying we will support online
> >> > migrations. Is
> >> > > > > >> that
> >> > > > > >> > > > > correct?
> >> > > > > >> > > > > > > We
> >> > > > > >> > > > > > > > > should
> >> > > > > >> > > > > > > > > > > > explain this in detail. It could also
> be
> >> > done as a
> >> > > > > >> > > separate
> >> > > > > >> > > > > KIP,
> >> > > > > >> > > > > > > if
> >> > > > > >> > > > > > > > > it's
> >> > > > > >> > > > > > > > > > > > easier.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Yes I think that's the part we can be
> more
> >> > concrete
> >> > > > > >> about
> >> > > > > >> > > for
> >> > > > > >> > > > > sure
> >> > > > > >> > > > > > > (and
> >> > > > > >> > > > > > > > > > > this is related to your question 2)
> above).
> >> > We will
> >> > > > > >> work
> >> > > > > >> > on
> >> > > > > >> > > > > making
> >> > > > > >> > > > > > > it
> >> > > > > >> > > > > > > > > more
> >> > > > > >> > > > > > > > > > > explicit in parallel as we solicit more
> >> > feedback.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 4. I am happy that we are pushing the
> >> > pattern
> >> > > > > >> > > subscriptions
> >> > > > > >> > > > > to
> >> > > > > >> > > > > > > the
> >> > > > > >> > > > > > > > > server,
> >> > > > > >> > > > > > > > > > > > but it seems like there could be some
> >> tricky
> >> > > > > >> > > compatibility
> >> > > > > >> > > > > > > issues.
> >> > > > > >> > > > > > > > > Will we
> >> > > > > >> > > > > > > > > > > > have a mechanism for users to detect
> that
> >> > they
> >> > > > > need
> >> > > > > >> to
> >> > > > > >> > > update
> >> > > > > >> > > > > > > their
> >> > > > > >> > > > > > > > > regex
> >> > > > > >> > > > > > > > > > > > before switching to the new protocol?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Yes I think we need some tooling for
> >> non-java
> >> > client
> >> > > > > >> > users
> >> > > > > >> > > to
> >> > > > > >> > > > > sort
> >> > > > > >> > > > > > > of
> >> > > > > >> > > > > > > > > > > "dry-run" the client before switching to
> >> the
> >> > new
> >> > > > > >> > protocol.
> >> > > > > >> > > I
> >> > > > > >> > > > > do not
> >> > > > > >> > > > > > > > > have a
> >> > > > > >> > > > > > > > > > > specific idea on top of my head though,
> >> maybe
> >> > others
> >> > > > > >> like
> >> > > > > >> > > @Matt
> >> > > > > >> > > > > > > > > Howlett can
> >> > > > > >> > > > > > > > > > > chime-in here?
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 5. Related to the last question, will
> the
> >> > Java
> >> > > > > >> client
> >> > > > > >> > > allow
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > users to
> >> > > > > >> > > > > > > > > > > > stick with the current regex engine for
> >> > > > > >> compatibility
> >> > > > > >> > > > > reasons?
> >> > > > > >> > > > > > > For
> >> > > > > >> > > > > > > > > example,
> >> > > > > >> > > > > > > > > > > > it may be handy to keep using client
> >> based
> >> > regex
> >> > > > > at
> >> > > > > >> > > first to
> >> > > > > >> > > > > keep
> >> > > > > >> > > > > > > > > > > > migrations simple and then migrate to
> >> > server based
> >> > > > > >> > > regexes
> >> > > > > >> > > > > as a
> >> > > > > >> > > > > > > > > second
> >> > > > > >> > > > > > > > > > > > step.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Honestly I have not thought about that
> for
> >> > java
> >> > > > > >> clients,
> >> > > > > >> > > and
> >> > > > > >> > > > > we can
> >> > > > > >> > > > > > > > > discuss
> >> > > > > >> > > > > > > > > > > that. What kind of compatibility issues
> do
> >> > you have
> >> > > > > in
> >> > > > > >> > > mind?
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 6. When we say that the group
> coordinator
> >> > will be
> >> > > > > >> > > > > responsible for
> >> > > > > >> > > > > > > > > storing
> >> > > > > >> > > > > > > > > > > > the configurations and that the
> >> > configurations
> >> > > > > will
> >> > > > > >> be
> >> > > > > >> > > > > deleted
> >> > > > > >> > > > > > > when
> >> > > > > >> > > > > > > > > the
> >> > > > > >> > > > > > > > > > > > group is deleted. Will a transition to
> >> DEAD
> >> > > > > trigger
> >> > > > > >> > > deletion
> >> > > > > >> > > > > of
> >> > > > > >> > > > > > > > > > > > configurations?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Yes, since the DEAD state is an ending
> >> state
> >> > (we
> >> > > > > would
> >> > > > > >> > only
> >> > > > > >> > > > > > > transit to
> >> > > > > >> > > > > > > > > that
> >> > > > > >> > > > > > > > > > > state when the group is EMPTY and also
> all
> >> of
> >> > its
> >> > > > > >> > metadata
> >> > > > > >> > > are
> >> > > > > >> > > > > > > gone),
> >> > > > > >> > > > > > > > > once
> >> > > > > >> > > > > > > > > > > it's transited to DEAD this group would
> >> never
> >> > be
> >> > > > > >> revived.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 7. Will the choice to store the configs
> >> in
> >> > the
> >> > > > > group
> >> > > > > >> > > > > coordinator
> >> > > > > >> > > > > > > > > make it
> >> > > > > >> > > > > > > > > > > > harder to list all cluster configs and
> >> their
> >> > > > > values?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > That's a good question, and our thoughts
> >> are
> >> > that
> >> > > > > the
> >> > > > > >> > > so-called
> >> > > > > >> > > > > > > "group
> >> > > > > >> > > > > > > > > > > configurations" are overrides of the
> >> > cluster-level
> >> > > > > >> > > > > configurations
> >> > > > > >> > > > > > > > > > > customized per group so when an admin
> list
> >> > cluster
> >> > > > > >> > configs
> >> > > > > >> > > it's
> >> > > > > >> > > > > > > okay to
> >> > > > > >> > > > > > > > > > > list just the cluster-level "defaults",
> not
> >> > showing
> >> > > > > >> any
> >> > > > > >> > > > > per-group
> >> > > > > >> > > > > > > > > > > customizations.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 8. How would someone configure a group
> >> > before
> >> > > > > >> starting
> >> > > > > >> > > the
> >> > > > > >> > > > > > > > > consumers? Have
> >> > > > > >> > > > > > > > > > > > we considered allowing the explicit
> >> > creation of
> >> > > > > >> groups?
> >> > > > > >> > > > > > > > > Alternatively, the
> >> > > > > >> > > > > > > > > > > > configs could be decoupled from the
> group
> >> > > > > lifecycle.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > The configs can be created before the
> group
> >> > itself
> >> > > > > as
> >> > > > > >> an
> >> > > > > >> > > > > > > independent
> >> > > > > >> > > > > > > > > entity
> >> > > > > >> > > > > > > > > > > --- of course, this requires the
> >> corresponding
> >> > > > > >> request to
> >> > > > > >> > > be
> >> > > > > >> > > > > > > routed to
> >> > > > > >> > > > > > > > > the
> >> > > > > >> > > > > > > > > > > right coordinator based on the group id
> ---
> >> > the only
> >> > > > > >> > thing
> >> > > > > >> > > that
> >> > > > > >> > > > > > > > > differs is,
> >> > > > > >> > > > > > > > > > > when the group itself is gone we also
> check
> >> > if there
> >> > > > > >> are
> >> > > > > >> > > any
> >> > > > > >> > > > > > > > > configuration
> >> > > > > >> > > > > > > > > > > entities related to that group and delete
> >> as
> >> > well.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Admittedly this indeed introduces an
> >> > asymmetry on
> >> > > > > the
> >> > > > > >> > > creation
> >> > > > > >> > > > > /
> >> > > > > >> > > > > > > > > deletion
> >> > > > > >> > > > > > > > > > > lifecycles of the config entities, and we
> >> > would like
> >> > > > > >> to
> >> > > > > >> > > hear
> >> > > > > >> > > > > > > everyone's
> >> > > > > >> > > > > > > > > > > feelings whether we should aim for
> symmetry
> >> > i.e.
> >> > > > > >> totally
> >> > > > > >> > > > > decouple
> >> > > > > >> > > > > > > group
> >> > > > > >> > > > > > > > > > > configs and hence not delete them at all
> >> when
> >> > the
> >> > > > > >> group
> >> > > > > >> > is
> >> > > > > >> > > > > gone,
> >> > > > > >> > > > > > > but
> >> > > > > >> > > > > > > > > always
> >> > > > > >> > > > > > > > > > > require explicit deletion operations by
> >> > themselves.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 9. Will the Consumer.subscribe method
> for
> >> > the Java
> >> > > > > >> > client
> >> > > > > >> > > > > still
> >> > > > > >> > > > > > > take
> >> > > > > >> > > > > > > > > a
> >> > > > > >> > > > > > > > > > > > `java.util.regex.Pattern` of do we have
> >> to
> >> > > > > >> introduce an
> >> > > > > >> > > > > overload?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > I think we do not need to introduce an
> >> > overload, but
> >> > > > > >> I'm
> >> > > > > >> > > all
> >> > > > > >> > > > > ears
> >> > > > > >> > > > > > > if
> >> > > > > >> > > > > > > > > there
> >> > > > > >> > > > > > > > > > > may be some compatibility issues that we
> >> may
> >> > > > > overlook.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 10. I agree with Justine that we should
> >> be
> >> > clearer
> >> > > > > >> > about
> >> > > > > >> > > the
> >> > > > > >> > > > > > > reason
> >> > > > > >> > > > > > > > > to
> >> > > > > >> > > > > > > > > > > > switch to IBP/metadata.version from the
> >> > feature
> >> > > > > >> flag.
> >> > > > > >> > > Maybe
> >> > > > > >> > > > > we
> >> > > > > >> > > > > > > mean
> >> > > > > >> > > > > > > > > that we
> >> > > > > >> > > > > > > > > > > > can switch the default for the feature
> >> flag
> >> > to
> >> > > > > true
> >> > > > > >> > > based on
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > > > > metadata.version once we want to make
> it
> >> the
> >> > > > > >> default.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > 11. Some of the protocol APIs don't
> mention
> >> > the
> >> > > > > >> required
> >> > > > > >> > > ACLs,
> >> > > > > >> > > > > it
> >> > > > > >> > > > > > > > > would be
> >> > > > > >> > > > > > > > > > > > good to add that for consistency.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Ack, we can certainly do that.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 12. It is a bit odd that
> >> > ConsumerGroupHeartbeat
> >> > > > > >> > requires
> >> > > > > >> > > > > "Read
> >> > > > > >> > > > > > > > > Group" even
> >> > > > > >> > > > > > > > > > > > though it seems to do more than
> reading.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > I had that thought myself as well, but in
> >> the
> >> > end we
> >> > > > > >> > could
> >> > > > > >> > > not
> >> > > > > >> > > > > > > find a
> >> > > > > >> > > > > > > > > > > better alternative: adding Write Group
> >> seems
> >> > an
> >> > > > > >> overkill
> >> > > > > >> > > here
> >> > > > > >> > > > > > > since we
> >> > > > > >> > > > > > > > > do
> >> > > > > >> > > > > > > > > > > not have it elsewhere (we only have Read
> /
> >> > Delete
> >> > > > > and
> >> > > > > >> > > Describe
> >> > > > > >> > > > > on
> >> > > > > >> > > > > > > > > groups so
> >> > > > > >> > > > > > > > > > > far). Would like to hear others thoughts.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 13. How is topic recreation handled by
> >> the
> >> > > > > consumer
> >> > > > > >> > with
> >> > > > > >> > > the
> >> > > > > >> > > > > new
> >> > > > > >> > > > > > > > > group
> >> > > > > >> > > > > > > > > > > > protocol? It would be good to have a
> >> > section on
> >> > > > > >> this.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > You mean with regex subscription right?
> Yes
> >> > we can
> >> > > > > >> add a
> >> > > > > >> > > > > section
> >> > > > > >> > > > > > > about
> >> > > > > >> > > > > > > > > > > that, but basically the idea is that
> >> consumer
> >> > would
> >> > > > > be
> >> > > > > >> > > totally
> >> > > > > >> > > > > > > > > agnostic in
> >> > > > > >> > > > > > > > > > > the new protocol as it's handled all by
> the
> >> > brokers.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 14. The KIP mentions we will write the
> >> new
> >> > > > > >> coordinator
> >> > > > > >> > in
> >> > > > > >> > > > > Java.
> >> > > > > >> > > > > > > Even
> >> > > > > >> > > > > > > > > though
> >> > > > > >> > > > > > > > > > > > this is an implementation detail, do we
> >> > plan to
> >> > > > > >> have a
> >> > > > > >> > > new
> >> > > > > >> > > > > gradle
> >> > > > > >> > > > > > > > > module
> >> > > > > >> > > > > > > > > > > > for it?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > We have not thought about that. But I
> think
> >> > the
> >> > > > > answer
> >> > > > > >> > > should
> >> > > > > >> > > > > be
> >> > > > > >> > > > > > > yes.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 15. Do we have a scalability goal when
> it
> >> > comes to
> >> > > > > >> how
> >> > > > > >> > > many
> >> > > > > >> > > > > > > members
> >> > > > > >> > > > > > > > > the new
> >> > > > > >> > > > > > > > > > > > group protocol can support?
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > Within a group, I think we should shoot
> for
> >> > 1000s of
> >> > > > > >> > > members.
> >> > > > > >> > > > > But
> >> > > > > >> > > > > > > that
> >> > > > > >> > > > > > > > > > > scalability goals also depend on the
> offset
> >> > > > > management
> >> > > > > >> > > (commit,
> >> > > > > >> > > > > > > fetch)
> >> > > > > >> > > > > > > > > > > capabilities of the coordinator which we
> >> did
> >> > not
> >> > > > > >> cover in
> >> > > > > >> > > this
> >> > > > > >> > > > > > > KIP, so
> >> > > > > >> > > > > > > > > it's
> >> > > > > >> > > > > > > > > > > hard to give a number that applies
> >> > universally.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > 16. Did we consider having
> >> > SubscribedTopidIds
> >> > > > > >> instead
> >> > > > > >> > of
> >> > > > > >> > > > > > > > > > > > SubscribedTopicNames in
> >> > > > > >> ConsumerGroupHeartbeatRequest?
> >> > > > > >> > > Is the
> >> > > > > >> > > > > > > idea
> >> > > > > >> > > > > > > > > that
> >> > > > > >> > > > > > > > > > > > since we have to resolve the regex on
> the
> >> > server,
> >> > > > > we
> >> > > > > >> > can
> >> > > > > >> > > do
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > same
> >> > > > > >> > > > > > > > > for
> >> > > > > >> > > > > > > > > > > > the topic name? The difference is that
> >> > sending the
> >> > > > > >> > regex
> >> > > > > >> > > is
> >> > > > > >> > > > > more
> >> > > > > >> > > > > > > > > efficient
> >> > > > > >> > > > > > > > > > > > whereas sending the topic names is less
> >> > efficient.
> >> > > > > >> > > > > Furthermore,
> >> > > > > >> > > > > > > > > delete and
> >> > > > > >> > > > > > > > > > > > recreation is easier to handle if we
> have
> >> > topic
> >> > > > > ids.
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > The main reason to still let the clients
> >> send
> >> > names
> >> > > > > >> is to
> >> > > > > >> > > keep
> >> > > > > >> > > > > the
> >> > > > > >> > > > > > > > > > > reasoning of names -> ids on the broker /
> >> > admin
> >> > > > > client
> >> > > > > >> > > only.
> >> > > > > >> > > > > Note
> >> > > > > >> > > > > > > that
> >> > > > > >> > > > > > > > > > > although we added topic id in KIP-516, we
> >> > never
> >> > > > > >> > > implemented the
> >> > > > > >> > > > > > > logic
> >> > > > > >> > > > > > > > > on
> >> > > > > >> > > > > > > > > > > consumer/producers leveraging the related
> >> > newer
> >> > > > > >> versioned
> >> > > > > >> > > RPCs,
> >> > > > > >> > > > > > > > > instead we
> >> > > > > >> > > > > > > > > > > just set the topic id as empty UUID. We
> >> want
> >> > to keep
> >> > > > > >> the
> >> > > > > >> > > > > > > > > consumer/producer
> >> > > > > >> > > > > > > > > > > to be thin and only delegate the
> reasoning
> >> on
> >> > broker
> >> > > > > >> and
> >> > > > > >> > > > > > > potentially
> >> > > > > >> > > > > > > > > admin
> >> > > > > >> > > > > > > > > > > clients.
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > Thanks,
> >> > > > > >> > > > > > > > > > > > Ismael
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > On Wed, Jul 6, 2022 at 10:45 AM David
> >> Jacot
> >> > > > > >> > > > > > > > > <dja...@confluent.io.invalid>
> >> > > > > >> > > > > > > > > > > > wrote:
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > Hi all,
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > I would like to start a discussion
> >> thread
> >> > on
> >> > > > > >> KIP-848:
> >> > > > > >> > > The
> >> > > > > >> > > > > Next
> >> > > > > >> > > > > > > > > > > > > Generation of the Consumer Rebalance
> >> > Protocol.
> >> > > > > >> With
> >> > > > > >> > > this
> >> > > > > >> > > > > KIP,
> >> > > > > >> > > > > > > we
> >> > > > > >> > > > > > > > > aim
> >> > > > > >> > > > > > > > > > > > > to make the rebalance protocol (for
> >> > consumers)
> >> > > > > >> more
> >> > > > > >> > > > > reliable,
> >> > > > > >> > > > > > > more
> >> > > > > >> > > > > > > > > > > > > scalable, easier to implement for
> >> > clients, and
> >> > > > > >> easier
> >> > > > > >> > > to
> >> > > > > >> > > > > debug
> >> > > > > >> > > > > > > for
> >> > > > > >> > > > > > > > > > > > > operators.
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > The KIP is here:
> >> > > > > >> > > > > https://cwiki.apache.org/confluence/x/HhD1D.
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > Please take a look and let me know
> what
> >> > you
> >> > > > > think.
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > Best,
> >> > > > > >> > > > > > > > > > > > > David
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > > > PS: I will be away from July 18th to
> >> > August 8th.
> >> > > > > >> That
> >> > > > > >> > > gives
> >> > > > > >> > > > > > > you a
> >> > > > > >> > > > > > > > > bit
> >> > > > > >> > > > > > > > > > > > > of time to read and digest this long
> >> KIP.
> >> > > > > >> > > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > >
> >> > > > > >> > > > > > > > > > > --
> >> > > > > >> > > > > > > > > > > -- Guozhang
> >> > > > > >> > > > > > > > >
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > >
> >> > > > > >> > > > > > > > --
> >> > > > > >> > > > > > > > -- Guozhang
> >> > > > > >> > > > > > >
> >> > > > > >> > > > >
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> >
> >> > > > > >>
> >> > > > > >>
> >> > > > > >> --
> >> > > > > >> -- Guozhang
> >> > > > > >>
> >> > > > > >
> >> > > > >
> >> >
> >>
> >
>

Reply via email to