Hi,

I've updated the KIP with benchmark results focusing more directly on the
redirect case, please review and +1 in the voting thread if convinced.

Thank you,
Crispin

On Wed, Jul 26, 2023 at 11:13 PM Luke Chen <show...@gmail.com> wrote:

> Thanks for adding the benchmark results, Crispin!
> IMO, 2~5% performance improvement is small, but given the change is small,
> cost is also small (only append endpoint info when NOT_LEADER_OR_FOLLOWER..
> etc error), I think it is worth doing it.
>
> Thank you.
> Luke
>
> On Wed, Jul 26, 2023 at 12:33 AM Ismael Juma <m...@ismaeljuma.com> wrote:
>
> > Thanks Crispin!
> >
> > Ismael
> >
> > On Mon, Jul 24, 2023 at 8:16 PM Crispin Bernier
> > <cbern...@confluent.io.invalid> wrote:
> >
> > > I updated the wiki to include both results along with their average.
> > >
> > > Thank you,
> > > Crispin
> > >
> > > On Mon, Jul 24, 2023 at 10:58 AM Ismael Juma <m...@ismaeljuma.com>
> wrote:
> > >
> > > > Hi Crispin,
> > > >
> > > > One additional question, the wiki says "The results are averaged
> over 2
> > > > runs.". Can you please provide some measure of variance in the
> > > > distribution, i.e. were both results similar to each other for both
> > > cases?
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Jul 21, 2023 at 11:31 AM Ismael Juma <m...@ismaeljuma.com>
> > wrote:
> > > >
> > > > > Thanks for the update Crispin - very helpful to have actual
> > performance
> > > > > data. 2-5% for the default configuration is a bit on the low side
> for
> > > > this
> > > > > kind of proposal.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Jul 20, 2023 at 11:33 PM Crispin Bernier
> > > > > <cbern...@confluent.io.invalid> wrote:
> > > > >
> > > > >> Benchmark numbers have been posted on the KIP, please review.
> > > > >>
> > > > >> On 2023/07/20 13:03:00 Mayank Shekhar Narula wrote:
> > > > >> > Jun
> > > > >> >
> > > > >> > Thanks for the feedback.
> > > > >> >
> > > > >> > Numbers to follow.
> > > > >> >
> > > > >> > If we don't plan to
> > > > >> > > bump up the FetchResponse version, we could just remove the
> > > > reference
> > > > >> to
> > > > >> > > version 16.
> > > > >> >
> > > > >> > Fixed.
> > > > >> >
> > > > >> > On Thu, Jul 20, 2023 at 1:28 AM Jun Rao
> > <ju...@confluent.io.invalid
> > > >
> > > > >> wrote:
> > > > >> >
> > > > >> > > Hi, Mayank,
> > > > >> > >
> > > > >> > > Thanks for the KIP. I agree with others that it would be
> useful
> > to
> > > > >> see the
> > > > >> > > performance results. Otherwise, just a minor comment. If we
> > don't
> > > > >> plan to
> > > > >> > > bump up the FetchResponse version, we could just remove the
> > > > reference
> > > > >> to
> > > > >> > > version 16.
> > > > >> > >
> > > > >> > > Jun
> > > > >> > >
> > > > >> > > On Wed, Jul 19, 2023 at 2:31 PM Mayank Shekhar Narula <
> > > > >> > > mayanks.nar...@gmail.com> wrote:
> > > > >> > >
> > > > >> > > > Luke
> > > > >> > > >
> > > > >> > > > Thanks for the interest in the KIP.
> > > > >> > > >
> > > > >> > > > But what if the consumer was fetching from the follower?
> > > > >> > > >
> > > > >> > > > We already include `PreferredReadReplica` in the fetch
> > response.
> > > > >> > > > > Should we put the node info of PreferredReadReplica under
> > this
> > > > >> case,
> > > > >> > > > > instead of the leader's info?
> > > > >> > > > >
> > > > >> > > >
> > > > >> > > > PreferredReadReplica is the decided on the leader. Looking
> at
> > > the
> > > > >> Java
> > > > >> > > > client code, AbstractFetch::selectReadReplica, first fetch
> > > request
> > > > >> goes
> > > > >> > > to
> > > > >> > > > Leader of the partition -> Sends back PreferredReadReplica
> ->
> > > Next
> > > > >> fetch
> > > > >> > > > uses PreferredReadReplica. So as long as leader is
> available,
> > > > >> > > > PreferredReadReplica would be found in subsequent fetches.
> > > > >> > > >
> > > > >> > > > Also, under this case, should we include the leader's info
> in
> > > the
> > > > >> > > response?
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > In this case, I think the follower would fail the fetch if
> it
> > > > knows
> > > > >> a
> > > > >> > > > different leader. If the follower knows a newer leader, it
> > would
> > > > >> return
> > > > >> > > new
> > > > >> > > > leader information in the response, for the client to act
> on.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Will we include the leader/node info in the response when
> > having
> > > > >> > > > > `UNKNOWN_LEADER_EPOCH` error?
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > My understanding is UNKNOWN_LEADER_EPOCH when a request
> from a
> > > > >> client
> > > > >> > > has a
> > > > >> > > > newer epoch than the broker. So the client is already up to
> > date
> > > > on
> > > > >> new
> > > > >> > > > leader information, it's the broker that has the catching up
> > to
> > > > do.
> > > > >> I
> > > > >> > > think
> > > > >> > > > there might be some optimisations to make sure the broker
> > > > refreshes
> > > > >> its
> > > > >> > > > metadata quickly, so it can quickly recover to handle
> requests
> > > > that
> > > > >> > > > previously returned UNKNOWN_LEADER_EPOCH. But this work is
> > > outside
> > > > >> the
> > > > >> > > > scope of this KIP, as for now this KIP focusses on
> client-side
> > > > >> > > > optimisations.
> > > > >> > > >
> > > > >> > > > Mayank
> > > > >> > > >
> > > > >> > > > On Tue, Jul 18, 2023 at 8:51 AM Luke Chen <sh...@gmail.com>
> > > > wrote:
> > > > >> > > >
> > > > >> > > > > Hi Mayank,
> > > > >> > > > >
> > > > >> > > > > Thanks for the KIP!
> > > > >> > > > >
> > > > >> > > > > Some questions:
> > > > >> > > > > 1. I can see most of the cases we only care about consumer
> > > fetch
> > > > >> from
> > > > >> > > the
> > > > >> > > > > leader.
> > > > >> > > > > But what if the consumer was fetching from the follower?
> > > > >> > > > > We already include `PreferredReadReplica` in the fetch
> > > response.
> > > > >> > > > > Should we put the node info of PreferredReadReplica under
> > this
> > > > >> case,
> > > > >> > > > > instead of the leader's info?
> > > > >> > > > > Also, under this case, should we include the leader's info
> > in
> > > > the
> > > > >> > > > response?
> > > > >> > > > >
> > > > >> > > > > 2. Will we include the leader/node info in the response
> when
> > > > >> having
> > > > >> > > > > `UNKNOWN_LEADER_EPOCH` error?
> > > > >> > > > > I think it's fine we ignore the `UNKNOWN_LEADER_EPOCH`
> error
> > > > >> since when
> > > > >> > > > > this happens, the node might have some error which should
> > > > refresh
> > > > >> the
> > > > >> > > > > metadata. On the other hand, it might also be good if we
> can
> > > > heal
> > > > >> the
> > > > >> > > > node
> > > > >> > > > > soon to do produce/consume works.
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > Thank you.
> > > > >> > > > > Luke
> > > > >> > > > >
> > > > >> > > > > On Tue, Jul 18, 2023 at 2:00 AM Philip Nee <
> ph...@gmail.com
> > >
> > > > >> > > wrote:
> > > > >> > > > >
> > > > >> > > > > > Hey Mayank:
> > > > >> > > > > >
> > > > >> > > > > > For #1: I think fetch and produce behave a bit
> differently
> > > on
> > > > >> > > metadata.
> > > > >> > > > > > Maybe it is worth highlighting the changes for each
> client
> > > in
> > > > >> detail.
> > > > >> > > > In
> > > > >> > > > > > producer did you mean by the metadata timeout before
> > sending
> > > > out
> > > > >> > > > produce
> > > > >> > > > > > requests? For consumer: I think for fetches it requires
> > user
> > > > to
> > > > >> retry
> > > > >> > > > if
> > > > >> > > > > > the position does not exist on the leader. I don't have
> > the
> > > > >> detail on
> > > > >> > > > top
> > > > >> > > > > > of my head, but I think we should lay out these
> behavioral
> > > > >> changes.
> > > > >> > > > > >
> > > > >> > > > > > For #3: Thanks for the clarification.
> > > > >> > > > > >
> > > > >> > > > > > On Mon, Jul 17, 2023 at 10:39 AM Mayank Shekhar Narula <
> > > > >> > > > > > mayanks.nar...@gmail.com> wrote:
> > > > >> > > > > >
> > > > >> > > > > > > Philip
> > > > >> > > > > > >
> > > > >> > > > > > > 1. Good call out about "poll" behaviour, my
> > understanding
> > > is
> > > > >> the
> > > > >> > > > same.
> > > > >> > > > > I
> > > > >> > > > > > am
> > > > >> > > > > > > assuming it's about the motivation of the KIP. There
> > with
> > > > >> async, my
> > > > >> > > > > > > intention was to convey that the client doesn't wait
> for
> > > the
> > > > >> > > > > > > metadata-refresh before a subsequent retry of the
> > produce
> > > or
> > > > >> fetch
> > > > >> > > > > > request
> > > > >> > > > > > > that failed due to stale metadata(i.e. going to an old
> > > > >> leader). The
> > > > >> > > > > only
> > > > >> > > > > > > wait client has is the configured retry-delay.
> > > > >> > > > > > >
> > > > >> > > > > > > 2. Yes, in theory other APIs could benefit from this
> > too.
> > > > But
> > > > >> that
> > > > >> > > is
> > > > >> > > > > > > outside of the scope of the KIP.
> > > > >> > > > > > >
> > > > >> > > > > > > 3. Do you mean the response for the Metadata RPC? I
> > think
> > > > >> brokers
> > > > >> > > > > always
> > > > >> > > > > > > have a view of the cluster, although it can be
> stale,it
> > > > would
> > > > >> > > always
> > > > >> > > > > > return
> > > > >> > > > > > > a leader(whether old or new).
> > > > >> > > > > > >
> > > > >> > > > > > > Mayank
> > > > >> > > > > > >
> > > > >> > > > > > > On Fri, Jul 14, 2023 at 8:53 PM Philip Nee <
> > > ph...@gmail.com
> > > > >
> > > > >> > > > > wrote:
> > > > >> > > > > > >
> > > > >> > > > > > > > Hey Mayank,
> > > > >> > > > > > > >
> > > > >> > > > > > > > Thanks for the KIP. I think this is a great
> proposal,
> > > and
> > > > >> I'm in
> > > > >> > > > > favor
> > > > >> > > > > > > > of this idea.  A few comments:
> > > > >> > > > > > > >
> > > > >> > > > > > > > 1. Claiming metadata refresh is done asynchronously
> is
> > > > >> > > misleading.
> > > > >> > > > > The
> > > > >> > > > > > > > metadata refresh requires Network Client to be
> > > physically
> > > > >> polled,
> > > > >> > > > > which
> > > > >> > > > > > > is
> > > > >> > > > > > > > done in a separate thread in Producer and Admin
> Client
> > > > >> (IIRC!)
> > > > >> > > but
> > > > >> > > > > not
> > > > >> > > > > > > > Consumer.
> > > > >> > > > > > > > 2. There are other API calls that might result in
> > > > >> > > > > > NOT_LEADER_OR_FOLLOWER
> > > > >> > > > > > > > response, but it seems like this KIP only wants to
> > > update
> > > > on
> > > > >> > > fetch
> > > > >> > > > > and
> > > > >> > > > > > > > produce. Do we want to make the leader information
> > > > >> available for
> > > > >> > > > > other
> > > > >> > > > > > > API
> > > > >> > > > > > > > calls?
> > > > >> > > > > > > > 3. Do you know what would happen during a leader
> > > election?
> > > > >> I'm
> > > > >> > > not
> > > > >> > > > > sure
> > > > >> > > > > > > > about this process and I wonder if the current
> > metadata
> > > > >> response
> > > > >> > > > uses
> > > > >> > > > > > the
> > > > >> > > > > > > > old leader or null as the leader isn't readily
> > available
> > > > >> yet.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Thanks,
> > > > >> > > > > > > > P
> > > > >> > > > > > > >
> > > > >> > > > > > > > On Fri, Jul 14, 2023 at 11:30 AM Kirk True <
> > > > >> ki...@kirktrue.pro>
> > > > >> > > > > wrote:
> > > > >> > > > > > > >
> > > > >> > > > > > > > > Hi Mayank,
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > > On Jul 14, 2023, at 11:25 AM, Mayank Shekhar
> > Narula
> > > <
> > > > >> > > > > > > > > mayanks.nar...@gmail.com> wrote:
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > Kirk
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > >> Is the requested restructuring of the response
> > > > >> “simply” to
> > > > >> > > > > > preserve
> > > > >> > > > > > > > > bytes,
> > > > >> > > > > > > > > >> or is it possible that the fetch response
> > > > >> could/should/would
> > > > >> > > > > > return
> > > > >> > > > > > > > > >> leadership changes for partitions that we’re
> > > > >> specifically
> > > > >> > > > > > requested?
> > > > >> > > > > > > > > >>
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > Moving endpoints to top-level fields would
> > preserve
> > > > >> bytes,
> > > > >> > > > > > otherwise
> > > > >> > > > > > > > the
> > > > >> > > > > > > > > > endpoint-information would be duplicated if
> > included
> > > > >> with the
> > > > >> > > > > > > > > > partition-data in the response. Right now, the
> > > > top-level
> > > > >> > > field
> > > > >> > > > > will
> > > > >> > > > > > > > only
> > > > >> > > > > > > > > be
> > > > >> > > > > > > > > > set in case leader changes for any requested
> > > > >> partitions. But
> > > > >> > > it
> > > > >> > > > > can
> > > > >> > > > > > > be
> > > > >> > > > > > > > > > re-used in the future, for which Jose has a
> > use-case
> > > > in
> > > > >> mind
> > > > >> > > > > shared
> > > > >> > > > > > > up
> > > > >> > > > > > > > in
> > > > >> > > > > > > > > > the thread. KIP is now upto date with endpoint
> > info
> > > > >> being at
> > > > >> > > > > > > top-level.
> > > > >> > > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > I didn’t catch before that there was a separate
> > > section
> > > > >> for the
> > > > >> > > > > full
> > > > >> > > > > > > node
> > > > >> > > > > > > > > information, not just the ID and epoch.
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > Thanks!
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > >>> 3. In the future, I may use this information
> in
> > > the
> > > > >> > > > > > KRaft/Metadata
> > > > >> > > > > > > > > >>> implementation of FETCH. In that
> implementation
> > > not
> > > > >> all of
> > > > >> > > > the
> > > > >> > > > > > > > > >>> replicas are brokers.
> > > > >> > > > > > > > > >>
> > > > >> > > > > > > > > >> Side point: any references to the change you’re
> > > > >> referring
> > > > >> > > to?
> > > > >> > > > > The
> > > > >> > > > > > > idea
> > > > >> > > > > > > > > of
> > > > >> > > > > > > > > >> non-brokers serving as replicas is blowing my
> > mind
> > > a
> > > > >> bit :)
> > > > >> > > > > > > > > >>
> > > > >> > > > > > > > > >>
> > > > >> > > > > > > > > > Jose, I missed this as well, would love to know
> > more
> > > > >> about
> > > > >> > > > > > non-broker
> > > > >> > > > > > > > > > serving as replica!
> > > > >> > > > > > > > > > --
> > > > >> > > > > > > > > > Regards,
> > > > >> > > > > > > > > > Mayank Shekhar Narula
> > > > >> > > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > --
> > > > >> > > > > > > Regards,
> > > > >> > > > > > > Mayank Shekhar Narula
> > > > >> > > > > > >
> > > > >> > > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > --
> > > > >> > > > Regards,
> > > > >> > > > Mayank Shekhar Narula
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >> > --
> > > > >> > Regards,
> > > > >> > Mayank Shekhar Narula
> > > > >> >
> > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to