Hi Lucas,
I agree. If we want to go forward with a separate controller plane and data
plane and completely isolate them, having a separate port for the controller
with a separate Acceptor and a Processor sounds ideal to me.

Thanks,

Mayuresh


On Mon, Jul 23, 2018 at 11:04 PM Becket Qin <becket....@gmail.com> wrote:

> Hi Lucas,
>
> Yes, I agree that a dedicated end to end control flow would be ideal.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Jul 24, 2018 at 1:05 PM, Lucas Wang <lucasatu...@gmail.com> wrote:
>
> > Thanks for the comment, Becket.
> > So far, we've been trying to avoid making any request handler thread
> > special.
> > But if we were to follow that path in order to make the two planes more
> > isolated,
> > what do you think about also having a dedicated processor thread,
> > and dedicated port for the controller?
> >
> > Today one processor thread can handle multiple connections, let's say 100
> > connections represented by connection0, ... connection99, among which
> > connection0-98 are from clients, while connection99 is from the
> > controller. Further let's say after one selector polling, there are
> > incoming requests on all connections.
> >
> > When the request queue is full (either the data request queue being full
> > in the two-queue design, or the one single queue being full in the deque
> > design), the processor thread will be blocked first when trying to enqueue
> > the data request from connection0, then possibly blocked for the data
> > request from connection1, etc., even though the controller request is
> > ready to be enqueued.
> >
> > To solve this problem, it seems we would need to have a separate port
> > dedicated to the controller, a dedicated processor thread, a dedicated
> > controller request queue, and pinning of one request handler thread for
> > controller requests.
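
A minimal sketch of the stall described above, as a simulation rather than Kafka code. The class and method names are hypothetical, and a non-blocking offer() stands in for the blocking enqueue a real processor thread would perform: the index returned is the connection on which the thread would block.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical simulation; the names here are not Kafka classes. */
final class ProcessorEnqueueSketch {

    /**
     * Walks the pending requests in connection order and tries to enqueue each.
     * Returns the index of the first request that does not fit (where a real
     * processor thread doing a blocking put would stall), or -1 if all fit.
     */
    static int firstStalledConnection(String[] pending, BlockingQueue<String> queue) {
        for (int i = 0; i < pending.length; i++) {
            if (!queue.offer(pending[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shared processor: the request queue is already full of data requests.
        BlockingQueue<String> sharedQueue = new ArrayBlockingQueue<>(2);
        sharedQueue.offer("earlier-data-0");
        sharedQueue.offer("earlier-data-1");
        String[] shared = {"data-conn0", "data-conn1", "controller-conn99"};
        // Stalls at index 0, so the controller request at index 2 is never reached.
        System.out.println(firstStalledConnection(shared, sharedQueue));

        // Dedicated processor and queue: only the controller connection is served.
        BlockingQueue<String> controllerQueue = new ArrayBlockingQueue<>(1);
        String[] dedicated = {"controller-conn99"};
        System.out.println(firstStalledConnection(dedicated, controllerQueue));
    }
}
```

With a dedicated processor and a dedicated queue, the controller request never waits behind data requests from other connections, which is the isolation argued for above.
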
> >
> > Thanks,
> > Lucas
> >
> >
> > On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin <becket....@gmail.com>
> wrote:
> >
> > > Personally I am not fond of the deque approach simply because it is
> > > against the basic idea of isolating the controller plane and data plane.
> > > With a single deque, theoretically speaking the controller requests can
> > > starve the client requests. I would prefer the approach with a separate
> > > controller request queue and a dedicated controller request handler
> > > thread.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang <lucasatu...@gmail.com>
> > wrote:
> > >
> > > > Sure, I can summarize the usage of correlation id. But before I do
> > > > that, it seems the same out-of-order processing can also happen to
> > > > Produce requests sent by producers, following the same example you
> > > > described earlier.
> > > > If that's the case, I think this probably deserves a separate doc and
> > > > design independent of this KIP.
> > > >
> > > > Lucas
> > > >
> > > >
> > > >
> > > > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin <lindon...@gmail.com>
> > wrote:
> > > >
> > > > > Hey Lucas,
> > > > >
> > > > > Could you update the KIP if you are confident in the approach that
> > > > > uses correlation id? The idea around correlation id is kind of
> > > > > scattered across multiple emails. It will be useful if other
> > > > > reviewers can read the KIP to understand the latest proposal.
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com> wrote:
> > > > >
> > > > > > I like the idea of the deque implementation by Lucas. This will
> > > > > > help us avoid an additional queue for the controller and
> > > > > > additional configs in Kafka.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin <becket....@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > >
> > > > > > > The usage of correlation ID might still be useful to address
> the
> > > > cases
> > > > > > > that the controller epoch and leader epoch check are not
> > sufficient
> > > > to
> > > > > > > guarantee correct behavior. For example, if the controller
> sends
> > a
> > > > > > > LeaderAndIsrRequest followed by a StopReplicaRequest, and the
> > > broker
> > > > > > > processes it in the reverse order, the replica may still be
> > wrongly
> > > > > > > recreated, right?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao <j...@confluent.io>
> > wrote:
> > > > > > > >
> > > > > > > > Hmm, since we already use controller epoch and leader epoch
> for
> > > > > > properly
> > > > > > > > caching the latest partition state, do we really need
> > correlation
> > > > id
> > > > > > for
> > > > > > > > ordering the controller requests?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin <
> > > becket....@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Lucas and Mayuresh,
> > > > > > > >>
> > > > > > > >> Good idea. The correlation id should work.
> > > > > > > >>
> > > > > > > > >> In the ControllerChannelManager, a request will be resent
> > > > > > > > >> until a response is received. So if the controller-to-broker
> > > > > > > > >> connection disconnects after the controller sends R1_a, but
> > > > > > > > >> before the response to R1_a is received, the controller may
> > > > > > > > >> resend the request as R1_b, i.e. until R1 is acked, R2 won't
> > > > > > > > >> be sent by the controller.
> > > > > > > >> This gives two guarantees:
> > > > > > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > > > > > >> 2. On the broker side, when R2 is seen, R1 must have been
> > > > processed
> > > > > at
> > > > > > > >> least once.
> > > > > > > >>
> > > > > > > > >> So on the broker side, with a single-threaded controller
> > > > > > > > >> request handler, the logic should be:
> > > > > > > > >> 1. Process whatever request is seen in the controller request
> > > > > > > > >> queue.
> > > > > > > > >> 2. For the given epoch, drop a request if its correlation id
> > > > > > > > >> is smaller than that of the last processed request.
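
A minimal sketch of the two-step broker-side logic above, assuming a single-threaded controller request handler (so no synchronization is shown). The class name `ControllerRequestFilter` and its method are hypothetical and are not part of Kafka.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical filter; not an existing Kafka class. */
final class ControllerRequestFilter {
    // Correlation id of the last processed request, tracked per controller epoch.
    private final Map<Integer, Integer> lastProcessedByEpoch = new HashMap<>();

    /**
     * Returns true if the request should be processed, or false if it should be
     * dropped because its correlation id is smaller than that of the last
     * processed request for the same controller epoch.
     */
    boolean shouldProcess(int controllerEpoch, int correlationId) {
        Integer last = lastProcessedByEpoch.get(controllerEpoch);
        if (last != null && correlationId < last) {
            return false;
        }
        lastProcessedByEpoch.put(controllerEpoch, correlationId);
        return true;
    }

    public static void main(String[] args) {
        ControllerRequestFilter filter = new ControllerRequestFilter();
        // Illustrative ids: R1_a = 5, R1_b = 6, R2 = 7, with R1_a arriving last.
        System.out.println(filter.shouldProcess(1, 6)); // R1_b is processed
        System.out.println(filter.shouldProcess(1, 7)); // R2 is processed
        System.out.println(filter.shouldProcess(1, 5)); // stale R1_a is dropped
    }
}
```

The sketch relies on the guarantee stated above that correlation ids satisfy R1_a < R1_b < R2 within one NetworkClient.
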
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >>
> > > > > > > >> Jiangjie (Becket) Qin
> > > > > > > >>
> > > > > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao <j...@confluent.io>
> > > > wrote:
> > > > > > > >>
> > > > > > > > >>> I agree that there is no strong ordering when there is more
> > > > > > > > >>> than one socket connection. Currently, we rely on
> > > > > > > > >>> controllerEpoch and leaderEpoch to ensure that the receiving
> > > > > > > > >>> broker picks up the latest state for each partition.
> > > > > > > >>>
> > > > > > > > >>> One potential issue with the deque approach is that if the
> > > > > > > > >>> queue is full, there is no guarantee that the controller
> > > > > > > > >>> requests will be enqueued quickly.
> > > > > > > >>>
> > > > > > > >>> Thanks,
> > > > > > > >>>
> > > > > > > >>> Jun
> > > > > > > >>>
> > > > > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > > > > > > >>> gharatmayures...@gmail.com
> > > > > > > >>>> wrote:
> > > > > > > >>>
> > > > > > > >>>> Yea, the correlationId is only set to 0 in the
> NetworkClient
> > > > > > > >> constructor.
> > > > > > > >>>> Since we reuse the same NetworkClient between Controller
> and
> > > the
> > > > > > > >> broker,
> > > > > > > >>> a
> > > > > > > >>>> disconnection should not cause it to reset to 0, in which
> > case
> > > > it
> > > > > > can
> > > > > > > >> be
> > > > > > > >>>> used to reject obsolete requests.
> > > > > > > >>>>
> > > > > > > >>>> Thanks,
> > > > > > > >>>>
> > > > > > > >>>> Mayuresh
> > > > > > > >>>>
> > > > > > > >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang <
> > > > lucasatu...@gmail.com
> > > > > >
> > > > > > > >>> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> @Dong,
> > > > > > > >>>>> Great example and explanation, thanks!
> > > > > > > >>>>>
> > > > > > > >>>>> @All
> > > > > > > >>>>> Regarding the example given by Dong, it seems even if we
> > use
> > > a
> > > > > > queue,
> > > > > > > >>>> and a
> > > > > > > >>>>> dedicated controller request handling thread,
> > > > > > > >>>>> the same result can still happen because R1_a will be
> sent
> > on
> > > > one
> > > > > > > >>>>> connection, and R1_b & R2 will be sent on a different
> > > > connection,
> > > > > > > >>>>> and there is no ordering between different connections on
> > the
> > > > > > broker
> > > > > > > >>>> side.
> > > > > > > >>>>> I was discussing with Mayuresh offline, and it seems
> > > > correlation
> > > > > id
> > > > > > > >>>> within
> > > > > > > >>>>> the same NetworkClient object is monotonically increasing
> > and
> > > > > never
> > > > > > > >>>> reset,
> > > > > > > >>>>> hence a broker can leverage that to properly reject
> > obsolete
> > > > > > > >> requests.
> > > > > > > >>>>> Thoughts?
> > > > > > > >>>>>
> > > > > > > >>>>> Thanks,
> > > > > > > >>>>> Lucas
> > > > > > > >>>>>
> > > > > > > >>>>> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> > > > > > > >>>>> gharatmayures...@gmail.com> wrote:
> > > > > > > >>>>>
> > > > > > > >>>>>> Actually nvm, correlationId is reset in case of
> connection
> > > > > loss, I
> > > > > > > >>>> think.
> > > > > > > >>>>>>
> > > > > > > >>>>>> Thanks,
> > > > > > > >>>>>>
> > > > > > > >>>>>> Mayuresh
> > > > > > > >>>>>>
> > > > > > > >>>>>> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> > > > > > > >>>>>> gharatmayures...@gmail.com>
> > > > > > > >>>>>> wrote:
> > > > > > > >>>>>>
> > > > > > > >>>>>>> I agree with Dong that out-of-order processing can
> happen
> > > > with
> > > > > > > >>>> having 2
> > > > > > > >>>>>>> separate queues as well and it can even happen today.
> > > > > > > >>>>>>> Can we use the correlationId in the request from the
> > > > controller
> > > > > > > >> to
> > > > > > > >>>> the
> > > > > > > >>>>>>> broker to handle ordering ?
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Thanks,
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Mayuresh
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin <
> > > > > becket....@gmail.com
> > > > > > > >>>
> > > > > > > >>>>> wrote:
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>> Good point, Joel. I agree that a dedicated controller
> > > > request
> > > > > > > >>>> handling
> > > > > > > >>>>>>>> thread would be a better isolation. It also solves the
> > > > > > > >> reordering
> > > > > > > >>>>> issue.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy <
> > > > > > > >> jjkosh...@gmail.com>
> > > > > > > >>>>>> wrote:
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>>> Good example. I think this scenario can occur in the
> > > > current
> > > > > > > >>> code
> > > > > > > >>>> as
> > > > > > > >>>>>>>> well
> > > > > > > >>>>>>>>> but with even lower probability given that there are
> > > other
> > > > > > > >>>>>>>> non-controller
> > > > > > > >>>>>>>>> requests interleaved. It is still sketchy though and
> I
> > > > think
> > > > > a
> > > > > > > >>>> safer
> > > > > > > >>>>>>>>> approach would be separate queues and pinning
> > controller
> > > > > > > >> request
> > > > > > > >>>>>>>> handling
> > > > > > > >>>>>>>>> to one handler thread.
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin <
> > > > > > > >> lindon...@gmail.com
> > > > > > > >>>>
> > > > > > > >>>>>> wrote:
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>>> Hey Becket,
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> I think you are right that there may be out-of-order
> > > > > > > >>> processing.
> > > > > > > >>>>>>>> However,
> > > > > > > >>>>>>>>>> it seems that out-of-order processing may also
> happen
> > > even
> > > > > > > >> if
> > > > > > > >>> we
> > > > > > > >>>>>> use a
> > > > > > > >>>>>>>>>> separate queue.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Here is the example:
> > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>> - Controller sends R1 and gets disconnected before
> > > > > > > > >>>>>>>>>> receiving a response. Then it reconnects and sends R2.
> > > > > > > > >>>>>>>>>> Both requests now stay in the controller request queue
> > > > > > > > >>>>>>>>>> in the order they are sent.
> > > > > > > > >>>>>>>>>> - thread1 takes R1_a from the request queue and then
> > > > > > > > >>>>>>>>>> thread2 takes R2 from the request queue almost at the
> > > > > > > > >>>>>>>>>> same time.
> > > > > > > > >>>>>>>>>> - So R1_a and R2 are processed in parallel. There is a
> > > > > > > > >>>>>>>>>> chance that R2's processing is completed before R1.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> If out-of-order processing can happen for both
> > > approaches
> > > > > > > >> with
> > > > > > > >>>>> very
> > > > > > > >>>>>>>> low
> > > > > > > >>>>>>>>>> probability, it may not be worthwhile to add the
> extra
> > > > > > > >> queue.
> > > > > > > >>>> What
> > > > > > > >>>>>> do
> > > > > > > >>>>>>>> you
> > > > > > > >>>>>>>>>> think?
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>> Dong
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin <
> > > > > > > >>>> becket....@gmail.com
> > > > > > > >>>>>>
> > > > > > > >>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> Hi Mayuresh/Joel,
> > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>> Using the request channel as a deque was brought up
> > > > > > > > >>>>>>>>>>> some time ago when we were initially thinking of
> > > > > > > > >>>>>>>>>>> prioritizing the requests. The concern was that the
> > > > > > > > >>>>>>>>>>> controller requests are supposed to be processed in
> > > > > > > > >>>>>>>>>>> order. If we can ensure that there is only one
> > > > > > > > >>>>>>>>>>> controller request in the request channel, the order
> > > > > > > > >>>>>>>>>>> is not a concern. But in cases where more than one
> > > > > > > > >>>>>>>>>>> controller request is inserted into the queue, the
> > > > > > > > >>>>>>>>>>> controller request order may change and cause
> > > > > > > > >>>>>>>>>>> problems. For example, think about the following
> > > > > > > > >>>>>>>>>>> sequence:
> > > > > > > > >>>>>>>>>>> 1. Controller successfully sends a request R1 to the
> > > > > > > > >>>>>>>>>>> broker.
> > > > > > > > >>>>>>>>>>> 2. Broker receives R1 and puts it at the head of the
> > > > > > > > >>>>>>>>>>> request queue.
> > > > > > > > >>>>>>>>>>> 3. The controller-to-broker connection fails and the
> > > > > > > > >>>>>>>>>>> controller reconnects to the broker.
> > > > > > > > >>>>>>>>>>> 4. Controller sends a request R2 to the broker.
> > > > > > > > >>>>>>>>>>> 5. Broker receives R2 and adds it to the head of the
> > > > > > > > >>>>>>>>>>> request queue.
> > > > > > > > >>>>>>>>>>> Now on the broker side, R2 will be processed before
> > > > > > > > >>>>>>>>>>> R1, which may cause problems.
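
The five-step sequence above can be reproduced with a plain java.util.ArrayDeque. This is only an illustration of head insertion; the class name `HeadInsertReorderDemo` is hypothetical and the request channel in Kafka is not implemented this way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical demo; not Kafka code. */
final class HeadInsertReorderDemo {

    /**
     * Inserts every request at the head of the deque in arrival order, then
     * drains from the head, returning the order in which a handler would see them.
     */
    static List<String> processingOrder(List<String> arrivalOrder) {
        Deque<String> requestQueue = new ArrayDeque<>();
        for (String request : arrivalOrder) {
            requestQueue.addFirst(request);
        }
        List<String> processed = new ArrayList<>();
        while (!requestQueue.isEmpty()) {
            processed.add(requestQueue.pollFirst());
        }
        return processed;
    }

    public static void main(String[] args) {
        // R1 arrives first, R2 arrives after the reconnect.
        System.out.println(processingOrder(Arrays.asList("R1", "R2"))); // prints [R2, R1]
    }
}
```
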
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy <
> > > > > > > >>>>> jjkosh...@gmail.com>
> > > > > > > >>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> @Mayuresh - I like your idea. It appears to be a
> > > simpler
> > > > > > > >>>> less
> > > > > > > >>>>>>>>> invasive
> > > > > > > >>>>>>>>>>>> alternative and it should work. Jun/Becket/others,
> > do
> > > > > > > >> you
> > > > > > > >>>> see
> > > > > > > >>>>>> any
> > > > > > > >>>>>>>>>>> pitfalls
> > > > > > > >>>>>>>>>>>> with this approach?
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <
> > > > > > > >>>>>>>> lucasatu...@gmail.com>
> > > > > > > >>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> @Mayuresh,
> > > > > > > > >>>>>>>>>>>>> That's a very interesting idea that I hadn't
> > > > > > > > >>>>>>>>>>>>> thought of before.
> > > > > > > > >>>>>>>>>>>>> It seems to solve our problem at hand pretty well,
> > > > > > > > >>>>>>>>>>>>> and also avoids the need to have a new size metric
> > > > > > > > >>>>>>>>>>>>> and capacity config for the controller request
> > > > > > > > >>>>>>>>>>>>> queue. In fact, if we were to adopt this design,
> > > > > > > > >>>>>>>>>>>>> there is no public interface change, and we
> > > > > > > > >>>>>>>>>>>>> probably don't need a KIP.
> > > > > > > > >>>>>>>>>>>>> Also implementation-wise, it seems the Java class
> > > > > > > > >>>>>>>>>>>>> LinkedBlockingDeque can readily satisfy the
> > > > > > > > >>>>>>>>>>>>> requirement by supporting a capacity and allowing
> > > > > > > > >>>>>>>>>>>>> insertion at both ends.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>> My only concern is that this design is tied to the
> > > > > > > > >>>>>>>>>>>>> coincidence that we have two request priorities
> > > > > > > > >>>>>>>>>>>>> and there are two ends to a deque.
> > > > > > > > >>>>>>>>>>>>> Hence by using the proposed design, it seems the
> > > > > > > > >>>>>>>>>>>>> network layer is more tightly coupled with upper
> > > > > > > > >>>>>>>>>>>>> layer logic, e.g. if we were to add an extra
> > > > > > > > >>>>>>>>>>>>> priority level in the future for some reason, we
> > > > > > > > >>>>>>>>>>>>> would probably need to go back to the design of
> > > > > > > > >>>>>>>>>>>>> separate queues, one for each priority level.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> In summary, I'm ok with both designs and lean
> > toward
> > > > > > > >>> your
> > > > > > > >>>>>>>> suggested
> > > > > > > >>>>>>>>>>>>> approach.
> > > > > > > >>>>>>>>>>>>> Let's hear what others think.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> @Becket,
> > > > > > > > >>>>>>>>>>>>> In light of Mayuresh's suggested new design, I'm
> > > > > > > > >>>>>>>>>>>>> answering your question only in the context of the
> > > > > > > > >>>>>>>>>>>>> current KIP design: I think your suggestion makes
> > > > > > > > >>>>>>>>>>>>> sense, and I'm ok with removing the capacity
> > > > > > > > >>>>>>>>>>>>> config and just relying on the default value of 20
> > > > > > > > >>>>>>>>>>>>> being sufficient.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>> Lucas
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh Gharat
> <
> > > > > > > >>>>>>>>>>>>> gharatmayures...@gmail.com
> > > > > > > >>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> Hi Lucas,
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> Seems like the main intent here is to prioritize
> > > > > > > > >>>>>>>>>>>>>> the controller request over any other requests.
> > > > > > > > >>>>>>>>>>>>>> In that case, we can change the request queue to
> > > > > > > > >>>>>>>>>>>>>> a deque, where you always insert the normal
> > > > > > > > >>>>>>>>>>>>>> requests (produce, consume, etc.) at the end of
> > > > > > > > >>>>>>>>>>>>>> the deque, but if it's a controller request, you
> > > > > > > > >>>>>>>>>>>>>> insert it at the head of the deque. This ensures
> > > > > > > > >>>>>>>>>>>>>> that the controller request will be given higher
> > > > > > > > >>>>>>>>>>>>>> priority over other requests.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> Also since we only read one request from the
> > > > > > > > >>>>>>>>>>>>>> socket and mute it and only unmute it after
> > > > > > > > >>>>>>>>>>>>>> handling the request, this would ensure that we
> > > > > > > > >>>>>>>>>>>>>> don't handle controller requests out of order.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> With this approach we can avoid the second queue
> > and
> > > > > > > >>> the
> > > > > > > >>>>>>>>> additional
> > > > > > > >>>>>>>>>>>>> config
> > > > > > > >>>>>>>>>>>>>> for the size of the queue.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> What do you think ?
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> Mayuresh
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <
> > > > > > > >>>>>>>> becket....@gmail.com
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Hey Joel,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> Thanks for the detailed explanation. I agree the
> > > > > > > > >>>>>>>>>>>>>>> current design makes sense.
> > > > > > > > >>>>>>>>>>>>>>> My confusion is about whether the new config for
> > > > > > > > >>>>>>>>>>>>>>> the controller queue capacity is necessary. I
> > > > > > > > >>>>>>>>>>>>>>> cannot think of a case in which users would
> > > > > > > > >>>>>>>>>>>>>>> change it.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin <
> > > > > > > >>>>>>>>>> becket....@gmail.com>
> > > > > > > >>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>> Hi Lucas,
> > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>> I guess my question can be rephrased to "do we
> > > > > > > > >>>>>>>>>>>>>>>> expect users to ever change the controller
> > > > > > > > >>>>>>>>>>>>>>>> request queue capacity"? If we agree that 20 is
> > > > > > > > >>>>>>>>>>>>>>>> already a very generous default number and we
> > > > > > > > >>>>>>>>>>>>>>>> do not expect users to change it, is it still
> > > > > > > > >>>>>>>>>>>>>>>> necessary to expose this as a config?
> > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang <
> > > > > > > >>>>>>>>>>> lucasatu...@gmail.com
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>> @Becket
> > > > > > > > >>>>>>>>>>>>>>>>> 1. Thanks for the comment. You are right that
> > > > > > > > >>>>>>>>>>>>>>>>> normally there should be just one controller
> > > > > > > > >>>>>>>>>>>>>>>>> request because of muting, and I had NOT
> > > > > > > > >>>>>>>>>>>>>>>>> intended to say there would be many enqueued
> > > > > > > > >>>>>>>>>>>>>>>>> controller requests.
> > > > > > > > >>>>>>>>>>>>>>>>> I went through the KIP again, and I'm not sure
> > > > > > > > >>>>>>>>>>>>>>>>> which part conveys that info.
> > > > > > > > >>>>>>>>>>>>>>>>> I'd be happy to revise if you point out the
> > > > > > > > >>>>>>>>>>>>>>>>> section.
> > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>> 2. Though it should not happen in normal
> > > > > > > > >>>>>>>>>>>>>>>>> conditions, the current design does not
> > > > > > > > >>>>>>>>>>>>>>>>> preclude multiple controllers running at the
> > > > > > > > >>>>>>>>>>>>>>>>> same time, hence if we don't have the
> > > > > > > > >>>>>>>>>>>>>>>>> controller queue capacity config and simply
> > > > > > > > >>>>>>>>>>>>>>>>> set its capacity to 1, network threads
> > > > > > > > >>>>>>>>>>>>>>>>> handling requests from different controllers
> > > > > > > > >>>>>>>>>>>>>>>>> will be blocked during those troublesome
> > > > > > > > >>>>>>>>>>>>>>>>> times, which is probably not what we want. On
> > > > > > > > >>>>>>>>>>>>>>>>> the other hand, adding the extra config with a
> > > > > > > > >>>>>>>>>>>>>>>>> default value, say 20, guards us from issues
> > > > > > > > >>>>>>>>>>>>>>>>> in those troublesome times, and IMO there
> > > > > > > > >>>>>>>>>>>>>>>>> isn't much downside to adding the extra
> > > > > > > > >>>>>>>>>>>>>>>>> config.
> > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>> @Mayuresh
> > > > > > > >>>>>>>>>>>>>>>>> Good catch, this sentence is an obsolete
> > > > > > > >>> statement
> > > > > > > >>>>>> based
> > > > > > > >>>>>>>> on
> > > > > > > >>>>>>>>> a
> > > > > > > >>>>>>>>>>>>> previous
> > > > > > > >>>>>>>>>>>>>>>>> design. I've revised the wording in the KIP.
> > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>>> Lucas
> > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh
> > > > > > > >>> Gharat <
> > > > > > > >>>>>>>>>>>>>>>>> gharatmayures...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>> Hi Lucas,
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP.
> > > > > > > >>>>>>>>>>>>>>>>>> I am trying to understand why you think "The
> > > > > > > >>>> memory
> > > > > > > >>>>>>>>>>> consumption
> > > > > > > >>>>>>>>>>>>> can
> > > > > > > >>>>>>>>>>>>>>> rise
> > > > > > > >>>>>>>>>>>>>>>>>> given the total number of queued requests
> can
> > > > > > > >>> go
> > > > > > > >>>> up
> > > > > > > >>>>>> to
> > > > > > > >>>>>>>> 2x"
> > > > > > > >>>>>>>>>> in
> > > > > > > >>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> impact
> > > > > > > >>>>>>>>>>>>>>>>>> section. Normally the requests from
> > > > > > > >> controller
> > > > > > > >>>> to a
> > > > > > > >>>>>>>> Broker
> > > > > > > >>>>>>>>>> are
> > > > > > > >>>>>>>>>>>> not
> > > > > > > >>>>>>>>>>>>>>> high
> > > > > > > >>>>>>>>>>>>>>>>>> volume, right ?
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>> Mayuresh
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 5:06 AM Becket Qin <
> > > > > > > >>>>>>>>>>>> becket....@gmail.com>
> > > > > > > >>>>>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>> Thanks for the KIP, Lucas. Separating the
> > > > > > > >>>> control
> > > > > > > >>>>>>>> plane
> > > > > > > >>>>>>>>>> from
> > > > > > > >>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> data
> > > > > > > >>>>>>>>>>>>>>>>>> plane
> > > > > > > >>>>>>>>>>>>>>>>>>> makes a lot of sense.
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>> In the KIP you mentioned that the
> > > > > > > >> controller
> > > > > > > >>>>>> request
> > > > > > > >>>>>>>>> queue
> > > > > > > >>>>>>>>>>> may
> > > > > > > >>>>>>>>>>>>>> have
> > > > > > > >>>>>>>>>>>>>>>>> many
> > > > > > > >>>>>>>>>>>>>>>>>>> requests in it. Will this be a common case?
> > > > > > > >>> The
> > > > > > > >>>>>>>>> controller
> > > > > > > >>>>>>>>>>>>>> requests
> > > > > > > >>>>>>>>>>>>>>>>> still
> > > > > > > >>>>>>>>>>>>>>>>>>> goes through the SocketServer. The
> > > > > > > >>> SocketServer
> > > > > > > >>>>>> will
> > > > > > > >>>>>>>>> mute
> > > > > > > >>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> channel
> > > > > > > >>>>>>>>>>>>>>>>>> once
> > > > > > > >>>>>>>>>>>>>>>>>>> a request is read and put into the request
> > > > > > > >>>>> channel.
> > > > > > > >>>>>>>> So
> > > > > > > >>>>>>>>>>>> assuming
> > > > > > > >>>>>>>>>>>>>>> there
> > > > > > > >>>>>>>>>>>>>>>>> is
> > > > > > > >>>>>>>>>>>>>>>>>>> only one connection between controller and
> > > > > > > >>> each
> > > > > > > >>>>>>>> broker,
> > > > > > > >>>>>>>>> on
> > > > > > > >>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> broker
> > > > > > > >>>>>>>>>>>>>>>>>> side,
> > > > > > > >>>>>>>>>>>>>>>>>>> there should be only one controller request
> > > > > > > >>> in
> > > > > > > >>>>> the
> > > > > > > >>>>>>>>>>> controller
> > > > > > > >>>>>>>>>>>>>>> request
> > > > > > > >>>>>>>>>>>>>>>>>> queue
> > > > > > > >>>>>>>>>>>>>>>>>>> at any given time. If that is the case, do
> > > > > > > >> we
> > > > > > > >>>>> need
> > > > > > > >>>>>> a
> > > > > > > >>>>>>>>>>> separate
> > > > > > > >>>>>>>>>>>>>>>>> controller
> > > > > > > >>>>>>>>>>>>>>>>>>> request queue capacity config? The default
> > > > > > > >>>> value
> > > > > > > >>>>> 20
> > > > > > > >>>>>>>>> means
> > > > > > > >>>>>>>>>>> that
> > > > > > > >>>>>>>>>>>>> we
> > > > > > > >>>>>>>>>>>>>>>>> expect
> > > > > > > >>>>>>>>>>>>>>>>>>> there are 20 controller switches to happen
> > > > > > > >>> in a
> > > > > > > >>>>>> short
> > > > > > > >>>>>>>>>> period
> > > > > > > >>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>> time.
> > > > > > > >>>>>>>>>>>>>>>>> I
> > > > > > > >>>>>>>>>>>>>>>>>> am
> > > > > > > >>>>>>>>>>>>>>>>>>> not sure whether someone should increase
> > > > > > > >> the
> > > > > > > >>>>>>>> controller
> > > > > > > >>>>>>>>>>>> request
> > > > > > > >>>>>>>>>>>>>>> queue
> > > > > > > >>>>>>>>>>>>>>>>>>> capacity to handle such case, as it seems
> > > > > > > >>>>>> indicating
> > > > > > > >>>>>>>>>>> something
> > > > > > > >>>>>>>>>>>>>> very
> > > > > > > >>>>>>>>>>>>>>>>> wrong
> > > > > > > >>>>>>>>>>>>>>>>>>> has happened.
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>> On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin <
> > > > > > > >>>>>>>>>>>> lindon...@gmail.com>
> > > > > > > >>>>>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>> Thanks for the update Lucas.
> > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>> I think the motivation section is
> > > > > > > >>> intuitive.
> > > > > > > >>>> It
> > > > > > > >>>>>>>> will
> > > > > > > >>>>>>>>> be
> > > > > > > >>>>>>>>>>> good
> > > > > > > >>>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>>>>> learn
> > > > > > > >>>>>>>>>>>>>>>>>>> more
> > > > > > > >>>>>>>>>>>>>>>>>>>> about the comments from other reviewers.
> > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>> On Thu, Jul 12, 2018 at 9:48 PM, Lucas
> > > > > > > >>> Wang <
> > > > > > > >>>>>>>>>>>>>>> lucasatu...@gmail.com>
> > > > > > > >>>>>>>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>> I've updated the motivation section of
> > > > > > > >>> the
> > > > > > > >>>>> KIP
> > > > > > > >>>>>> by
> > > > > > > >>>>>>>>>>>> explaining
> > > > > > > >>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>> cases
> > > > > > > >>>>>>>>>>>>>>>>>>>> that
> > > > > > > >>>>>>>>>>>>>>>>>>>>> would have user impacts.
> > > > > > > >>>>>>>>>>>>>>>>>>>>> Please take a look at let me know your
> > > > > > > >>>>>> comments.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > >>>>>>>>>>>>>>>>>>>>> Lucas
> > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 9, 2018 at 5:53 PM, Lucas
> > > > > > > >>> Wang
> > > > > > > >>>> <
> > > > > > > >>>>>>>>>>>>>>> lucasatu...@gmail.com
> > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> The simulation of disk being slow is
> > > > > > > >>>> merely
> > > > > > > >>>>>>>> for me
> > > > > > > >>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>> easily
> > > > > > > >>>>>>>>>>>>>>>>>>> construct
> > > > > > > >>>>>>>>>>>>>>>>>>>> a
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> testing scenario
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> with a backlog of produce requests.
> > > > > > > >> In
> > > > > > > >>>>>>>> production,
> > > > > > > >>>>>>>>>>> other
> > > > > > > >>>>>>>>>>>>>> than
> > > > > > > >>>>>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> disk
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> being slow, a backlog of
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests may also be caused
> > > > > > > >> by
> > > > > > > >>>> high
> > > > > > > >>>>>>>>> produce
> > > > > > > >>>>>>>>>>> QPS.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> In that case, we may not want to kill
> > > > > > > >>> the
> > > > > > > >>>>>>>> broker
> > > > > > > >>>>>>>>> and
> > > > > > > >>>>>>>>>>>>> that's
> > > > > > > >>>>>>>>>>>>>>> when
> > > > > > > >>>>>>>>>>>>>>>>>> this
> > > > > > > >>>>>>>>>>>>>>>>>>>> KIP
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> can be useful, both for JBOD
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> and non-JBOD setup.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> Going back to your previous question
> > > > > > > >>>> about
> > > > > > > >>>>>> each
> > > > > > > >>>>>>>>>>>>>> ProduceRequest
> > > > > > > >>>>>>>>>>>>>>>>>>> covering
> > > > > > > >>>>>>>>>>>>>>>>>>>>> 20
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions that are randomly
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> distributed, let's say a LeaderAndIsr
> > > > > > > >>>>> request
> > > > > > > >>>>>>>> is
> > > > > > > >>>>>>>>>>>> enqueued
> > > > > > > >>>>>>>>>>>>>> that
> > > > > > > >>>>>>>>>>>>>>>>>> tries
> > > > > > > >>>>>>>>>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> switch the current broker, say
> > > > > > > >> broker0,
> > > > > > > >>>>> from
> > > > > > > >>>>>>>>> leader
> > > > > > > >>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>>> follower
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> *for one of the partitions*, say
> > > > > > > >>>> *test-0*.
> > > > > > > >>>>>> For
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>>>> sake
> > > > > > > >>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>>>>> argument,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's also assume the other brokers,
> > > > > > > >>> say
> > > > > > > >>>>>>>> broker1,
> > > > > > > >>>>>>>>>> have
> > > > > > > >>>>>>>>>>>>>>> *stopped*
> > > > > > > >>>>>>>>>>>>>>>>>>>> fetching
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> from
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> the current broker, i.e. broker0.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. If the enqueued produce requests
> > > > > > > >>> have
> > > > > > > >>>>>> acks =
> > > > > > > >>>>>>>>> -1
> > > > > > > >>>>>>>>>>>> (ALL)
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  1.1 without this KIP, the
> > > > > > > >>>> ProduceRequests
> > > > > > > >>>>>>>> ahead
> > > > > > > >>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>>>> LeaderAndISR
> > > > > > > >>>>>>>>>>>>>>>>>>> will
> > > > > > > >>>>>>>>>>>>>>>>>>>> be
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> put into the purgatory,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        and since they'll never be
> > > > > > > >>>>> replicated
> > > > > > > >>>>>>>> to
> > > > > > > >>>>>>>>>> other
> > > > > > > >>>>>>>>>>>>>> brokers
> > > > > > > >>>>>>>>>>>>>>>>>>> (because
> > > > > > > >>>>>>>>>>>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> the assumption made above), they will
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        be completed either when the
> > > > > > > >>>>>>>> LeaderAndISR
> > > > > > > >>>>>>>>>>>> request
> > > > > > > >>>>>>>>>>>>> is
> > > > > > > >>>>>>>>>>>>>>>>>>> processed
> > > > > > > >>>>>>>>>>>>>>>>>>>> or
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> when the timeout happens.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  1.2 With this KIP, broker0 will
> > > > > > > >>>>> immediately
> > > > > > > >>>>>>>>>>> transition
> > > > > > > >>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> partition
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> test-0 to become a follower,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        after the current broker sees
> > > > > > > >>> the
> > > > > > > >>>>>>>>>> replication
> > > > > > > >>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> remaining
> > > > > > > >>>>>>>>>>>>>>>>>>>> 19
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions, it can send a response
> > > > > > > >>>>> indicating
> > > > > > > >>>>>>>> that
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        it's no longer the leader for
> > > > > > > >>> the
> > > > > > > >>>>>>>>> "test-0".
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  To see the latency difference
> > > > > > > >> between
> > > > > > > >>>> 1.1
> > > > > > > >>>>>> and
> > > > > > > >>>>>>>>> 1.2,
> > > > > > > >>>>>>>>>>>> let's
> > > > > > > >>>>>>>>>>>>>> say
> > > > > > > >>>>>>>>>>>>>>>>>> there
> > > > > > > >>>>>>>>>>>>>>>>>>>> are
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> 24K produce requests ahead of the
> > > > > > > >>>>>> LeaderAndISR,
> > > > > > > >>>>>>>>> and
> > > > > > > >>>>>>>>>>>> there
> > > > > > > >>>>>>>>>>>>>> are
> > > > > > > >>>>>>>>>>>>>>> 8
> > > > > > > >>>>>>>>>>>>>>>>> io
> > > > > > > >>>>>>>>>>>>>>>>>>>>> threads,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  so each io thread will process
> > > > > > > >>>>>> approximately
> > > > > > > >>>>>>>>> 3000
> > > > > > > >>>>>>>>>>>>> produce
> > > > > > > >>>>>>>>>>>>>>>>>> requests.
> > > > > > > >>>>>>>>>>>>>>>>>>>> Now
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's investigate the io thread that
> > > > > > > >>>>> finally
> > > > > > > >>>>>>>>>> processed
> > > > > > > >>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  For the 3000 produce requests, if
> > > > > > > >> we
> > > > > > > >>>>> model
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>>> time
> > > > > > > >>>>>>>>>>>> when
> > > > > > > >>>>>>>>>>>>>>> their
> > > > > > > >>>>>>>>>>>>>>>>>>>>> remaining
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> 19 partitions catch up as t0, t1,
> > > > > > > >>>> ...t2999,
> > > > > > > >>>>>> and
> > > > > > > >>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> LeaderAndISR
> > > > > > > >>>>>>>>>>>>>>>>>>>> request
> > > > > > > >>>>>>>>>>>>>>>>>>>>> is
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> processed at time t3000.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  Without this KIP, the 1st produce
> > > > > > > >>>> request
> > > > > > > >>>>>>>> would
> > > > > > > >>>>>>>>>> have
> > > > > > > >>>>>>>>>>>>>> waited
> > > > > > > >>>>>>>>>>>>>>> an
> > > > > > > >>>>>>>>>>>>>>>>>>> extra
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> t3000 - t0 time in the purgatory, the
> > > > > > > >>> 2nd
> > > > > > > >>>>> an
> > > > > > > >>>>>>>> extra
> > > > > > > >>>>>>>>>>> time
> > > > > > > >>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>>>> t3000 -
> > > > > > > >>>>>>>>>>>>>>>>>>> t1,
> > > > > > > >>>>>>>>>>>>>>>>>>>>> etc.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  Roughly speaking, the latency
> > > > > > > >>>> difference
> > > > > > > >>>>> is
> > > > > > > >>>>>>>>> bigger
> > > > > > > >>>>>>>>>>> for
> > > > > > > >>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>> earlier
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests than for the later
> > > > > > > >>> ones.
> > > > > > > >>>>> For
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>>> same
> > > > > > > >>>>>>>>>>>>>> reason,
> > > > > > > >>>>>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> more
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests queued
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  before the LeaderAndISR, the bigger
> > > > > > > >>>>> benefit
> > > > > > > >>>>>>>> we
> > > > > > > >>>>>>>>> get
> > > > > > > >>>>>>>>>>>>> (capped
> > > > > > > >>>>>>>>>>>>>>> by
> > > > > > > >>>>>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce timeout).
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. If the enqueued produce requests
> > > > > > > >>> have
> > > > > > > >>>>>>>> acks=0 or
> > > > > > > >>>>>>>>>>>> acks=1
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  There will be no latency
> > > > > > > >> differences
> > > > > > > >>> in
> > > > > > > >>>>>> this
> > > > > > > >>>>>>>>> case,
> > > > > > > >>>>>>>>>>> but
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  2.1 without this KIP, the records
> > > > > > > >> of
> > > > > > > >>>>>>>> partition
> > > > > > > >>>>>>>>>>> test-0
> > > > > > > >>>>>>>>>>>> in
> > > > > > > >>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests ahead of the
> > > > > > > >>> LeaderAndISR
> > > > > > > >>>>>> will
> > > > > > > >>>>>>>> be
> > > > > > > >>>>>>>>>>>> appended
> > > > > > > >>>>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> local
> > > > > > > >>>>>>>>>>>>>>>>>>>>> log,
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        and eventually be truncated
> > > > > > > >>> after
> > > > > > > >>>>>>>>> processing
> > > > > > > >>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>> LeaderAndISR.
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> This is what's referred to as
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        "some unofficial definition
> > > > > > > >> of
> > > > > > > >>>> data
> > > > > > > >>>>>>>> loss
> > > > > > > >>>>>>>>> in
> > > > > > > >>>>>>>>>>>> terms
> > > > > > > >>>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>>>>>> messages
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> beyond the high watermark".
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>  2.2 with this KIP, we can mitigate
> > > > > > > >>> the
> > > > > > > >>>>>> effect
> > > > > > > >>>>>>>>>> since
> > > > > > > >>>>>>>>>>> if
> > > > > > > >>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> is immediately processed, the
> > > > > > > >> response
> > > > > > > >>> to
> > > > > > > >>>>>>>>> producers
> > > > > > > >>>>>>>>>>> will
> > > > > > > >>>>>>>>>>>>>> have
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>        the NotLeaderForPartition
> > > > > > > >>> error,
> > > > > > > >>>>>>>> causing
> > > > > > > >>>>>>>>>>>> producers
> > > > > > > >>>>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>>>>> retry
> > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>>>>>>>> This explanation above is the benefit
> > > > > > > >>> for
> > > > > > > >>>>>>>> reducing
>
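
For reference, the arithmetic in case 1 of the quoted message above (24K queued produce requests, 8 io threads, extra purgatory wait of t3000 - ti) can be sketched as follows. This is only an illustration under assumptions that are not in the thread: the function names are made up, the catch-up times are taken to be evenly spaced 1 ms apart, and the 30000 ms produce timeout is a placeholder value. The cap is also simplified, in that it is applied directly to the extra wait.

```python
def requests_per_io_thread(total_requests, io_threads):
    """Approximate share of the backlog handled by each io thread."""
    return total_requests // io_threads


def extra_purgatory_wait_ms(catch_up_ms, leader_and_isr_ms, produce_timeout_ms):
    """Extra wait per acks=-1 request without the KIP: t3000 - ti, capped.

    catch_up_ms[i] is the (assumed) time ti at which the remaining 19
    partitions of request i have caught up; leader_and_isr_ms is t3000.
    """
    return [
        min(max(leader_and_isr_ms - t, 0), produce_timeout_ms)
        for t in catch_up_ms
    ]


per_thread = requests_per_io_thread(24_000, 8)

# Assumption for illustration only: t0..t2999 are spaced 1 ms apart and
# the LeaderAndISR request is processed right after them, at t3000.
catch_up_ms = list(range(per_thread))
leader_and_isr_ms = per_thread

waits = extra_purgatory_wait_ms(catch_up_ms, leader_and_isr_ms, 30_000)
print(per_thread, waits[0], waits[-1])  # 3000 3000 1
```

The first request in the backlog waits the longest and the last one barely waits at all, which matches the observation that earlier produce requests benefit most.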


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
