Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

Becket Qin Mon, 23 Jul 2018 18:00:43 -0700

Personally I am not fond of the dequeue approach simply because it is
against the basic idea of isolating the controller plane and data plane.
With a single dequeue, theoretically speaking the controller requests can
starve the clients requests. I would prefer the approach with a separate
controller request queue and a dedicated controller request handler thread.


Thanks,

Jiangjie (Becket) Qin

On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang <[email protected]> wrote:

> Sure, I can summarize the usage of correlation id. But before I do that, it
> seems
> the same out-of-order processing can also happen to Produce requests sent
> by producers,
> following the same example you described earlier.
> If that's the case, I think this probably deserves a separate doc and
> design independent of this KIP.
>
> Lucas
>
>
>
> On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin <[email protected]> wrote:
>
> > Hey Lucas,
> >
> > Could you update the KIP if you are confident with the approach which
> uses
> > correlation id? The idea around correlation id is kind of scattered
> across
> > multiple emails. It will be useful if other reviews can read the KIP to
> > understand the latest proposal.
> >
> > Thanks,
> > Dong
> >
> > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > [email protected]> wrote:
> >
> > > I like the idea of the dequeue implementation by Lucas. This will help
> us
> > > avoid additional queue for controller and additional configs in Kafka.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin <[email protected]>
> wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > The usage of correlation ID might still be useful to address the
> cases
> > > > that the controller epoch and leader epoch check are not sufficient
> to
> > > > guarantee correct behavior. For example, if the controller sends a
> > > > LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> > > > processes it in the reverse order, the replica may still be wrongly
> > > > recreated, right?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao <[email protected]> wrote:
> > > > >
> > > > > Hmm, since we already use controller epoch and leader epoch for
> > > properly
> > > > > caching the latest partition state, do we really need correlation
> id
> > > for
> > > > > ordering the controller requests?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin <[email protected]>
> > > > wrote:
> > > > >
> > > > >> Lucas and Mayuresh,
> > > > >>
> > > > >> Good idea. The correlation id should work.
> > > > >>
> > > > >> In the ControllerChannelManager, a request will be resent until a
> > > > response
> > > > >> is received. So if the controller to broker connection disconnects
> > > after
> > > > >> controller sends R1_a, but before the response of R1_a is
> received,
> > a
> > > > >> disconnection may cause the controller to resend R1_b. i.e. until
> R1
> > > is
> > > > >> acked, R2 won't be sent by the controller.
> > > > >> This gives two guarantees:
> > > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > > >> 2. On the broker side, when R2 is seen, R1 must have been
> processed
> > at
> > > > >> least once.
> > > > >>
> > > > >> So on the broker side, with a single thread controller request
> > > handler,
> > > > the
> > > > >> logic should be:
> > > > >> 1. Process what ever request seen in the controller request queue
> > > > >> 2. For the given epoch, drop request if its correlation id is
> > smaller
> > > > than
> > > > >> that of the last processed request.
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jiangjie (Becket) Qin
> > > > >>
> > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao <[email protected]>
> wrote:
> > > > >>
> > > > >>> I agree that there is no strong ordering when there are more than
> > one
> > > > >>> socket connections. Currently, we rely on controllerEpoch and
> > > > leaderEpoch
> > > > >>> to ensure that the receiving broker picks up the latest state for
> > > each
> > > > >>> partition.
> > > > >>>
> > > > >>> One potential issue with the dequeue approach is that if the
> queue
> > is
> > > > >> full,
> > > > >>> there is no guarantee that the controller requests will be
> enqueued
> > > > >>> quickly.
> > > > >>>
> > > > >>> Thanks,
> > > > >>>
> > > > >>> Jun
> > > > >>>
> > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > > > >>> [email protected]
> > > > >>>> wrote:
> > > > >>>
> > > > >>>> Yea, the correlationId is only set to 0 in the NetworkClient
> > > > >> constructor.
> > > > >>>> Since we reuse the same NetworkClient between Controller and the
> > > > >> broker,
> > > > >>> a
> > > > >>>> disconnection should not cause it to reset to 0, in which case
> it
> > > can
> > > > >> be
> > > > >>>> used to reject obsolete requests.
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>>
> > > > >>>> Mayuresh
> > > > >>>>
> > > > >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang <
> [email protected]
> > >
> > > > >>> wrote:
> > > > >>>>
> > > > >>>>> @Dong,
> > > > >>>>> Great example and explanation, thanks!
> > > > >>>>>
> > > > >>>>> @All
> > > > >>>>> Regarding the example given by Dong, it seems even if we use a
> > > queue,
> > > > >>>> and a
> > > > >>>>> dedicated controller request handling thread,
> > > > >>>>> the same result can still happen because R1_a will be sent on
> one
> > > > >>>>> connection, and R1_b & R2 will be sent on a different
> connection,
> > > > >>>>> and there is no ordering between different connections on the
> > > broker
> > > > >>>> side.
> > > > >>>>> I was discussing with Mayuresh offline, and it seems
> correlation
> > id
> > > > >>>> within
> > > > >>>>> the same NetworkClient object is monotonically increasing and
> > never
> > > > >>>> reset,
> > > > >>>>> hence a broker can leverage that to properly reject obsolete
> > > > >> requests.
> > > > >>>>> Thoughts?
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Lucas
> > > > >>>>>
> > > > >>>>> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> > > > >>>>> [email protected]> wrote:
> > > > >>>>>
> > > > >>>>>> Actually nvm, correlationId is reset in case of connection
> > loss, I
> > > > >>>> think.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>>
> > > > >>>>>> Mayuresh
> > > > >>>>>>
> > > > >>>>>> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> > > > >>>>>> [email protected]>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> I agree with Dong that out-of-order processing can happen
> with
> > > > >>>> having 2
> > > > >>>>>>> separate queues as well and it can even happen today.
> > > > >>>>>>> Can we use the correlationId in the request from the
> controller
> > > > >> to
> > > > >>>> the
> > > > >>>>>>> broker to handle ordering ?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Mayuresh
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin <
> > [email protected]
> > > > >>>
> > > > >>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Good point, Joel. I agree that a dedicated controller
> request
> > > > >>>> handling
> > > > >>>>>>>> thread would be a better isolation. It also solves the
> > > > >> reordering
> > > > >>>>> issue.
> > > > >>>>>>>>
> > > > >>>>>>>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy <
> > > > >> [email protected]>
> > > > >>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Good example. I think this scenario can occur in the
> current
> > > > >>> code
> > > > >>>> as
> > > > >>>>>>>> well
> > > > >>>>>>>>> but with even lower probability given that there are other
> > > > >>>>>>>> non-controller
> > > > >>>>>>>>> requests interleaved. It is still sketchy though and I
> think
> > a
> > > > >>>> safer
> > > > >>>>>>>>> approach would be separate queues and pinning controller
> > > > >> request
> > > > >>>>>>>> handling
> > > > >>>>>>>>> to one handler thread.
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin <
> > > > >> [email protected]
> > > > >>>>
> > > > >>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hey Becket,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I think you are right that there may be out-of-order
> > > > >>> processing.
> > > > >>>>>>>> However,
> > > > >>>>>>>>>> it seems that out-of-order processing may also happen even
> > > > >> if
> > > > >>> we
> > > > >>>>>> use a
> > > > >>>>>>>>>> separate queue.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Here is the example:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> - Controller sends R1 and got disconnected before
> receiving
> > > > >>>>>> response.
> > > > >>>>>>>>> Then
> > > > >>>>>>>>>> it reconnects and sends R2. Both requests now stay in the
> > > > >>>>> controller
> > > > >>>>>>>>>> request queue in the order they are sent.
> > > > >>>>>>>>>> - thread1 takes R1_a from the request queue and then
> thread2
> > > > >>>> takes
> > > > >>>>>> R2
> > > > >>>>>>>>> from
> > > > >>>>>>>>>> the request queue almost at the same time.
> > > > >>>>>>>>>> - So R1_a and R2 are processed in parallel. There is
> chance
> > > > >>> that
> > > > >>>>>> R2's
> > > > >>>>>>>>>> processing is completed before R1.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> If out-of-order processing can happen for both approaches
> > > > >> with
> > > > >>>>> very
> > > > >>>>>>>> low
> > > > >>>>>>>>>> probability, it may not be worthwhile to add the extra
> > > > >> queue.
> > > > >>>> What
> > > > >>>>>> do
> > > > >>>>>>>> you
> > > > >>>>>>>>>> think?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>> Dong
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin <
> > > > >>>> [email protected]
> > > > >>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi Mayuresh/Joel,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Using the request channel as a dequeue was bright up some
> > > > >>> time
> > > > >>>>> ago
> > > > >>>>>>>> when
> > > > >>>>>>>>>> we
> > > > >>>>>>>>>>> initially thinking of prioritizing the request. The
> > > > >> concern
> > > > >>>> was
> > > > >>>>>> that
> > > > >>>>>>>>> the
> > > > >>>>>>>>>>> controller requests are supposed to be processed in
> order.
> > > > >>> If
> > > > >>>> we
> > > > >>>>>> can
> > > > >>>>>>>>>> ensure
> > > > >>>>>>>>>>> that there is one controller request in the request
> > > > >> channel,
> > > > >>>> the
> > > > >>>>>>>> order
> > > > >>>>>>>>> is
> > > > >>>>>>>>>>> not a concern. But in cases that there are more than one
> > > > >>>>>> controller
> > > > >>>>>>>>>> request
> > > > >>>>>>>>>>> inserted into the queue, the controller request order may
> > > > >>>> change
> > > > >>>>>> and
> > > > >>>>>>>>>> cause
> > > > >>>>>>>>>>> problem. For example, think about the following sequence:
> > > > >>>>>>>>>>> 1. Controller successfully sent a request R1 to broker
> > > > >>>>>>>>>>> 2. Broker receives R1 and put the request to the head of
> > > > >> the
> > > > >>>>>> request
> > > > >>>>>>>>>> queue.
> > > > >>>>>>>>>>> 3. Controller to broker connection failed and the
> > > > >> controller
> > > > >>>>>>>>> reconnected
> > > > >>>>>>>>>> to
> > > > >>>>>>>>>>> the broker.
> > > > >>>>>>>>>>> 4. Controller sends a request R2 to the broker
> > > > >>>>>>>>>>> 5. Broker receives R2 and add it to the head of the
> > > > >> request
> > > > >>>>> queue.
> > > > >>>>>>>>>>> Now on the broker side, R2 will be processed before R1 is
> > > > >>>>>> processed,
> > > > >>>>>>>>>> which
> > > > >>>>>>>>>>> may cause problem.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Jiangjie (Becket) Qin
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy <
> > > > >>>>> [email protected]>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> @Mayuresh - I like your idea. It appears to be a simpler
> > > > >>>> less
> > > > >>>>>>>>> invasive
> > > > >>>>>>>>>>>> alternative and it should work. Jun/Becket/others, do
> > > > >> you
> > > > >>>> see
> > > > >>>>>> any
> > > > >>>>>>>>>>> pitfalls
> > > > >>>>>>>>>>>> with this approach?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <
> > > > >>>>>>>> [email protected]>
> > > > >>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> @Mayuresh,
> > > > >>>>>>>>>>>>> That's a very interesting idea that I haven't thought
> > > > >>>>> before.
> > > > >>>>>>>>>>>>> It seems to solve our problem at hand pretty well, and
> > > > >>>> also
> > > > >>>>>>>>>>>>> avoids the need to have a new size metric and capacity
> > > > >>>>> config
> > > > >>>>>>>>>>>>> for the controller request queue. In fact, if we were
> > > > >> to
> > > > >>>>> adopt
> > > > >>>>>>>>>>>>> this design, there is no public interface change, and
> > > > >> we
> > > > >>>>>>>>>>>>> probably don't need a KIP.
> > > > >>>>>>>>>>>>> Also implementation wise, it seems
> > > > >>>>>>>>>>>>> the java class LinkedBlockingQueue can readily satisfy
> > > > >>> the
> > > > >>>>>>>>>> requirement
> > > > >>>>>>>>>>>>> by supporting a capacity, and also allowing inserting
> > > > >> at
> > > > >>>>> both
> > > > >>>>>>>> ends.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> My only concern is that this design is tied to the
> > > > >>>>> coincidence
> > > > >>>>>>>> that
> > > > >>>>>>>>>>>>> we have two request priorities and there are two ends
> > > > >>> to a
> > > > >>>>>>>> deque.
> > > > >>>>>>>>>>>>> Hence by using the proposed design, it seems the
> > > > >> network
> > > > >>>>> layer
> > > > >>>>>>>> is
> > > > >>>>>>>>>>>>> more tightly coupled with upper layer logic, e.g. if
> > > > >> we
> > > > >>>> were
> > > > >>>>>> to
> > > > >>>>>>>> add
> > > > >>>>>>>>>>>>> an extra priority level in the future for some reason,
> > > > >>> we
> > > > >>>>>> would
> > > > >>>>>>>>>>> probably
> > > > >>>>>>>>>>>>> need to go back to the design of separate queues, one
> > > > >>> for
> > > > >>>>> each
> > > > >>>>>>>>>> priority
> > > > >>>>>>>>>>>>> level.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> In summary, I'm ok with both designs and lean toward
> > > > >>> your
> > > > >>>>>>>> suggested
> > > > >>>>>>>>>>>>> approach.
> > > > >>>>>>>>>>>>> Let's hear what others think.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> @Becket,
> > > > >>>>>>>>>>>>> In light of Mayuresh's suggested new design, I'm
> > > > >>> answering
> > > > >>>>>> your
> > > > >>>>>>>>>>> question
> > > > >>>>>>>>>>>>> only in the context
> > > > >>>>>>>>>>>>> of the current KIP design: I think your suggestion
> > > > >> makes
> > > > >>>>>> sense,
> > > > >>>>>>>> and
> > > > >>>>>>>>>> I'm
> > > > >>>>>>>>>>>> ok
> > > > >>>>>>>>>>>>> with removing the capacity config and
> > > > >>>>>>>>>>>>> just relying on the default value of 20 being
> > > > >> sufficient
> > > > >>>>>> enough.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>> Lucas
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh Gharat <
> > > > >>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Hi Lucas,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Seems like the main intent here is to prioritize the
> > > > >>>>>>>> controller
> > > > >>>>>>>>>>> request
> > > > >>>>>>>>>>>>>> over any other requests.
> > > > >>>>>>>>>>>>>> In that case, we can change the request queue to a
> > > > >>>>> dequeue,
> > > > >>>>>>>> where
> > > > >>>>>>>>>> you
> > > > >>>>>>>>>>>>>> always insert the normal requests (produce,
> > > > >>>> consume,..etc)
> > > > >>>>>> to
> > > > >>>>>>>> the
> > > > >>>>>>>>>> end
> > > > >>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>> the dequeue, but if its a controller request, you
> > > > >>> insert
> > > > >>>>> it
> > > > >>>>>> to
> > > > >>>>>>>>> the
> > > > >>>>>>>>>>> head
> > > > >>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>> the queue. This ensures that the controller request
> > > > >>> will
> > > > >>>>> be
> > > > >>>>>>>> given
> > > > >>>>>>>>>>>> higher
> > > > >>>>>>>>>>>>>> priority over other requests.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Also since we only read one request from the socket
> > > > >>> and
> > > > >>>>> mute
> > > > >>>>>>>> it
> > > > >>>>>>>>> and
> > > > >>>>>>>>>>>> only
> > > > >>>>>>>>>>>>>> unmute it after handling the request, this would
> > > > >>> ensure
> > > > >>>>> that
> > > > >>>>>>>> we
> > > > >>>>>>>>>> don't
> > > > >>>>>>>>>>>>>> handle controller requests out of order.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> With this approach we can avoid the second queue and
> > > > >>> the
> > > > >>>>>>>>> additional
> > > > >>>>>>>>>>>>> config
> > > > >>>>>>>>>>>>>> for the size of the queue.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> What do you think ?
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Mayuresh
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <
> > > > >>>>>>>> [email protected]
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Hey Joel,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thank for the detail explanation. I agree the
> > > > >>> current
> > > > >>>>>> design
> > > > >>>>>>>>>> makes
> > > > >>>>>>>>>>>>> sense.
> > > > >>>>>>>>>>>>>>> My confusion is about whether the new config for
> > > > >> the
> > > > >>>>>>>> controller
> > > > >>>>>>>>>>> queue
> > > > >>>>>>>>>>>>>>> capacity is necessary. I cannot think of a case in
> > > > >>>> which
> > > > >>>>>>>> users
> > > > >>>>>>>>>>> would
> > > > >>>>>>>>>>>>>> change
> > > > >>>>>>>>>>>>>>> it.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin <
> > > > >>>>>>>>>> [email protected]>
> > > > >>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Hi Lucas,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> I guess my question can be rephrased to "do we
> > > > >>>> expect
> > > > >>>>>>>> user to
> > > > >>>>>>>>>>> ever
> > > > >>>>>>>>>>>>>> change
> > > > >>>>>>>>>>>>>>>> the controller request queue capacity"? If we
> > > > >>> agree
> > > > >>>>> that
> > > > >>>>>>>> 20
> > > > >>>>>>>>> is
> > > > >>>>>>>>>>>>> already
> > > > >>>>>>>>>>>>>> a
> > > > >>>>>>>>>>>>>>>> very generous default number and we do not
> > > > >> expect
> > > > >>>> user
> > > > >>>>>> to
> > > > >>>>>>>>>> change
> > > > >>>>>>>>>>>> it,
> > > > >>>>>>>>>>>>> is
> > > > >>>>>>>>>>>>>>> it
> > > > >>>>>>>>>>>>>>>> still necessary to expose this as a config?
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang <
> > > > >>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> @Becket
> > > > >>>>>>>>>>>>>>>>> 1. Thanks for the comment. You are right that
> > > > >>>>> normally
> > > > >>>>>>>> there
> > > > >>>>>>>>>>>> should
> > > > >>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>> just
> > > > >>>>>>>>>>>>>>>>> one controller request because of muting,
> > > > >>>>>>>>>>>>>>>>> and I had NOT intended to say there would be
> > > > >> many
> > > > >>>>>>>> enqueued
> > > > >>>>>>>>>>>>> controller
> > > > >>>>>>>>>>>>>>>>> requests.
> > > > >>>>>>>>>>>>>>>>> I went through the KIP again, and I'm not sure
> > > > >>>> which
> > > > >>>>>> part
> > > > >>>>>>>>>>> conveys
> > > > >>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>> info.
> > > > >>>>>>>>>>>>>>>>> I'd be happy to revise if you point it out the
> > > > >>>>> section.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> 2. Though it should not happen in normal
> > > > >>>> conditions,
> > > > >>>>>> the
> > > > >>>>>>>>>> current
> > > > >>>>>>>>>>>>>> design
> > > > >>>>>>>>>>>>>>>>> does not preclude multiple controllers running
> > > > >>>>>>>>>>>>>>>>> at the same time, hence if we don't have the
> > > > >>>>> controller
> > > > >>>>>>>>> queue
> > > > >>>>>>>>>>>>> capacity
> > > > >>>>>>>>>>>>>>>>> config and simply make its capacity to be 1,
> > > > >>>>>>>>>>>>>>>>> network threads handling requests from
> > > > >> different
> > > > >>>>>>>> controllers
> > > > >>>>>>>>>>> will
> > > > >>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>> blocked during those troublesome times,
> > > > >>>>>>>>>>>>>>>>> which is probably not what we want. On the
> > > > >> other
> > > > >>>>> hand,
> > > > >>>>>>>>> adding
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>> extra
> > > > >>>>>>>>>>>>>>>>> config with a default value, say 20, guards us
> > > > >>> from
> > > > >>>>>>>> issues
> > > > >>>>>>>>> in
> > > > >>>>>>>>>>>> those
> > > > >>>>>>>>>>>>>>>>> troublesome times, and IMO there isn't much
> > > > >>>> downside
> > > > >>>>> of
> > > > >>>>>>>>> adding
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>> extra
> > > > >>>>>>>>>>>>>>>>> config.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> @Mayuresh
> > > > >>>>>>>>>>>>>>>>> Good catch, this sentence is an obsolete
> > > > >>> statement
> > > > >>>>>> based
> > > > >>>>>>>> on
> > > > >>>>>>>>> a
> > > > >>>>>>>>>>>>> previous
> > > > >>>>>>>>>>>>>>>>> design. I've revised the wording in the KIP.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>> Lucas
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh
> > > > >>> Gharat <
> > > > >>>>>>>>>>>>>>>>> [email protected]> wrote:
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Hi Lucas,
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP.
> > > > >>>>>>>>>>>>>>>>>> I am trying to understand why you think "The
> > > > >>>> memory
> > > > >>>>>>>>>>> consumption
> > > > >>>>>>>>>>>>> can
> > > > >>>>>>>>>>>>>>> rise
> > > > >>>>>>>>>>>>>>>>>> given the total number of queued requests can
> > > > >>> go
> > > > >>>> up
> > > > >>>>>> to
> > > > >>>>>>>> 2x"
> > > > >>>>>>>>>> in
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> impact
> > > > >>>>>>>>>>>>>>>>>> section. Normally the requests from
> > > > >> controller
> > > > >>>> to a
> > > > >>>>>>>> Broker
> > > > >>>>>>>>>> are
> > > > >>>>>>>>>>>> not
> > > > >>>>>>>>>>>>>>> high
> > > > >>>>>>>>>>>>>>>>>> volume, right ?
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Mayuresh
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 5:06 AM Becket Qin <
> > > > >>>>>>>>>>>> [email protected]>
> > > > >>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Thanks for the KIP, Lucas. Separating the
> > > > >>>> control
> > > > >>>>>>>> plane
> > > > >>>>>>>>>> from
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> data
> > > > >>>>>>>>>>>>>>>>>> plane
> > > > >>>>>>>>>>>>>>>>>>> makes a lot of sense.
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> In the KIP you mentioned that the
> > > > >> controller
> > > > >>>>>> request
> > > > >>>>>>>>> queue
> > > > >>>>>>>>>>> may
> > > > >>>>>>>>>>>>>> have
> > > > >>>>>>>>>>>>>>>>> many
> > > > >>>>>>>>>>>>>>>>>>> requests in it. Will this be a common case?
> > > > >>> The
> > > > >>>>>>>>> controller
> > > > >>>>>>>>>>>>>> requests
> > > > >>>>>>>>>>>>>>>>> still
> > > > >>>>>>>>>>>>>>>>>>> goes through the SocketServer. The
> > > > >>> SocketServer
> > > > >>>>>> will
> > > > >>>>>>>>> mute
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> channel
> > > > >>>>>>>>>>>>>>>>>> once
> > > > >>>>>>>>>>>>>>>>>>> a request is read and put into the request
> > > > >>>>> channel.
> > > > >>>>>>>> So
> > > > >>>>>>>>>>>> assuming
> > > > >>>>>>>>>>>>>>> there
> > > > >>>>>>>>>>>>>>>>> is
> > > > >>>>>>>>>>>>>>>>>>> only one connection between controller and
> > > > >>> each
> > > > >>>>>>>> broker,
> > > > >>>>>>>>> on
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> broker
> > > > >>>>>>>>>>>>>>>>>> side,
> > > > >>>>>>>>>>>>>>>>>>> there should be only one controller request
> > > > >>> in
> > > > >>>>> the
> > > > >>>>>>>>>>> controller
> > > > >>>>>>>>>>>>>>> request
> > > > >>>>>>>>>>>>>>>>>> queue
> > > > >>>>>>>>>>>>>>>>>>> at any given time. If that is the case, do
> > > > >> we
> > > > >>>>> need
> > > > >>>>>> a
> > > > >>>>>>>>>>> separate
> > > > >>>>>>>>>>>>>>>>> controller
> > > > >>>>>>>>>>>>>>>>>>> request queue capacity config? The default
> > > > >>>> value
> > > > >>>>> 20
> > > > >>>>>>>>> means
> > > > >>>>>>>>>>> that
> > > > >>>>>>>>>>>>> we
> > > > >>>>>>>>>>>>>>>>> expect
> > > > >>>>>>>>>>>>>>>>>>> there are 20 controller switches to happen
> > > > >>> in a
> > > > >>>>>> short
> > > > >>>>>>>>>> period
> > > > >>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>> time.
> > > > >>>>>>>>>>>>>>>>> I
> > > > >>>>>>>>>>>>>>>>>> am
> > > > >>>>>>>>>>>>>>>>>>> not sure whether someone should increase
> > > > >> the
> > > > >>>>>>>> controller
> > > > >>>>>>>>>>>> request
> > > > >>>>>>>>>>>>>>> queue
> > > > >>>>>>>>>>>>>>>>>>> capacity to handle such case, as it seems
> > > > >>>>>> indicating
> > > > >>>>>>>>>>> something
> > > > >>>>>>>>>>>>>> very
> > > > >>>>>>>>>>>>>>>>> wrong
> > > > >>>>>>>>>>>>>>>>>>> has happened.
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin <
> > > > >>>>>>>>>>>> [email protected]>
> > > > >>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Thanks for the update Lucas.
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> I think the motivation section is
> > > > >>> intuitive.
> > > > >>>> It
> > > > >>>>>>>> will
> > > > >>>>>>>>> be
> > > > >>>>>>>>>>> good
> > > > >>>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>>>> learn
> > > > >>>>>>>>>>>>>>>>>>> more
> > > > >>>>>>>>>>>>>>>>>>>> about the comments from other reviewers.
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> On Thu, Jul 12, 2018 at 9:48 PM, Lucas
> > > > >>> Wang <
> > > > >>>>>>>>>>>>>>> [email protected]>
> > > > >>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> I've updated the motivation section of
> > > > >>> the
> > > > >>>>> KIP
> > > > >>>>>> by
> > > > >>>>>>>>>>>> explaining
> > > > >>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>> cases
> > > > >>>>>>>>>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>>>>> would have user impacts.
> > > > >>>>>>>>>>>>>>>>>>>>> Please take a look at let me know your
> > > > >>>>>> comments.
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>>>> Lucas
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 9, 2018 at 5:53 PM, Lucas
> > > > >>> Wang
> > > > >>>> <
> > > > >>>>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> The simulation of disk being slow is
> > > > >>>> merely
> > > > >>>>>>>> for me
> > > > >>>>>>>>>> to
> > > > >>>>>>>>>>>>> easily
> > > > >>>>>>>>>>>>>>>>>>> construct
> > > > >>>>>>>>>>>>>>>>>>>> a
> > > > >>>>>>>>>>>>>>>>>>>>>> testing scenario
> > > > >>>>>>>>>>>>>>>>>>>>>> with a backlog of produce requests.
> > > > >> In
> > > > >>>>>>>> production,
> > > > >>>>>>>>>>> other
> > > > >>>>>>>>>>>>>> than
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> disk
> > > > >>>>>>>>>>>>>>>>>>>>>> being slow, a backlog of
> > > > >>>>>>>>>>>>>>>>>>>>>> produce requests may also be caused
> > > > >> by
> > > > >>>> high
> > > > >>>>>>>>> produce
> > > > >>>>>>>>>>> QPS.
> > > > >>>>>>>>>>>>>>>>>>>>>> In that case, we may not want to kill
> > > > >>> the
> > > > >>>>>>>> broker
> > > > >>>>>>>>> and
> > > > >>>>>>>>>>>>> that's
> > > > >>>>>>>>>>>>>>> when
> > > > >>>>>>>>>>>>>>>>>> this
> > > > >>>>>>>>>>>>>>>>>>>> KIP
> > > > >>>>>>>>>>>>>>>>>>>>>> can be useful, both for JBOD
> > > > >>>>>>>>>>>>>>>>>>>>>> and non-JBOD setup.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Going back to your previous question
> > > > >>>> about
> > > > >>>>>> each
> > > > >>>>>>>>>>>>>> ProduceRequest
> > > > >>>>>>>>>>>>>>>>>>> covering
> > > > >>>>>>>>>>>>>>>>>>>>> 20
> > > > >>>>>>>>>>>>>>>>>>>>>> partitions that are randomly
> > > > >>>>>>>>>>>>>>>>>>>>>> distributed, let's say a LeaderAndIsr
> > > > >>>>> request
> > > > >>>>>>>> is
> > > > >>>>>>>>>>>> enqueued
> > > > >>>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>> tries
> > > > >>>>>>>>>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>>>>>>>>> switch the current broker, say
> > > > >> broker0,
> > > > >>>>> from
> > > > >>>>>>>>> leader
> > > > >>>>>>>>>> to
> > > > >>>>>>>>>>>>>>> follower
> > > > >>>>>>>>>>>>>>>>>>>>>> *for one of the partitions*, say
> > > > >>>> *test-0*.
> > > > >>>>>> For
> > > > >>>>>>>> the
> > > > >>>>>>>>>>> sake
> > > > >>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>> argument,
> > > > >>>>>>>>>>>>>>>>>>>>>> let's also assume the other brokers,
> > > > >>> say
> > > > >>>>>>>> broker1,
> > > > >>>>>>>>>> have
> > > > >>>>>>>>>>>>>>> *stopped*
> > > > >>>>>>>>>>>>>>>>>>>> fetching
> > > > >>>>>>>>>>>>>>>>>>>>>> from
> > > > >>>>>>>>>>>>>>>>>>>>>> the current broker, i.e. broker0.
> > > > >>>>>>>>>>>>>>>>>>>>>> 1. If the enqueued produce requests
> > > > >>> have
> > > > >>>>>> acks =
> > > > >>>>>>>>> -1
> > > > >>>>>>>>>>>> (ALL)
> > > > >>>>>>>>>>>>>>>>>>>>>>  1.1 without this KIP, the
> > > > >>>> ProduceRequests
> > > > >>>>>>>> ahead
> > > > >>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>> LeaderAndISR
> > > > >>>>>>>>>>>>>>>>>>> will
> > > > >>>>>>>>>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>>>>>>> put into the purgatory,
> > > > >>>>>>>>>>>>>>>>>>>>>>        and since they'll never be
> > > > >>>>> replicated
> > > > >>>>>>>> to
> > > > >>>>>>>>>> other
> > > > >>>>>>>>>>>>>> brokers
> > > > >>>>>>>>>>>>>>>>>>> (because
> > > > >>>>>>>>>>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>>>>>> the assumption made above), they will
> > > > >>>>>>>>>>>>>>>>>>>>>>        be completed either when the
> > > > >>>>>>>> LeaderAndISR
> > > > >>>>>>>>>>>> request
> > > > >>>>>>>>>>>>> is
> > > > >>>>>>>>>>>>>>>>>>> processed
> > > > >>>>>>>>>>>>>>>>>>>> or
> > > > >>>>>>>>>>>>>>>>>>>>>> when the timeout happens.
> > > > >>>>>>>>>>>>>>>>>>>>>>  1.2 With this KIP, broker0 will
> > > > >>>>> immediately
> > > > >>>>>>>>>>> transition
> > > > >>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> partition
> > > > >>>>>>>>>>>>>>>>>>>>>> test-0 to become a follower,
> > > > >>>>>>>>>>>>>>>>>>>>>>        after the current broker sees
> > > > >>> the
> > > > >>>>>>>>>> replication
> > > > >>>>>>>>>>> of
> > > > >>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> remaining
> > > > >>>>>>>>>>>>>>>>>>>> 19
> > > > >>>>>>>>>>>>>>>>>>>>>> partitions, it can send a response
> > > > >>>>> indicating
> > > > >>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>>>>>>        it's no longer the leader for
> > > > >>> the
> > > > >>>>>>>>> "test-0".
> > > > >>>>>>>>>>>>>>>>>>>>>>  To see the latency difference
> > > > >> between
> > > > >>>> 1.1
> > > > >>>>>> and
> > > > >>>>>>>>> 1.2,
> > > > >>>>>>>>>>>> let's
> > > > >>>>>>>>>>>>>> say
> > > > >>>>>>>>>>>>>>>>>> there
> > > > >>>>>>>>>>>>>>>>>>>> are
> > > > >>>>>>>>>>>>>>>>>>>>>> 24K produce requests ahead of the
> > > > >>>>>> LeaderAndISR,
> > > > >>>>>>>>> and
> > > > >>>>>>>>>>>> there
> > > > >>>>>>>>>>>>>> are
> > > > >>>>>>>>>>>>>>> 8
> > > > >>>>>>>>>>>>>>>>> io
> > > > >>>>>>>>>>>>>>>>>>>>> threads,
> > > > >>>>>>>>>>>>>>>>>>>>>>  so each io thread will process
> > > > >>>>>> approximately
> > > > >>>>>>>>> 3000
> > > > >>>>>>>>>>>>> produce
> > > > >>>>>>>>>>>>>>>>>> requests.
> > > > >>>>>>>>>>>>>>>>>>>> Now
> > > > >>>>>>>>>>>>>>>>>>>>>> let's investigate the io thread that
> > > > >>>>> finally
> > > > >>>>>>>>>> processed
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR.
> > > > >>>>>>>>>>>>>>>>>>>>>>  For the 3000 produce requests, if
> > > > >> we
> > > > >>>>> model
> > > > >>>>>>>> the
> > > > >>>>>>>>>> time
> > > > >>>>>>>>>>>> when
> > > > >>>>>>>>>>>>>>> their
> > > > >>>>>>>>>>>>>>>>>>>>> remaining
> > > > >>>>>>>>>>>>>>>>>>>>>> 19 partitions catch up as t0, t1,
> > > > >>>> ...t2999,
> > > > >>>>>> and
> > > > >>>>>>>>> the
> > > > >>>>>>>>>>>>>>> LeaderAndISR
> > > > >>>>>>>>>>>>>>>>>>>> request
> > > > >>>>>>>>>>>>>>>>>>>>> is
> > > > >>>>>>>>>>>>>>>>>>>>>> processed at time t3000.
> > > > >>>>>>>>>>>>>>>>>>>>>>  Without this KIP, the 1st produce
> > > > >>>> request
> > > > >>>>>>>> would
> > > > >>>>>>>>>> have
> > > > >>>>>>>>>>>>>> waited
> > > > >>>>>>>>>>>>>>> an
> > > > >>>>>>>>>>>>>>>>>>> extra
> > > > >>>>>>>>>>>>>>>>>>>>>> t3000 - t0 time in the purgatory, the
> > > > >>> 2nd
> > > > >>>>> an
> > > > >>>>>>>> extra
> > > > >>>>>>>>>>> time
> > > > >>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>> t3000 -
> > > > >>>>>>>>>>>>>>>>>>> t1,
> > > > >>>>>>>>>>>>>>>>>>>>> etc.
> > > > >>>>>>>>>>>>>>>>>>>>>>  Roughly speaking, the latency
> > > > >>>> difference
> > > > >>>>> is
> > > > >>>>>>>>> bigger
> > > > >>>>>>>>>>> for
> > > > >>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>> earlier
> > > > >>>>>>>>>>>>>>>>>>>>>> produce requests than for the later
> > > > >>> ones.
> > > > >>>>> For
> > > > >>>>>>>> the
> > > > >>>>>>>>>> same
> > > > >>>>>>>>>>>>>> reason,
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> more
> > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests queued
> > > > >>>>>>>>>>>>>>>>>>>>>>  before the LeaderAndISR, the bigger
> > > > >>>>> benefit
> > > > >>>>>>>> we
> > > > >>>>>>>>> get
> > > > >>>>>>>>>>>>> (capped
> > > > >>>>>>>>>>>>>>> by
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>> produce timeout).
> > > > >>>>>>>>>>>>>>>>>>>>>> 2. If the enqueued produce requests
> > > > >>> have
> > > > >>>>>>>> acks=0 or
> > > > >>>>>>>>>>>> acks=1
> > > > >>>>>>>>>>>>>>>>>>>>>>  There will be no latency
> > > > >> differences
> > > > >>> in
> > > > >>>>>> this
> > > > >>>>>>>>> case,
> > > > >>>>>>>>>>> but
> > > > >>>>>>>>>>>>>>>>>>>>>>  2.1 without this KIP, the records
> > > > >> of
> > > > >>>>>>>> partition
> > > > >>>>>>>>>>> test-0
> > > > >>>>>>>>>>>> in
> > > > >>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests ahead of the
> > > > >>> LeaderAndISR
> > > > >>>>>> will
> > > > >>>>>>>> be
> > > > >>>>>>>>>>>> appended
> > > > >>>>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> local
> > > > >>>>>>>>>>>>>>>>>>>>> log,
> > > > >>>>>>>>>>>>>>>>>>>>>>        and eventually be truncated
> > > > >>> after
> > > > >>>>>>>>> processing
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>> LeaderAndISR.
> > > > >>>>>>>>>>>>>>>>>>>>>> This is what's referred to as
> > > > >>>>>>>>>>>>>>>>>>>>>>        "some unofficial definition
> > > > >> of
> > > > >>>> data
> > > > >>>>>>>> loss
> > > > >>>>>>>>> in
> > > > >>>>>>>>>>>> terms
> > > > >>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>> messages
> > > > >>>>>>>>>>>>>>>>>>>>>> beyond the high watermark".
> > > > >>>>>>>>>>>>>>>>>>>>>>  2.2 with this KIP, we can mitigate
> > > > >>> the
> > > > >>>>>> effect
> > > > >>>>>>>>>> since
> > > > >>>>>>>>>>> if
> > > > >>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR
> > > > >>>>>>>>>>>>>>>>>>>>>> is immediately processed, the
> > > > >> response
> > > > >>> to
> > > > >>>>>>>>> producers
> > > > >>>>>>>>>>> will
> > > > >>>>>>>>>>>>>> have
> > > > >>>>>>>>>>>>>>>>>>>>>>        the NotLeaderForPartition
> > > > >>> error,
> > > > >>>>>>>> causing
> > > > >>>>>>>>>>>> producers
> > > > >>>>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>>>> retry
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> This explanation above is the benefit
> > > > >>> for
> > > > >>>>>>>> reducing
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>>>>> latency
> > > > >>>>>>>>>>>>>>>>> of a
> > > > >>>>>>>>>>>>>>>>>>>>> broker
> > > > >>>>>>>>>>>>>>>>>>>>>> becoming the follower,
> > > > >>>>>>>>>>>>>>>>>>>>>> closely related is reducing the
> > > > >> latency
> > > > >>>> of
> > > > >>>>> a
> > > > >>>>>>>>> broker
> > > > >>>>>>>>>>>>> becoming
> > > > >>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> leader.
> > > > >>>>>>>>>>>>>>>>>>>>>> In this case, the benefit is even
> > > > >> more
> > > > >>>>>>>> obvious, if
> > > > >>>>>>>>>>> other
> > > > >>>>>>>>>>>>>>> brokers
> > > > >>>>>>>>>>>>>>>>>> have
> > > > >>>>>>>>>>>>>>>>>>>>>> resigned leadership, and the
> > > > >>>>>>>>>>>>>>>>>>>>>> current broker should take
> > > > >> leadership.
> > > > >>>> Any
> > > > >>>>>>>> delay
> > > > >>>>>>>>> in
> > > > >>>>>>>>>>>>>> processing
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>> LeaderAndISR will be perceived
> > > > >>>>>>>>>>>>>>>>>>>>>> by clients as unavailability. In
> > > > >>> extreme
> > > > >>>>>> cases,
> > > > >>>>>>>>> this
> > > > >>>>>>>>>>> can
> > > > >>>>>>>>>>>>>> cause
> > > > >>>>>>>>>>>>>>>>>> failed
> > > > >>>>>>>>>>>>>>>>>>>>>> produce requests if the retries are
> > > > >>>>>>>>>>>>>>>>>>>>>> exhausted.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Another two types of controller
> > > > >>> requests
> > > > >>>>> are
> > > > >>>>>>>>>>>>> UpdateMetadata
> > > > >>>>>>>>>>>>>>> and
> > > > >>>>>>>>>>>>>>>>>>>>>> StopReplica, which I'll briefly
> > > > >> discuss
> > > > >>>> as
> > > > >>>>>>>>> follows:
> > > > >>>>>>>>>>>>>>>>>>>>>> For UpdateMetadata requests, delayed
> > > > >>>>>> processing
> > > > >>>>>>>>>> means
> > > > >>>>>>>>>>>>>> clients
> > > > >>>>>>>>>>>>>>>>>>> receiving
> > > > >>>>>>>>>>>>>>>>>>>>>> stale metadata, e.g. with the wrong
> > > > >>>>>> leadership
> > > > >>>>>>>>> info
> > > > >>>>>>>>>>>>>>>>>>>>>> for certain partitions, and the
> > > > >> effect
> > > > >>> is
> > > > >>>>>> more
> > > > >>>>>>>>>> retries
> > > > >>>>>>>>>>>> or
> > > > >>>>>>>>>>>>>> even
> > > > >>>>>>>>>>>>>>>>>> fatal
> > > > >>>>>>>>>>>>>>>>>>>>>> failure if the retries are exhausted.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> For StopReplica requests, a long
> > > > >>> queuing
> > > > >>>>> time
> > > > >>>>>>>> may
> > > > >>>>>>>>>>>> degrade
> > > > >>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> performance
> > > > >>>>>>>>>>>>>>>>>>>>>> of topic deletion.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Regarding your last question of the
> > > > >>> delay
> > > > >>>>> for
> > > > >>>>>>>>>>>>>>>>>> DescribeLogDirsRequest,
> > > > >>>>>>>>>>>>>>>>>>>> you
> > > > >>>>>>>>>>>>>>>>>>>>>> are right
> > > > >>>>>>>>>>>>>>>>>>>>>> that this KIP cannot help with the
> > > > >>>> latency
> > > > >>>>> in
> > > > >>>>>>>>>> getting
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>> log
> > > > >>>>>>>>>>>>>>>>> dirs
> > > > >>>>>>>>>>>>>>>>>>>> info,
> > > > >>>>>>>>>>>>>>>>>>>>>> and it's only relevant
> > > > >>>>>>>>>>>>>>>>>>>>>> when controller requests are
> > > > >> involved.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>>>>>>>>> Lucas
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 3, 2018 at 5:11 PM, Dong
> > > > >>> Lin
> > > > >>>> <
> > > > >>>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Hey Jun,
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Thanks much for the comments. It is
> > > > >>> good
> > > > >>>>>>>> point.
> > > > >>>>>>>>> So
> > > > >>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> feature
> > > > >>>>>>>>>>>>>>>>> may
> > > > >>>>>>>>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>>>>>>>> useful for JBOD use-case. I have one
> > > > >>>>>> question
> > > > >>>>>>>>>> below.
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Hey Lucas,
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Do you think this feature is also
> > > > >>> useful
> > > > >>>>> for
> > > > >>>>>>>>>> non-JBOD
> > > > >>>>>>>>>>>>> setup
> > > > >>>>>>>>>>>>>>> or
> > > > >>>>>>>>>>>>>>>>> it
> > > > >>>>>>>>>>>>>>>>>> is
> > > > >>>>>>>>>>>>>>>>>>>>> only
> > > > >>>>>>>>>>>>>>>>>>>>>>> useful for the JBOD setup? It may be
> > > > >>>>> useful
> > > > >>>>>> to
> > > > >>>>>>>>>>>> understand
> > > > >>>>>>>>>>>>>>> this.
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> When the broker is setup using JBOD,
> > > > >>> in
> > > > >>>>>> order
> > > > >>>>>>>> to
> > > > >>>>>>>>>> move
> > > > >>>>>>>>>>>>>> leaders
> > > > >>>>>>>>>>>>>>>>> on
> > > > >>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>> failed
> > > > >>>>>>>>>>>>>>>>>>>>>>> disk to other disks, the system
> > > > >>> operator
> > > > >>>>>> first
> > > > >>>>>>>>>> needs
> > > > >>>>>>>>>>> to
> > > > >>>>>>>>>>>>> get
> > > > >>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>> list
> > > > >>>>>>>>>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>>>>>>> partitions on the failed disk. This
> > > > >> is
> > > > >>>>>>>> currently
> > > > >>>>>>>>>>>> achieved
> > > > >>>>>>>>>>>>>>> using
> > > > >>>>>>>>>>>>>>>>>>>>>>> AdminClient.describeLogDirs(), which
> > > > >>>> sends
> > > > >>>>>>>>>>>>>>>>> DescribeLogDirsRequest
> > > > >>>>>>>>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>> broker. If we only prioritize the
> > > > >>>>> controller
> > > > >>>>>>>>>>> requests,
> > > > >>>>>>>>>>>>> then
> > > > >>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>> DescribeLogDirsRequest
> > > > >>>>>>>>>>>>>>>>>>>>>>> may still take a long time to be
> > > > >>>> processed
> > > > >>>>>> by
> > > > >>>>>>>> the
> > > > >>>>>>>>>>>> broker.
> > > > >>>>>>>>>>>>>> So
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> overall
> > > > >>>>>>>>>>>>>>>>>>>>>>> time to move leaders away from the
> > > > >>>> failed
> > > > >>>>>> disk
> > > > >>>>>>>>> may
> > > > >>>>>>>>>>>> still
> > > > >>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>> long
> > > > >>>>>>>>>>>>>>>>>>> even
> > > > >>>>>>>>>>>>>>>>>>>>> with
> > > > >>>>>>>>>>>>>>>>>>>>>>> this KIP. What do you think?
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>>>>>> Dong
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 3, 2018 at 4:38 PM,
> > > > >> Lucas
> > > > >>>>> Wang <
> > > > >>>>>>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the insightful comment,
> > > > >>>> Jun.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>> @Dong,
> > > > >>>>>>>>>>>>>>>>>>>>>>>> Since both of the two comments in
> > > > >>> your
> > > > >>>>>>>> previous
> > > > >>>>>>>>>>> email
> > > > >>>>>>>>>>>>> are
> > > > >>>>>>>>>>>>>>>>> about
> > > > >>>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>>> benefits of this KIP and whether
> > > > >>> it's
> > > > >>>>>>>> useful,
> > > > >>>>>>>>>>>>>>>>>>>>>>>> in light of Jun's last comment, do
> > > > >>> you
> > > > >>>>>> agree
> > > > >>>>>>>>> that
> > > > >>>>>>>>>>>> this
> > > > >>>>>>>>>>>>>> KIP
> > > > >>>>>>>>>>>>>>>>> can
> > > > >>>>>>>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>>>>>>>>> beneficial in the case mentioned
> > > > >> by
> > > > >>>> Jun?
> > > > >>>>>>>>>>>>>>>>>>>>>>>> Please let me know, thanks!
> > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>>>>>>>>>>> Lucas
> > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 3, 2018 at 2:07 PM,
> > > > >> Jun
> > > > >>>> Rao
> > > > >>>>> <
> > > > >>>>>>>>>>>>>> [email protected]>
> > > > >>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> Hi, Lucas, Dong,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> If all disks on a broker are
> > > > >> slow,
> > > > >>>> one
> > > > >>>>>>>>> probably
> > > > >>>>>>>>>>>>> should
> > > > >>>>>>>>>>>>>>> just
> > > > >>>>>>>>>>>>>>>>>> kill
> > > > >>>>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> broker. In that case, this KIP
> > > > >> may
> > > > >>>> not
> > > > >>>>>>>> help.
> > > > >>>>>>>>> If
> > > > >>>>>>>>>>>> only
> > > > >>>>>>>>>>>>>> one
> > > > >>>>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>> disks
> > > > >>>>>>>>>>>>>>>>>>>>>>> on
> > > > >>>>>>>>>>>>>>>>>>>>>>>> a
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> broker is slow, one may want to
> > > > >>> fail
> > > > >>>>>> that
> > > > >>>>>>>>> disk
> > > > >>>>>>>>>>> and
> > > > >>>>>>>>>>>>> move
> > > > >>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>> leaders
> > > > >>>>>>>>>>>>>>>>>>>>> on
> > > > >>>>>>>>>>>>>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> disk to other brokers. In that
> > > > >>> case,
> > > > >>>>>> being
> > > > >>>>>>>>> able
> > > > >>>>>>>>>>> to
> > > > >>>>>>>>>>>>>>> process
> > > > >>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>>> LeaderAndIsr
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> requests faster will potentially
> > > > >>>> help
> > > > >>>>>> the
> > > > >>>>>>>>>>> producers
> > > > >>>>>>>>>>>>>>> recover
> > > > >>>>>>>>>>>>>>>>>>>> quicker.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> Jun
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jul 2, 2018 at 7:56 PM,
> > > > >>> Dong
> > > > >>>>>> Lin <
> > > > >>>>>>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Hey Lucas,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the reply. Some
> > > > >>> follow
> > > > >>>> up
> > > > >>>>>>>>>> questions
> > > > >>>>>>>>>>>>> below.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Regarding 1, if each
> > > > >>>> ProduceRequest
> > > > >>>>>>>> covers
> > > > >>>>>>>>> 20
> > > > >>>>>>>>>>>>>>> partitions
> > > > >>>>>>>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>>>> are
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> randomly
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> distributed across all
> > > > >>> partitions,
> > > > >>>>>> then
> > > > >>>>>>>>> each
> > > > >>>>>>>>>>>>>>>>> ProduceRequest
> > > > >>>>>>>>>>>>>>>>>>> will
> > > > >>>>>>>>>>>>>>>>>>>>>>> likely
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> cover some partitions for
> > > > >> which
> > > > >>>> the
> > > > >>>>>>>> broker
> > > > >>>>>>>>> is
> > > > >>>>>>>>>>>> still
> > > > >>>>>>>>>>>>>>>>> leader
> > > > >>>>>>>>>>>>>>>>>>> after
> > > > >>>>>>>>>>>>>>>>>>>>> it
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> quickly
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> processes the
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> LeaderAndIsrRequest. Then
> > > > >> broker
> > > > >>>>> will
> > > > >>>>>>>> still
> > > > >>>>>>>>>> be
> > > > >>>>>>>>>>>> slow
> > > > >>>>>>>>>>>>>> in
> > > > >>>>>>>>>>>>>>>>>>>> processing
> > > > >>>>>>>>>>>>>>>>>>>>>>> these
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> ProduceRequest and request
> > > > >> will
> > > > >>>>> still
> > > > >>>>>> be
> > > > >>>>>>>>> very
> > > > >>>>>>>>>>>> high
> > > > >>>>>>>>>>>>>> with
> > > > >>>>>>>>>>>>>>>>> this
> > > > >>>>>>>>>>>>>>>>>>>> KIP.
> > > > >>>>>>>>>>>>>>>>>>>>> It
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> seems
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> that most ProduceRequest will
> > > > >>>> still
> > > > >>>>>>>> timeout
> > > > >>>>>>>>>>> after
> > > > >>>>>>>>>>>>> 30
> > > > >>>>>>>>>>>>>>>>>> seconds.
> > > > >>>>>>>>>>>>>>>>>>> Is
> > > > >>>>>>>>>>>>>>>>>>>>>>> this
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> understanding correct?
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Regarding 2, if most
> > > > >>>> ProduceRequest
> > > > >>>>>> will
> > > > >>>>>>>>>> still
> > > > >>>>>>>>>>>>>> timeout
> > > > >>>>>>>>>>>>>>>>> after
> > > > >>>>>>>>>>>>>>>>>>> 30
> > > > >>>>>>>>>>>>>>>>>>>>>>>> seconds,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> then it is less clear how this
> > > > >>> KIP
> > > > >>>>>>>> reduces
> > > > >>>>>>>>>>>> average
> > > > >>>>>>>>>>>>>>>>> produce
> > > > >>>>>>>>>>>>>>>>>>>>> latency.
> > > > >>>>>>>>>>>>>>>>>>>>>>> Can
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> you
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> clarify what metrics can be
> > > > >>>> improved
> > > > >>>>>> by
> > > > >>>>>>>>> this
> > > > >>>>>>>>>>> KIP?
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Not sure why system operator
> > > > >>>>> directly
> > > > >>>>>>>> cares
> > > > >>>>>>>>>>>> number
> > > > >>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>>> truncated
> > > > >>>>>>>>>>>>>>>>>>>>>>>> messages.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Do you mean this KIP can
> > > > >> improve
> > > > >>>>>> average
> > > > >>>>>>>>>>>> throughput
> > > > >>>>>>>>>>>>>> or
> > > > >>>>>>>>>>>>>>>>>> reduce
> > > > >>>>>>>>>>>>>>>>>>>>>>> message
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> duplication? It will be good
> > > > >> to
> > > > >>>>>>>> understand
> > > > >>>>>>>>>>> this.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Dong
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 3 Jul 2018 at 7:12 AM
> > > > >>>> Lucas
> > > > >>>>>>>> Wang <
> > > > >>>>>>>>>>>>>>>>>>> [email protected]
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for your valuable
> > > > >>>> comments.
> > > > >>>>>>>> Please
> > > > >>>>>>>>>> see
> > > > >>>>>>>>>>>> my
> > > > >>>>>>>>>>>>>>> reply
> > > > >>>>>>>>>>>>>>>>>>> below.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> 1. The Google doc showed
> > > > >> only
> > > > >>> 1
> > > > >>>>>>>>> partition.
> > > > >>>>>>>>>>> Now
> > > > >>>>>>>>>>>>>> let's
> > > > >>>>>>>>>>>>>>>>>>> consider
> > > > >>>>>>>>>>>>>>>>>>>> a
> > > > >>>>>>>>>>>>>>>>>>>>>>> more
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> common
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> scenario
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> where broker0 is the leader
> > > > >> of
> > > > >>>>> many
> > > > >>>>>>>>>>> partitions.
> > > > >>>>>>>>>>>>> And
> > > > >>>>>>>>>>>>>>>>> let's
> > > > >>>>>>>>>>>>>>>>>>> say
> > > > >>>>>>>>>>>>>>>>>>>>> for
> > > > >>>>>>>>>>>>>>>>>>>>>>>> some
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> reason its IO becomes slow.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> The number of leader
> > > > >>> partitions
> > > > >>>> on
> > > > >>>>>>>>> broker0
> > > > >>>>>>>>>> is
> > > > >>>>>>>>>>>> so
> > > > >>>>>>>>>>>>>>> large,
> > > > >>>>>>>>>>>>>>>>>> say
> > > > >>>>>>>>>>>>>>>>>>>> 10K,
> > > > >>>>>>>>>>>>>>>>>>>>>>> that
> > > > >>>>>>>>>>>>>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> cluster is skewed,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> and the operator would like
> > > > >> to
> > > > >>>>> shift
> > > > >>>>>>>> the
> > > > >>>>>>>>>>>>> leadership
> > > > >>>>>>>>>>>>>>>>> for a
> > > > >>>>>>>>>>>>>>>>>>> lot
> > > > >>>>>>>>>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> partitions, say 9K, to other
> > > > >>>>>> brokers,
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> either manually or through
> > > > >>> some
> > > > >>>>>>>> service
> > > > >>>>>>>>>> like
> > > > >>>>>>>>>>>>> cruise
> > > > >>>>>>>>>>>>>>>>>> control.
> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> With this KIP, not only will
> > > > >>> the
> > > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> >
>

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

Reply via email to