A kind reminder for review of this KIP. Thank you very much! Lucas
On Wed, Jul 25, 2018 at 10:23 PM, Lucas Wang <lucasatu...@gmail.com> wrote: > Hi All, > > I've updated the KIP by adding the dedicated endpoints for controller > connections, > and pinning threads for controller requests. > Also I've updated the title of this KIP. Please take a look and let me > know your feedback. > > Thanks a lot for your time! > Lucas > > On Tue, Jul 24, 2018 at 10:19 AM, Mayuresh Gharat < > gharatmayures...@gmail.com> wrote: > >> Hi Lucas, >> I agree, if we want to go forward with a separate controller plane and >> data >> plane and completely isolate them, having a separate port for controller >> with a separate Acceptor and a Processor sounds ideal to me. >> >> Thanks, >> >> Mayuresh >> >> >> On Mon, Jul 23, 2018 at 11:04 PM Becket Qin <becket....@gmail.com> wrote: >> >> > Hi Lucas, >> > >> > Yes, I agree that a dedicated end to end control flow would be ideal. >> > >> > Thanks, >> > >> > Jiangjie (Becket) Qin >> > >> > On Tue, Jul 24, 2018 at 1:05 PM, Lucas Wang <lucasatu...@gmail.com> >> wrote: >> > >> > > Thanks for the comment, Becket. >> > > So far, we've been trying to avoid making any request handler thread >> > > special. >> > > But if we were to follow that path in order to make the two planes >> more >> > > isolated, >> > > what do you think about also having a dedicated processor thread, >> > > and dedicated port for the controller? >> > > >> > > Today one processor thread can handle multiple connections, let's say >> 100 >> > > connections >> > > >> > > represented by connection0, ... connection99, among which >> connection0-98 >> > > are from clients, while connection99 is from >> > > >> > > the controller. Further let's say after one selector polling, there >> are >> > > incoming requests on all connections. >> > > >> > > When the request queue is full, (either the data request being full in >> > the >> > > two queue design, or >> > > >> > > the one single queue being full in the deque design), the processor >> > thread >> > > will be blocked first >> > > >> > > when trying to enqueue the data request from connection0, then >> possibly >> > > blocked for the data request >> > > >> > > from connection1, ... etc even though the controller request is ready >> to >> > be >> > > enqueued. >> > > >> > > To solve this problem, it seems we would need to have a separate port >> > > dedicated to >> > > >> > > the controller, a dedicated processor thread, a dedicated controller >> > > request queue, >> > > >> > > and pinning of one request handler thread for controller requests. >> > > >> > > Thanks, >> > > Lucas >> > > >> > > >> > > On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin <becket....@gmail.com> >> > wrote: >> > > >> > > > Personally I am not fond of the dequeue approach simply because it >> is >> > > > against the basic idea of isolating the controller plane and data >> > plane. >> > > > With a single dequeue, theoretically speaking the controller >> requests >> > can >> > > > starve the clients requests. I would prefer the approach with a >> > separate >> > > > controller request queue and a dedicated controller request handler >> > > thread. >> > > > >> > > > Thanks, >> > > > >> > > > Jiangjie (Becket) Qin >> > > > >> > > > On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang <lucasatu...@gmail.com> >> > > wrote: >> > > > >> > > > > Sure, I can summarize the usage of correlation id. But before I do >> > > that, >> > > > it >> > > > > seems >> > > > > the same out-of-order processing can also happen to Produce >> requests >> > > sent >> > > > > by producers, >> > > > > following the same example you described earlier. >> > > > > If that's the case, I think this probably deserves a separate doc >> and >> > > > > design independent of this KIP. >> > > > > >> > > > > Lucas >> > > > > >> > > > > >> > > > > >> > > > > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin <lindon...@gmail.com> >> > > wrote: >> > > > > >> > > > > > Hey Lucas, >> > > > > > >> > > > > > Could you update the KIP if you are confident with the approach >> > which >> > > > > uses >> > > > > > correlation id? The idea around correlation id is kind of >> scattered >> > > > > across >> > > > > > multiple emails. It will be useful if other reviews can read the >> > KIP >> > > to >> > > > > > understand the latest proposal. >> > > > > > >> > > > > > Thanks, >> > > > > > Dong >> > > > > > >> > > > > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat < >> > > > > > gharatmayures...@gmail.com> wrote: >> > > > > > >> > > > > > > I like the idea of the dequeue implementation by Lucas. This >> will >> > > > help >> > > > > us >> > > > > > > avoid additional queue for controller and additional configs >> in >> > > > Kafka. >> > > > > > > >> > > > > > > Thanks, >> > > > > > > >> > > > > > > Mayuresh >> > > > > > > >> > > > > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin < >> becket....@gmail.com >> > > >> > > > > wrote: >> > > > > > > >> > > > > > > > Hi Jun, >> > > > > > > > >> > > > > > > > The usage of correlation ID might still be useful to address >> > the >> > > > > cases >> > > > > > > > that the controller epoch and leader epoch check are not >> > > sufficient >> > > > > to >> > > > > > > > guarantee correct behavior. For example, if the controller >> > sends >> > > a >> > > > > > > > LeaderAndIsrRequest followed by a StopReplicaRequest, and >> the >> > > > broker >> > > > > > > > processes it in the reverse order, the replica may still be >> > > wrongly >> > > > > > > > recreated, right? >> > > > > > > > >> > > > > > > > Thanks, >> > > > > > > > >> > > > > > > > Jiangjie (Becket) Qin >> > > > > > > > >> > > > > > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao <j...@confluent.io> >> > > wrote: >> > > > > > > > > >> > > > > > > > > Hmm, since we already use controller epoch and leader >> epoch >> > for >> > > > > > > properly >> > > > > > > > > caching the latest partition state, do we really need >> > > correlation >> > > > > id >> > > > > > > for >> > > > > > > > > ordering the controller requests? >> > > > > > > > > >> > > > > > > > > Thanks, >> > > > > > > > > >> > > > > > > > > Jun >> > > > > > > > > >> > > > > > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin < >> > > > becket....@gmail.com> >> > > > > > > > wrote: >> > > > > > > > > >> > > > > > > > >> Lucas and Mayuresh, >> > > > > > > > >> >> > > > > > > > >> Good idea. The correlation id should work. >> > > > > > > > >> >> > > > > > > > >> In the ControllerChannelManager, a request will be resent >> > > until >> > > > a >> > > > > > > > response >> > > > > > > > >> is received. So if the controller to broker connection >> > > > disconnects >> > > > > > > after >> > > > > > > > >> controller sends R1_a, but before the response of R1_a is >> > > > > received, >> > > > > > a >> > > > > > > > >> disconnection may cause the controller to resend R1_b. >> i.e. >> > > > until >> > > > > R1 >> > > > > > > is >> > > > > > > > >> acked, R2 won't be sent by the controller. >> > > > > > > > >> This gives two guarantees: >> > > > > > > > >> 1. Correlation id wise: R1_a < R1_b < R2. >> > > > > > > > >> 2. On the broker side, when R2 is seen, R1 must have been >> > > > > processed >> > > > > > at >> > > > > > > > >> least once. >> > > > > > > > >> >> > > > > > > > >> So on the broker side, with a single thread controller >> > request >> > > > > > > handler, >> > > > > > > > the >> > > > > > > > >> logic should be: >> > > > > > > > >> 1. Process what ever request seen in the controller >> request >> > > > queue >> > > > > > > > >> 2. For the given epoch, drop request if its correlation >> id >> > is >> > > > > > smaller >> > > > > > > > than >> > > > > > > > >> that of the last processed request. >> > > > > > > > >> >> > > > > > > > >> Thanks, >> > > > > > > > >> >> > > > > > > > >> Jiangjie (Becket) Qin >> > > > > > > > >> >> > > > > > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao < >> j...@confluent.io> >> > > > > wrote: >> > > > > > > > >> >> > > > > > > > >>> I agree that there is no strong ordering when there are >> > more >> > > > than >> > > > > > one >> > > > > > > > >>> socket connections. Currently, we rely on >> controllerEpoch >> > and >> > > > > > > > leaderEpoch >> > > > > > > > >>> to ensure that the receiving broker picks up the latest >> > state >> > > > for >> > > > > > > each >> > > > > > > > >>> partition. >> > > > > > > > >>> >> > > > > > > > >>> One potential issue with the dequeue approach is that if >> > the >> > > > > queue >> > > > > > is >> > > > > > > > >> full, >> > > > > > > > >>> there is no guarantee that the controller requests will >> be >> > > > > enqueued >> > > > > > > > >>> quickly. >> > > > > > > > >>> >> > > > > > > > >>> Thanks, >> > > > > > > > >>> >> > > > > > > > >>> Jun >> > > > > > > > >>> >> > > > > > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat < >> > > > > > > > >>> gharatmayures...@gmail.com >> > > > > > > > >>>> wrote: >> > > > > > > > >>> >> > > > > > > > >>>> Yea, the correlationId is only set to 0 in the >> > NetworkClient >> > > > > > > > >> constructor. >> > > > > > > > >>>> Since we reuse the same NetworkClient between >> Controller >> > and >> > > > the >> > > > > > > > >> broker, >> > > > > > > > >>> a >> > > > > > > > >>>> disconnection should not cause it to reset to 0, in >> which >> > > case >> > > > > it >> > > > > > > can >> > > > > > > > >> be >> > > > > > > > >>>> used to reject obsolete requests. >> > > > > > > > >>>> >> > > > > > > > >>>> Thanks, >> > > > > > > > >>>> >> > > > > > > > >>>> Mayuresh >> > > > > > > > >>>> >> > > > > > > > >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang < >> > > > > lucasatu...@gmail.com >> > > > > > > >> > > > > > > > >>> wrote: >> > > > > > > > >>>> >> > > > > > > > >>>>> @Dong, >> > > > > > > > >>>>> Great example and explanation, thanks! >> > > > > > > > >>>>> >> > > > > > > > >>>>> @All >> > > > > > > > >>>>> Regarding the example given by Dong, it seems even if >> we >> > > use >> > > > a >> > > > > > > queue, >> > > > > > > > >>>> and a >> > > > > > > > >>>>> dedicated controller request handling thread, >> > > > > > > > >>>>> the same result can still happen because R1_a will be >> > sent >> > > on >> > > > > one >> > > > > > > > >>>>> connection, and R1_b & R2 will be sent on a different >> > > > > connection, >> > > > > > > > >>>>> and there is no ordering between different >> connections on >> > > the >> > > > > > > broker >> > > > > > > > >>>> side. >> > > > > > > > >>>>> I was discussing with Mayuresh offline, and it seems >> > > > > correlation >> > > > > > id >> > > > > > > > >>>> within >> > > > > > > > >>>>> the same NetworkClient object is monotonically >> increasing >> > > and >> > > > > > never >> > > > > > > > >>>> reset, >> > > > > > > > >>>>> hence a broker can leverage that to properly reject >> > > obsolete >> > > > > > > > >> requests. >> > > > > > > > >>>>> Thoughts? >> > > > > > > > >>>>> >> > > > > > > > >>>>> Thanks, >> > > > > > > > >>>>> Lucas >> > > > > > > > >>>>> >> > > > > > > > >>>>> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat < >> > > > > > > > >>>>> gharatmayures...@gmail.com> wrote: >> > > > > > > > >>>>> >> > > > > > > > >>>>>> Actually nvm, correlationId is reset in case of >> > connection >> > > > > > loss, I >> > > > > > > > >>>> think. >> > > > > > > > >>>>>> >> > > > > > > > >>>>>> Thanks, >> > > > > > > > >>>>>> >> > > > > > > > >>>>>> Mayuresh >> > > > > > > > >>>>>> >> > > > > > > > >>>>>> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat < >> > > > > > > > >>>>>> gharatmayures...@gmail.com> >> > > > > > > > >>>>>> wrote: >> > > > > > > > >>>>>> >> > > > > > > > >>>>>>> I agree with Dong that out-of-order processing can >> > happen >> > > > > with >> > > > > > > > >>>> having 2 >> > > > > > > > >>>>>>> separate queues as well and it can even happen >> today. >> > > > > > > > >>>>>>> Can we use the correlationId in the request from the >> > > > > controller >> > > > > > > > >> to >> > > > > > > > >>>> the >> > > > > > > > >>>>>>> broker to handle ordering ? >> > > > > > > > >>>>>>> >> > > > > > > > >>>>>>> Thanks, >> > > > > > > > >>>>>>> >> > > > > > > > >>>>>>> Mayuresh >> > > > > > > > >>>>>>> >> > > > > > > > >>>>>>> >> > > > > > > > >>>>>>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin < >> > > > > > becket....@gmail.com >> > > > > > > > >>> >> > > > > > > > >>>>> wrote: >> > > > > > > > >>>>>>> >> > > > > > > > >>>>>>>> Good point, Joel. I agree that a dedicated >> controller >> > > > > request >> > > > > > > > >>>> handling >> > > > > > > > >>>>>>>> thread would be a better isolation. It also solves >> the >> > > > > > > > >> reordering >> > > > > > > > >>>>> issue. >> > > > > > > > >>>>>>>> >> > > > > > > > >>>>>>>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy < >> > > > > > > > >> jjkosh...@gmail.com> >> > > > > > > > >>>>>> wrote: >> > > > > > > > >>>>>>>> >> > > > > > > > >>>>>>>>> Good example. I think this scenario can occur in >> the >> > > > > current >> > > > > > > > >>> code >> > > > > > > > >>>> as >> > > > > > > > >>>>>>>> well >> > > > > > > > >>>>>>>>> but with even lower probability given that there >> are >> > > > other >> > > > > > > > >>>>>>>> non-controller >> > > > > > > > >>>>>>>>> requests interleaved. It is still sketchy though >> and >> > I >> > > > > think >> > > > > > a >> > > > > > > > >>>> safer >> > > > > > > > >>>>>>>>> approach would be separate queues and pinning >> > > controller >> > > > > > > > >> request >> > > > > > > > >>>>>>>> handling >> > > > > > > > >>>>>>>>> to one handler thread. >> > > > > > > > >>>>>>>>> >> > > > > > > > >>>>>>>>> On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin < >> > > > > > > > >> lindon...@gmail.com >> > > > > > > > >>>> >> > > > > > > > >>>>>> wrote: >> > > > > > > > >>>>>>>>> >> > > > > > > > >>>>>>>>>> Hey Becket, >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> I think you are right that there may be >> out-of-order >> > > > > > > > >>> processing. >> > > > > > > > >>>>>>>> However, >> > > > > > > > >>>>>>>>>> it seems that out-of-order processing may also >> > happen >> > > > even >> > > > > > > > >> if >> > > > > > > > >>> we >> > > > > > > > >>>>>> use a >> > > > > > > > >>>>>>>>>> separate queue. >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> Here is the example: >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> - Controller sends R1 and got disconnected before >> > > > > receiving >> > > > > > > > >>>>>> response. >> > > > > > > > >>>>>>>>> Then >> > > > > > > > >>>>>>>>>> it reconnects and sends R2. Both requests now >> stay >> > in >> > > > the >> > > > > > > > >>>>> controller >> > > > > > > > >>>>>>>>>> request queue in the order they are sent. >> > > > > > > > >>>>>>>>>> - thread1 takes R1_a from the request queue and >> then >> > > > > thread2 >> > > > > > > > >>>> takes >> > > > > > > > >>>>>> R2 >> > > > > > > > >>>>>>>>> from >> > > > > > > > >>>>>>>>>> the request queue almost at the same time. >> > > > > > > > >>>>>>>>>> - So R1_a and R2 are processed in parallel. >> There is >> > > > > chance >> > > > > > > > >>> that >> > > > > > > > >>>>>> R2's >> > > > > > > > >>>>>>>>>> processing is completed before R1. >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> If out-of-order processing can happen for both >> > > > approaches >> > > > > > > > >> with >> > > > > > > > >>>>> very >> > > > > > > > >>>>>>>> low >> > > > > > > > >>>>>>>>>> probability, it may not be worthwhile to add the >> > extra >> > > > > > > > >> queue. >> > > > > > > > >>>> What >> > > > > > > > >>>>>> do >> > > > > > > > >>>>>>>> you >> > > > > > > > >>>>>>>>>> think? >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>> Dong >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>> On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin < >> > > > > > > > >>>> becket....@gmail.com >> > > > > > > > >>>>>> >> > > > > > > > >>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>>> Hi Mayuresh/Joel, >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> Using the request channel as a dequeue was >> bright >> > up >> > > > some >> > > > > > > > >>> time >> > > > > > > > >>>>> ago >> > > > > > > > >>>>>>>> when >> > > > > > > > >>>>>>>>>> we >> > > > > > > > >>>>>>>>>>> initially thinking of prioritizing the request. >> The >> > > > > > > > >> concern >> > > > > > > > >>>> was >> > > > > > > > >>>>>> that >> > > > > > > > >>>>>>>>> the >> > > > > > > > >>>>>>>>>>> controller requests are supposed to be >> processed in >> > > > > order. >> > > > > > > > >>> If >> > > > > > > > >>>> we >> > > > > > > > >>>>>> can >> > > > > > > > >>>>>>>>>> ensure >> > > > > > > > >>>>>>>>>>> that there is one controller request in the >> request >> > > > > > > > >> channel, >> > > > > > > > >>>> the >> > > > > > > > >>>>>>>> order >> > > > > > > > >>>>>>>>> is >> > > > > > > > >>>>>>>>>>> not a concern. But in cases that there are more >> > than >> > > > one >> > > > > > > > >>>>>> controller >> > > > > > > > >>>>>>>>>> request >> > > > > > > > >>>>>>>>>>> inserted into the queue, the controller request >> > order >> > > > may >> > > > > > > > >>>> change >> > > > > > > > >>>>>> and >> > > > > > > > >>>>>>>>>> cause >> > > > > > > > >>>>>>>>>>> problem. For example, think about the following >> > > > sequence: >> > > > > > > > >>>>>>>>>>> 1. Controller successfully sent a request R1 to >> > > broker >> > > > > > > > >>>>>>>>>>> 2. Broker receives R1 and put the request to the >> > head >> > > > of >> > > > > > > > >> the >> > > > > > > > >>>>>> request >> > > > > > > > >>>>>>>>>> queue. >> > > > > > > > >>>>>>>>>>> 3. Controller to broker connection failed and >> the >> > > > > > > > >> controller >> > > > > > > > >>>>>>>>> reconnected >> > > > > > > > >>>>>>>>>> to >> > > > > > > > >>>>>>>>>>> the broker. >> > > > > > > > >>>>>>>>>>> 4. Controller sends a request R2 to the broker >> > > > > > > > >>>>>>>>>>> 5. Broker receives R2 and add it to the head of >> the >> > > > > > > > >> request >> > > > > > > > >>>>> queue. >> > > > > > > > >>>>>>>>>>> Now on the broker side, R2 will be processed >> before >> > > R1 >> > > > is >> > > > > > > > >>>>>> processed, >> > > > > > > > >>>>>>>>>> which >> > > > > > > > >>>>>>>>>>> may cause problem. >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> Jiangjie (Becket) Qin >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>> On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy < >> > > > > > > > >>>>> jjkosh...@gmail.com> >> > > > > > > > >>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>> @Mayuresh - I like your idea. It appears to be >> a >> > > > simpler >> > > > > > > > >>>> less >> > > > > > > > >>>>>>>>> invasive >> > > > > > > > >>>>>>>>>>>> alternative and it should work. >> Jun/Becket/others, >> > > do >> > > > > > > > >> you >> > > > > > > > >>>> see >> > > > > > > > >>>>>> any >> > > > > > > > >>>>>>>>>>> pitfalls >> > > > > > > > >>>>>>>>>>>> with this approach? >> > > > > > > > >>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>> On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang < >> > > > > > > > >>>>>>>> lucasatu...@gmail.com> >> > > > > > > > >>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> @Mayuresh, >> > > > > > > > >>>>>>>>>>>>> That's a very interesting idea that I haven't >> > > thought >> > > > > > > > >>>>> before. >> > > > > > > > >>>>>>>>>>>>> It seems to solve our problem at hand pretty >> > well, >> > > > and >> > > > > > > > >>>> also >> > > > > > > > >>>>>>>>>>>>> avoids the need to have a new size metric and >> > > > capacity >> > > > > > > > >>>>> config >> > > > > > > > >>>>>>>>>>>>> for the controller request queue. In fact, if >> we >> > > were >> > > > > > > > >> to >> > > > > > > > >>>>> adopt >> > > > > > > > >>>>>>>>>>>>> this design, there is no public interface >> change, >> > > and >> > > > > > > > >> we >> > > > > > > > >>>>>>>>>>>>> probably don't need a KIP. >> > > > > > > > >>>>>>>>>>>>> Also implementation wise, it seems >> > > > > > > > >>>>>>>>>>>>> the java class LinkedBlockingQueue can readily >> > > > satisfy >> > > > > > > > >>> the >> > > > > > > > >>>>>>>>>> requirement >> > > > > > > > >>>>>>>>>>>>> by supporting a capacity, and also allowing >> > > inserting >> > > > > > > > >> at >> > > > > > > > >>>>> both >> > > > > > > > >>>>>>>> ends. >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> My only concern is that this design is tied to >> > the >> > > > > > > > >>>>> coincidence >> > > > > > > > >>>>>>>> that >> > > > > > > > >>>>>>>>>>>>> we have two request priorities and there are >> two >> > > ends >> > > > > > > > >>> to a >> > > > > > > > >>>>>>>> deque. >> > > > > > > > >>>>>>>>>>>>> Hence by using the proposed design, it seems >> the >> > > > > > > > >> network >> > > > > > > > >>>>> layer >> > > > > > > > >>>>>>>> is >> > > > > > > > >>>>>>>>>>>>> more tightly coupled with upper layer logic, >> e.g. >> > > if >> > > > > > > > >> we >> > > > > > > > >>>> were >> > > > > > > > >>>>>> to >> > > > > > > > >>>>>>>> add >> > > > > > > > >>>>>>>>>>>>> an extra priority level in the future for some >> > > > reason, >> > > > > > > > >>> we >> > > > > > > > >>>>>> would >> > > > > > > > >>>>>>>>>>> probably >> > > > > > > > >>>>>>>>>>>>> need to go back to the design of separate >> queues, >> > > one >> > > > > > > > >>> for >> > > > > > > > >>>>> each >> > > > > > > > >>>>>>>>>> priority >> > > > > > > > >>>>>>>>>>>>> level. >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> In summary, I'm ok with both designs and lean >> > > toward >> > > > > > > > >>> your >> > > > > > > > >>>>>>>> suggested >> > > > > > > > >>>>>>>>>>>>> approach. >> > > > > > > > >>>>>>>>>>>>> Let's hear what others think. >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> @Becket, >> > > > > > > > >>>>>>>>>>>>> In light of Mayuresh's suggested new design, >> I'm >> > > > > > > > >>> answering >> > > > > > > > >>>>>> your >> > > > > > > > >>>>>>>>>>> question >> > > > > > > > >>>>>>>>>>>>> only in the context >> > > > > > > > >>>>>>>>>>>>> of the current KIP design: I think your >> > suggestion >> > > > > > > > >> makes >> > > > > > > > >>>>>> sense, >> > > > > > > > >>>>>>>> and >> > > > > > > > >>>>>>>>>> I'm >> > > > > > > > >>>>>>>>>>>> ok >> > > > > > > > >>>>>>>>>>>>> with removing the capacity config and >> > > > > > > > >>>>>>>>>>>>> just relying on the default value of 20 being >> > > > > > > > >> sufficient >> > > > > > > > >>>>>> enough. >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>> Lucas >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh >> Gharat >> > < >> > > > > > > > >>>>>>>>>>>>> gharatmayures...@gmail.com >> > > > > > > > >>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> Hi Lucas, >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> Seems like the main intent here is to >> prioritize >> > > the >> > > > > > > > >>>>>>>> controller >> > > > > > > > >>>>>>>>>>> request >> > > > > > > > >>>>>>>>>>>>>> over any other requests. >> > > > > > > > >>>>>>>>>>>>>> In that case, we can change the request queue >> > to a >> > > > > > > > >>>>> dequeue, >> > > > > > > > >>>>>>>> where >> > > > > > > > >>>>>>>>>> you >> > > > > > > > >>>>>>>>>>>>>> always insert the normal requests (produce, >> > > > > > > > >>>> consume,..etc) >> > > > > > > > >>>>>> to >> > > > > > > > >>>>>>>> the >> > > > > > > > >>>>>>>>>> end >> > > > > > > > >>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>> the dequeue, but if its a controller request, >> > you >> > > > > > > > >>> insert >> > > > > > > > >>>>> it >> > > > > > > > >>>>>> to >> > > > > > > > >>>>>>>>> the >> > > > > > > > >>>>>>>>>>> head >> > > > > > > > >>>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>> the queue. This ensures that the controller >> > > request >> > > > > > > > >>> will >> > > > > > > > >>>>> be >> > > > > > > > >>>>>>>> given >> > > > > > > > >>>>>>>>>>>> higher >> > > > > > > > >>>>>>>>>>>>>> priority over other requests. >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> Also since we only read one request from the >> > > socket >> > > > > > > > >>> and >> > > > > > > > >>>>> mute >> > > > > > > > >>>>>>>> it >> > > > > > > > >>>>>>>>> and >> > > > > > > > >>>>>>>>>>>> only >> > > > > > > > >>>>>>>>>>>>>> unmute it after handling the request, this >> would >> > > > > > > > >>> ensure >> > > > > > > > >>>>> that >> > > > > > > > >>>>>>>> we >> > > > > > > > >>>>>>>>>> don't >> > > > > > > > >>>>>>>>>>>>>> handle controller requests out of order. >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> With this approach we can avoid the second >> queue >> > > and >> > > > > > > > >>> the >> > > > > > > > >>>>>>>>> additional >> > > > > > > > >>>>>>>>>>>>> config >> > > > > > > > >>>>>>>>>>>>>> for the size of the queue. >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> What do you think ? >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> Mayuresh >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 3:05 AM Becket Qin < >> > > > > > > > >>>>>>>> becket....@gmail.com >> > > > > > > > >>>>>>>>>> >> > > > > > > > >>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> Hey Joel, >> > > > > > > > >>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> Thank for the detail explanation. I agree >> the >> > > > > > > > >>> current >> > > > > > > > >>>>>> design >> > > > > > > > >>>>>>>>>> makes >> > > > > > > > >>>>>>>>>>>>> sense. >> > > > > > > > >>>>>>>>>>>>>>> My confusion is about whether the new config >> > for >> > > > > > > > >> the >> > > > > > > > >>>>>>>> controller >> > > > > > > > >>>>>>>>>>> queue >> > > > > > > > >>>>>>>>>>>>>>> capacity is necessary. I cannot think of a >> case >> > > in >> > > > > > > > >>>> which >> > > > > > > > >>>>>>>> users >> > > > > > > > >>>>>>>>>>> would >> > > > > > > > >>>>>>>>>>>>>> change >> > > > > > > > >>>>>>>>>>>>>>> it. >> > > > > > > > >>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin >> > > > > > > > >>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin >> < >> > > > > > > > >>>>>>>>>> becket....@gmail.com> >> > > > > > > > >>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>> Hi Lucas, >> > > > > > > > >>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>> I guess my question can be rephrased to >> "do we >> > > > > > > > >>>> expect >> > > > > > > > >>>>>>>> user to >> > > > > > > > >>>>>>>>>>> ever >> > > > > > > > >>>>>>>>>>>>>> change >> > > > > > > > >>>>>>>>>>>>>>>> the controller request queue capacity"? If >> we >> > > > > > > > >>> agree >> > > > > > > > >>>>> that >> > > > > > > > >>>>>>>> 20 >> > > > > > > > >>>>>>>>> is >> > > > > > > > >>>>>>>>>>>>> already >> > > > > > > > >>>>>>>>>>>>>> a >> > > > > > > > >>>>>>>>>>>>>>>> very generous default number and we do not >> > > > > > > > >> expect >> > > > > > > > >>>> user >> > > > > > > > >>>>>> to >> > > > > > > > >>>>>>>>>> change >> > > > > > > > >>>>>>>>>>>> it, >> > > > > > > > >>>>>>>>>>>>> is >> > > > > > > > >>>>>>>>>>>>>>> it >> > > > > > > > >>>>>>>>>>>>>>>> still necessary to expose this as a config? >> > > > > > > > >>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin >> > > > > > > > >>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 2:29 AM, Lucas >> Wang < >> > > > > > > > >>>>>>>>>>> lucasatu...@gmail.com >> > > > > > > > >>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>> @Becket >> > > > > > > > >>>>>>>>>>>>>>>>> 1. Thanks for the comment. You are right >> that >> > > > > > > > >>>>> normally >> > > > > > > > >>>>>>>> there >> > > > > > > > >>>>>>>>>>>> should >> > > > > > > > >>>>>>>>>>>>> be >> > > > > > > > >>>>>>>>>>>>>>>>> just >> > > > > > > > >>>>>>>>>>>>>>>>> one controller request because of muting, >> > > > > > > > >>>>>>>>>>>>>>>>> and I had NOT intended to say there would >> be >> > > > > > > > >> many >> > > > > > > > >>>>>>>> enqueued >> > > > > > > > >>>>>>>>>>>>> controller >> > > > > > > > >>>>>>>>>>>>>>>>> requests. >> > > > > > > > >>>>>>>>>>>>>>>>> I went through the KIP again, and I'm not >> > sure >> > > > > > > > >>>> which >> > > > > > > > >>>>>> part >> > > > > > > > >>>>>>>>>>> conveys >> > > > > > > > >>>>>>>>>>>>> that >> > > > > > > > >>>>>>>>>>>>>>>>> info. >> > > > > > > > >>>>>>>>>>>>>>>>> I'd be happy to revise if you point it out >> > the >> > > > > > > > >>>>> section. >> > > > > > > > >>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>> 2. Though it should not happen in normal >> > > > > > > > >>>> conditions, >> > > > > > > > >>>>>> the >> > > > > > > > >>>>>>>>>> current >> > > > > > > > >>>>>>>>>>>>>> design >> > > > > > > > >>>>>>>>>>>>>>>>> does not preclude multiple controllers >> > running >> > > > > > > > >>>>>>>>>>>>>>>>> at the same time, hence if we don't have >> the >> > > > > > > > >>>>> controller >> > > > > > > > >>>>>>>>> queue >> > > > > > > > >>>>>>>>>>>>> capacity >> > > > > > > > >>>>>>>>>>>>>>>>> config and simply make its capacity to be >> 1, >> > > > > > > > >>>>>>>>>>>>>>>>> network threads handling requests from >> > > > > > > > >> different >> > > > > > > > >>>>>>>> controllers >> > > > > > > > >>>>>>>>>>> will >> > > > > > > > >>>>>>>>>>>> be >> > > > > > > > >>>>>>>>>>>>>>>>> blocked during those troublesome times, >> > > > > > > > >>>>>>>>>>>>>>>>> which is probably not what we want. On the >> > > > > > > > >> other >> > > > > > > > >>>>> hand, >> > > > > > > > >>>>>>>>> adding >> > > > > > > > >>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>> extra >> > > > > > > > >>>>>>>>>>>>>>>>> config with a default value, say 20, >> guards >> > us >> > > > > > > > >>> from >> > > > > > > > >>>>>>>> issues >> > > > > > > > >>>>>>>>> in >> > > > > > > > >>>>>>>>>>>> those >> > > > > > > > >>>>>>>>>>>>>>>>> troublesome times, and IMO there isn't >> much >> > > > > > > > >>>> downside >> > > > > > > > >>>>> of >> > > > > > > > >>>>>>>>> adding >> > > > > > > > >>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>> extra >> > > > > > > > >>>>>>>>>>>>>>>>> config. >> > > > > > > > >>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>> @Mayuresh >> > > > > > > > >>>>>>>>>>>>>>>>> Good catch, this sentence is an obsolete >> > > > > > > > >>> statement >> > > > > > > > >>>>>> based >> > > > > > > > >>>>>>>> on >> > > > > > > > >>>>>>>>> a >> > > > > > > > >>>>>>>>>>>>> previous >> > > > > > > > >>>>>>>>>>>>>>>>> design. I've revised the wording in the >> KIP. >> > > > > > > > >>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>>>> Lucas >> > > > > > > > >>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh >> > > > > > > > >>> Gharat < >> > > > > > > > >>>>>>>>>>>>>>>>> gharatmayures...@gmail.com> wrote: >> > > > > > > > >>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> Hi Lucas, >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP. >> > > > > > > > >>>>>>>>>>>>>>>>>> I am trying to understand why you think >> "The >> > > > > > > > >>>> memory >> > > > > > > > >>>>>>>>>>> consumption >> > > > > > > > >>>>>>>>>>>>> can >> > > > > > > > >>>>>>>>>>>>>>> rise >> > > > > > > > >>>>>>>>>>>>>>>>>> given the total number of queued requests >> > can >> > > > > > > > >>> go >> > > > > > > > >>>> up >> > > > > > > > >>>>>> to >> > > > > > > > >>>>>>>> 2x" >> > > > > > > > >>>>>>>>>> in >> > > > > > > > >>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>> impact >> > > > > > > > >>>>>>>>>>>>>>>>>> section. Normally the requests from >> > > > > > > > >> controller >> > > > > > > > >>>> to a >> > > > > > > > >>>>>>>> Broker >> > > > > > > > >>>>>>>>>> are >> > > > > > > > >>>>>>>>>>>> not >> > > > > > > > >>>>>>>>>>>>>>> high >> > > > > > > > >>>>>>>>>>>>>>>>>> volume, right ? >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> Mayuresh >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 5:06 AM Becket >> Qin < >> > > > > > > > >>>>>>>>>>>> becket....@gmail.com> >> > > > > > > > >>>>>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks for the KIP, Lucas. Separating >> the >> > > > > > > > >>>> control >> > > > > > > > >>>>>>>> plane >> > > > > > > > >>>>>>>>>> from >> > > > > > > > >>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>> data >> > > > > > > > >>>>>>>>>>>>>>>>>> plane >> > > > > > > > >>>>>>>>>>>>>>>>>>> makes a lot of sense. >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> In the KIP you mentioned that the >> > > > > > > > >> controller >> > > > > > > > >>>>>> request >> > > > > > > > >>>>>>>>> queue >> > > > > > > > >>>>>>>>>>> may >> > > > > > > > >>>>>>>>>>>>>> have >> > > > > > > > >>>>>>>>>>>>>>>>> many >> > > > > > > > >>>>>>>>>>>>>>>>>>> requests in it. Will this be a common >> case? >> > > > > > > > >>> The >> > > > > > > > >>>>>>>>> controller >> > > > > > > > >>>>>>>>>>>>>> requests >> > > > > > > > >>>>>>>>>>>>>>>>> still >> > > > > > > > >>>>>>>>>>>>>>>>>>> goes through the SocketServer. The >> > > > > > > > >>> SocketServer >> > > > > > > > >>>>>> will >> > > > > > > > >>>>>>>>> mute >> > > > > > > > >>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>> channel >> > > > > > > > >>>>>>>>>>>>>>>>>> once >> > > > > > > > >>>>>>>>>>>>>>>>>>> a request is read and put into the >> request >> > > > > > > > >>>>> channel. >> > > > > > > > >>>>>>>> So >> > > > > > > > >>>>>>>>>>>> assuming >> > > > > > > > >>>>>>>>>>>>>>> there >> > > > > > > > >>>>>>>>>>>>>>>>> is >> > > > > > > > >>>>>>>>>>>>>>>>>>> only one connection between controller >> and >> > > > > > > > >>> each >> > > > > > > > >>>>>>>> broker, >> > > > > > > > >>>>>>>>> on >> > > > > > > > >>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>> broker >> > > > > > > > >>>>>>>>>>>>>>>>>> side, >> > > > > > > > >>>>>>>>>>>>>>>>>>> there should be only one controller >> request >> > > > > > > > >>> in >> > > > > > > > >>>>> the >> > > > > > > > >>>>>>>>>>> controller >> > > > > > > > >>>>>>>>>>>>>>> request >> > > > > > > > >>>>>>>>>>>>>>>>>> queue >> > > > > > > > >>>>>>>>>>>>>>>>>>> at any given time. If that is the case, >> do >> > > > > > > > >> we >> > > > > > > > >>>>> need >> > > > > > > > >>>>>> a >> > > > > > > > >>>>>>>>>>> separate >> > > > > > > > >>>>>>>>>>>>>>>>> controller >> > > > > > > > >>>>>>>>>>>>>>>>>>> request queue capacity config? The >> default >> > > > > > > > >>>> value >> > > > > > > > >>>>> 20 >> > > > > > > > >>>>>>>>> means >> > > > > > > > >>>>>>>>>>> that >> > > > > > > > >>>>>>>>>>>>> we >> > > > > > > > >>>>>>>>>>>>>>>>> expect >> > > > > > > > >>>>>>>>>>>>>>>>>>> there are 20 controller switches to >> happen >> > > > > > > > >>> in a >> > > > > > > > >>>>>> short >> > > > > > > > >>>>>>>>>> period >> > > > > > > > >>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>> time. >> > > > > > > > >>>>>>>>>>>>>>>>> I >> > > > > > > > >>>>>>>>>>>>>>>>>> am >> > > > > > > > >>>>>>>>>>>>>>>>>>> not sure whether someone should increase >> > > > > > > > >> the >> > > > > > > > >>>>>>>> controller >> > > > > > > > >>>>>>>>>>>> request >> > > > > > > > >>>>>>>>>>>>>>> queue >> > > > > > > > >>>>>>>>>>>>>>>>>>> capacity to handle such case, as it >> seems >> > > > > > > > >>>>>> indicating >> > > > > > > > >>>>>>>>>>> something >> > > > > > > > >>>>>>>>>>>>>> very >> > > > > > > > >>>>>>>>>>>>>>>>> wrong >> > > > > > > > >>>>>>>>>>>>>>>>>>> has happened. >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>> On Fri, Jul 13, 2018 at 1:10 PM, Dong >> Lin < >> > > > > > > > >>>>>>>>>>>> lindon...@gmail.com> >> > > > > > > > >>>>>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>> Thanks for the update Lucas. >> > > > > > > > >>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>> I think the motivation section is >> > > > > > > > >>> intuitive. >> > > > > > > > >>>> It >> > > > > > > > >>>>>>>> will >> > > > > > > > >>>>>>>>> be >> > > > > > > > >>>>>>>>>>> good >> > > > > > > > >>>>>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>>>>>> learn >> > > > > > > > >>>>>>>>>>>>>>>>>>> more >> > > > > > > > >>>>>>>>>>>>>>>>>>>> about the comments from other >> reviewers. >> > > > > > > > >>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>> On Thu, Jul 12, 2018 at 9:48 PM, Lucas >> > > > > > > > >>> Wang < >> > > > > > > > >>>>>>>>>>>>>>> lucasatu...@gmail.com> >> > > > > > > > >>>>>>>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Hi Dong, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> I've updated the motivation section of >> > > > > > > > >>> the >> > > > > > > > >>>>> KIP >> > > > > > > > >>>>>> by >> > > > > > > > >>>>>>>>>>>> explaining >> > > > > > > > >>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>> cases >> > > > > > > > >>>>>>>>>>>>>>>>>>>> that >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> would have user impacts. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Please take a look at let me know your >> > > > > > > > >>>>>> comments. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Thanks, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Lucas >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 9, 2018 at 5:53 PM, Lucas >> > > > > > > > >>> Wang >> > > > > > > > >>>> < >> > > > > > > > >>>>>>>>>>>>>>> lucasatu...@gmail.com >> > > > > > > > >>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>> wrote: >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Dong, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> The simulation of disk being slow is >> > > > > > > > >>>> merely >> > > > > > > > >>>>>>>> for me >> > > > > > > > >>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>> easily >> > > > > > > > >>>>>>>>>>>>>>>>>>> construct >> > > > > > > > >>>>>>>>>>>>>>>>>>>> a >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> testing scenario >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> with a backlog of produce requests. >> > > > > > > > >> In >> > > > > > > > >>>>>>>> production, >> > > > > > > > >>>>>>>>>>> other >> > > > > > > > >>>>>>>>>>>>>> than >> > > > > > > > >>>>>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> disk >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> being slow, a backlog of >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests may also be caused >> > > > > > > > >> by >> > > > > > > > >>>> high >> > > > > > > > >>>>>>>>> produce >> > > > > > > > >>>>>>>>>>> QPS. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> In that case, we may not want to kill >> > > > > > > > >>> the >> > > > > > > > >>>>>>>> broker >> > > > > > > > >>>>>>>>> and >> > > > > > > > >>>>>>>>>>>>> that's >> > > > > > > > >>>>>>>>>>>>>>> when >> > > > > > > > >>>>>>>>>>>>>>>>>> this >> > > > > > > > >>>>>>>>>>>>>>>>>>>> KIP >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> can be useful, both for JBOD >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and non-JBOD setup. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Going back to your previous question >> > > > > > > > >>>> about >> > > > > > > > >>>>>> each >> > > > > > > > >>>>>>>>>>>>>> ProduceRequest >> > > > > > > > >>>>>>>>>>>>>>>>>>> covering >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> 20 >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions that are randomly >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> distributed, let's say a LeaderAndIsr >> > > > > > > > >>>>> request >> > > > > > > > >>>>>>>> is >> > > > > > > > >>>>>>>>>>>> enqueued >> > > > > > > > >>>>>>>>>>>>>> that >> > > > > > > > >>>>>>>>>>>>>>>>>> tries >> > > > > > > > >>>>>>>>>>>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> switch the current broker, say >> > > > > > > > >> broker0, >> > > > > > > > >>>>> from >> > > > > > > > >>>>>>>>> leader >> > > > > > > > >>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>>>> follower >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> *for one of the partitions*, say >> > > > > > > > >>>> *test-0*. >> > > > > > > > >>>>>> For >> > > > > > > > >>>>>>>> the >> > > > > > > > >>>>>>>>>>> sake >> > > > > > > > >>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>>>>> argument, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's also assume the other brokers, >> > > > > > > > >>> say >> > > > > > > > >>>>>>>> broker1, >> > > > > > > > >>>>>>>>>> have >> > > > > > > > >>>>>>>>>>>>>>> *stopped* >> > > > > > > > >>>>>>>>>>>>>>>>>>>> fetching >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> from >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the current broker, i.e. broker0. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. If the enqueued produce requests >> > > > > > > > >>> have >> > > > > > > > >>>>>> acks = >> > > > > > > > >>>>>>>>> -1 >> > > > > > > > >>>>>>>>>>>> (ALL) >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1.1 without this KIP, the >> > > > > > > > >>>> ProduceRequests >> > > > > > > > >>>>>>>> ahead >> > > > > > > > >>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>>>> LeaderAndISR >> > > > > > > > >>>>>>>>>>>>>>>>>>> will >> > > > > > > > >>>>>>>>>>>>>>>>>>>> be >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> put into the purgatory, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and since they'll never be >> > > > > > > > >>>>> replicated >> > > > > > > > >>>>>>>> to >> > > > > > > > >>>>>>>>>> other >> > > > > > > > >>>>>>>>>>>>>> brokers >> > > > > > > > >>>>>>>>>>>>>>>>>>> (because >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the assumption made above), they will >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> be completed either when the >> > > > > > > > >>>>>>>> LeaderAndISR >> > > > > > > > >>>>>>>>>>>> request >> > > > > > > > >>>>>>>>>>>>> is >> > > > > > > > >>>>>>>>>>>>>>>>>>> processed >> > > > > > > > >>>>>>>>>>>>>>>>>>>> or >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> when the timeout happens. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1.2 With this KIP, broker0 will >> > > > > > > > >>>>> immediately >> > > > > > > > >>>>>>>>>>> transition >> > > > > > > > >>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> partition >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> test-0 to become a follower, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> after the current broker sees >> > > > > > > > >>> the >> > > > > > > > >>>>>>>>>> replication >> > > > > > > > >>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> remaining >> > > > > > > > >>>>>>>>>>>>>>>>>>>> 19 >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions, it can send a response >> > > > > > > > >>>>> indicating >> > > > > > > > >>>>>>>> that >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> it's no longer the leader for >> > > > > > > > >>> the >> > > > > > > > >>>>>>>>> "test-0". >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> To see the latency difference >> > > > > > > > >> between >> > > > > > > > >>>> 1.1 >> > > > > > > > >>>>>> and >> > > > > > > > >>>>>>>>> 1.2, >> > > > > > > > >>>>>>>>>>>> let's >> > > > > > > > >>>>>>>>>>>>>> say >> > > > > > > > >>>>>>>>>>>>>>>>>> there >> > > > > > > > >>>>>>>>>>>>>>>>>>>> are >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 24K produce requests ahead of the >> > > > > > > > >>>>>> LeaderAndISR, >> > > > > > > > >>>>>>>>> and >> > > > > > > > >>>>>>>>>>>> there >> > > > > > > > >>>>>>>>>>>>>> are >> > > > > > > > >>>>>>>>>>>>>>> 8 >> > > > > > > > >>>>>>>>>>>>>>>>> io >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> threads, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> so each io thread will process >> > > > > > > > >>>>>> approximately >> > > > > > > > >>>>>>>>> 3000 >> > > > > > > > >>>>>>>>>>>>> produce >> > > > > > > > >>>>>>>>>>>>>>>>>> requests. >> > > > > > > > >>>>>>>>>>>>>>>>>>>> Now >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's investigate the io thread that >> > > > > > > > >>>>> finally >> > > > > > > > >>>>>>>>>> processed >> > > > > > > > >>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> For the 3000 produce requests, if >> > > > > > > > >> we >> > > > > > > > >>>>> model >> > > > > > > > >>>>>>>> the >> > > > > > > > >>>>>>>>>> time >> > > > > > > > >>>>>>>>>>>> when >> > > > > > > > >>>>>>>>>>>>>>> their >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> remaining >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 19 partitions catch up as t0, t1, >> > > > > > > > >>>> ...t2999, >> > > > > > > > >>>>>> and >> > > > > > > > >>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>> LeaderAndISR >> > > > > > > > >>>>>>>>>>>>>>>>>>>> request >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> is >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> processed at time t3000. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Without this KIP, the 1st produce >> > > > > > > > >>>> request >> > > > > > > > >>>>>>>> would >> > > > > > > > >>>>>>>>>> have >> > > > > > > > >>>>>>>>>>>>>> waited >> > > > > > > > >>>>>>>>>>>>>>> an >> > > > > > > > >>>>>>>>>>>>>>>>>>> extra >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> t3000 - t0 time in the purgatory, the >> > > > > > > > >>> 2nd >> > > > > > > > >>>>> an >> > > > > > > > >>>>>>>> extra >> > > > > > > > >>>>>>>>>>> time >> > > > > > > > >>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>>>> t3000 - >> > > > > > > > >>>>>>>>>>>>>>>>>>> t1, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> etc. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Roughly speaking, the latency >> > > > > > > > >>>> difference >> > > > > > > > >>>>> is >> > > > > > > > >>>>>>>>> bigger >> > > > > > > > >>>>>>>>>>> for >> > > > > > > > >>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>> earlier >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests than for the later >> > > > > > > > >>> ones. >> > > > > > > > >>>>> For >> > > > > > > > >>>>>>>> the >> > > > > > > > >>>>>>>>>> same >> > > > > > > > >>>>>>>>>>>>>> reason, >> > > > > > > > >>>>>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> more >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests queued >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> before the LeaderAndISR, the bigger >> > > > > > > > >>>>> benefit >> > > > > > > > >>>>>>>> we >> > > > > > > > >>>>>>>>> get >> > > > > > > > >>>>>>>>>>>>> (capped >> > > > > > > > >>>>>>>>>>>>>>> by >> > > > > > > > >>>>>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce timeout). >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. If the enqueued produce requests >> > > > > > > > >>> have >> > > > > > > > >>>>>>>> acks=0 or >> > > > > > > > >>>>>>>>>>>> acks=1 >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> There will be no latency >> > > > > > > > >> differences >> > > > > > > > >>> in >> > > > > > > > >>>>>> this >> > > > > > > > >>>>>>>>> case, >> > > > > > > > >>>>>>>>>>> but >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2.1 without this KIP, the records >> > > > > > > > >> of >> > > > > > > > >>>>>>>> partition >> > > > > > > > >>>>>>>>>>> test-0 >> > > > > > > > >>>>>>>>>>>> in >> > > > > > > > >>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests ahead of the >> > > > > > > > >>> LeaderAndISR >> > > > > > > > >>>>>> will >> > > > > > > > >>>>>>>> be >> > > > > > > > >>>>>>>>>>>> appended >> > > > > > > > >>>>>>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> local >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> log, >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and eventually be truncated >> > > > > > > > >>> after >> > > > > > > > >>>>>>>>> processing >> > > > > > > > >>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>> LeaderAndISR. >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> This is what's referred to as >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> "some unofficial definition >> > > > > > > > >> of >> > > > > > > > >>>> data >> > > > > > > > >>>>>>>> loss >> > > > > > > > >>>>>>>>> in >> > > > > > > > >>>>>>>>>>>> terms >> > > > > > > > >>>>>>>>>>>>> of >> > > > > > > > >>>>>>>>>>>>>>>>>> messages >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> beyond the high watermark". >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2.2 with this KIP, we can mitigate >> > > > > > > > >>> the >> > > > > > > > >>>>>> effect >> > > > > > > > >>>>>>>>>> since >> > > > > > > > >>>>>>>>>>> if >> > > > > > > > >>>>>>>>>>>>> the >> > > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> is immediately processed, the >> > > > > > > > >> response >> > > > > > > > >>> to >> > > > > > > > >>>>>>>>> producers >> > > > > > > > >>>>>>>>>>> will >> > > > > > > > >>>>>>>>>>>>>> have >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the NotLeaderForPartition >> > > > > > > > >>> error, >> > > > > > > > >>>>>>>> causing >> > > > > > > > >>>>>>>>>>>> producers >> > > > > > > > >>>>>>>>>>>>>> to >> > > > > > > > >>>>>>>>>>>>>>>>> retry >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> This explanation above is the benefit >> > > > > > > > >>> for >> > > > > > > > >>>>>>>> reducing >> > >> >> >> -- >> -Regards, >> Mayuresh R. Gharat >> (862) 250-7125 >> > >