I agree with Dong that out-of-order processing can happen with 2 separate queues as well, and it can even happen today. Can we use the correlationId in the request from the controller to the broker to handle ordering?
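One way the correlationId check being asked about might look is sketched below. This is a hypothetical illustration only, with invented class and method names, not Kafka code:

```java
// Hypothetical sketch of the correlationId idea in the question above: the
// broker remembers the highest correlationId it has processed from the
// controller and rejects anything smaller. Illustration only; the names
// here are invented for the example.
public class ControllerOrderingCheck {
    private int lastProcessedCorrelationId = -1;

    // Returns true if the request is newer than anything processed so far
    // and may therefore be handled; false means it is stale or reordered.
    public synchronized boolean tryAccept(int correlationId) {
        if (correlationId <= lastProcessedCorrelationId) {
            return false;
        }
        lastProcessedCorrelationId = correlationId;
        return true;
    }
}
```

One caveat: in the Kafka protocol, correlationIds are assigned per connection, so a controller reconnect would reset them; brokers instead fence requests from stale controllers with the controller epoch field, which survives reconnects.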
Thanks,
Mayuresh

On Thu, Jul 19, 2018 at 6:41 AM Becket Qin <becket....@gmail.com> wrote:

Good point, Joel. I agree that a dedicated controller request handling thread would be a better isolation. It also solves the reordering issue.

On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

Good example. I think this scenario can occur in the current code as well, but with even lower probability given that there are other non-controller requests interleaved. It is still sketchy though, and I think a safer approach would be separate queues and pinning controller request handling to one handler thread.

On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Becket,

I think you are right that there may be out-of-order processing. However, it seems that out-of-order processing may also happen even if we use a separate queue.

Here is the example:

- Controller sends R1 and gets disconnected before receiving the response. Then it reconnects and sends R2. Both requests now stay in the controller request queue in the order they were sent.
- thread1 takes R1 from the request queue, and then thread2 takes R2 from the request queue almost at the same time.
- So R1 and R2 are processed in parallel. There is a chance that R2's processing is completed before R1's.

If out-of-order processing can happen for both approaches with very low probability, it may not be worthwhile to add the extra queue. What do you think?

Thanks,
Dong

On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin <becket....@gmail.com> wrote:

Hi Mayuresh/Joel,

Using the request channel as a deque was brought up some time ago when we were initially thinking about prioritizing the requests. The concern was that the controller requests are supposed to be processed in order. If we can ensure that there is only one controller request in the request channel, the order is not a concern. But in cases where more than one controller request is inserted into the queue, the controller request order may change and cause problems. For example, think about the following sequence:

1. Controller successfully sent a request R1 to the broker.
2. Broker receives R1 and puts the request at the head of the request queue.
3. The controller-to-broker connection failed and the controller reconnected to the broker.
4. Controller sends a request R2 to the broker.
5. Broker receives R2 and adds it to the head of the request queue.

Now on the broker side, R2 will be processed before R1, which may cause problems.

Thanks,

Jiangjie (Becket) Qin

On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

@Mayuresh - I like your idea. It appears to be a simpler, less invasive alternative, and it should work. Jun/Becket/others, do you see any pitfalls with this approach?

On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

@Mayuresh,
That's a very interesting idea that I hadn't thought of before. It seems to solve our problem at hand pretty well, and also avoids the need to have a new size metric and capacity config for the controller request queue. In fact, if we were to adopt this design, there is no public interface change, and we probably don't need a KIP.
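Becket's five-step sequence above can be reproduced with a toy model of the insert-at-head idea (illustrative only; the class and method names are invented, not Kafka code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the reordering problem: if the broker puts every controller
// request at the head of the request queue, a retried R1 followed by R2
// after a reconnect gets processed as R2 then R1.
public class HeadInsertReorderDemo {
    public static String processingOrder() {
        Deque<String> requestQueue = new ArrayDeque<>();
        requestQueue.addFirst("R1"); // step 2: broker puts R1 at the head
        requestQueue.addFirst("R2"); // step 5: after reconnect, R2 also goes to the head
        StringBuilder order = new StringBuilder();
        while (!requestQueue.isEmpty()) {
            if (order.length() > 0) order.append(' ');
            order.append(requestQueue.removeFirst()); // handlers drain from the head
        }
        return order.toString(); // "R2 R1": out of order
    }
}
```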
Also, implementation-wise, it seems the Java class LinkedBlockingDeque can readily satisfy the requirement by supporting a capacity and also allowing insertion at both ends.

My only concern is that this design is tied to the coincidence that we have two request priorities and there are two ends to a deque. Hence, by using the proposed design, the network layer becomes more tightly coupled with upper-layer logic; e.g., if we were to add an extra priority level in the future for some reason, we would probably need to go back to the design of separate queues, one for each priority level.

In summary, I'm OK with both designs and lean toward your suggested approach. Let's hear what others think.

@Becket,
In light of Mayuresh's suggested new design, I'm answering your question only in the context of the current KIP design: I think your suggestion makes sense, and I'm OK with removing the capacity config and just relying on the default value of 20 being sufficient.

Thanks,
Lucas

On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:

Hi Lucas,

Seems like the main intent here is to prioritize the controller request over any other requests. In that case, we can change the request queue to a deque, where you always insert the normal requests (produce, consume, etc.) at the end of the deque, but if it's a controller request, you insert it at the head of the queue. This ensures that the controller request will be given higher priority than other requests.

Also, since we only read one request from the socket and mute it, and only unmute it after handling the request, this would ensure that we don't handle controller requests out of order.

With this approach we can avoid the second queue and the additional config for the size of the queue.

What do you think?

Thanks,

Mayuresh

On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <becket....@gmail.com> wrote:

Hey Joel,

Thanks for the detailed explanation. I agree the current design makes sense. My confusion is about whether the new config for the controller queue capacity is necessary. I cannot think of a case in which users would change it.

Thanks,

Jiangjie (Becket) Qin

On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin <becket....@gmail.com> wrote:

Hi Lucas,

I guess my question can be rephrased to "do we expect users to ever change the controller request queue capacity"?
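For reference, the deque approach Mayuresh proposes above could be sketched with java.util.concurrent.LinkedBlockingDeque, which supports both a capacity bound and insertion at either end. This is a hypothetical model with invented names, not the actual Kafka request channel:

```java
import java.util.concurrent.LinkedBlockingDeque;

// Sketch of the proposal: one bounded deque; controller requests are
// inserted at the head, normal requests at the tail, and handler threads
// always take from the head, so controller requests are served first.
public class PrioritizedRequestQueue {
    public record Request(boolean fromController, String name) {}

    private final LinkedBlockingDeque<Request> deque;

    public PrioritizedRequestQueue(int capacity) {
        deque = new LinkedBlockingDeque<>(capacity);
    }

    public boolean enqueue(Request r) {
        // Controller requests jump the line; everything else stays FIFO.
        return r.fromController() ? deque.offerFirst(r) : deque.offerLast(r);
    }

    public Request next() {
        return deque.pollFirst(); // handler threads drain from the head
    }
}
```

Because the channel is muted until the in-flight controller request is handled, at most one controller request should ever sit at the head, which is what keeps controller requests in order under this scheme.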
If we agree that 20 is already a very generous default number and we do not expect users to change it, is it still necessary to expose this as a config?

Thanks,

Jiangjie (Becket) Qin

On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang <lucasatu...@gmail.com> wrote:

@Becket
1. Thanks for the comment. You are right that normally there should be just one controller request because of muting, and I had NOT intended to say there would be many enqueued controller requests. I went through the KIP again, and I'm not sure which part conveys that info. I'd be happy to revise if you point out the section.

2. Though it should not happen in normal conditions, the current design does not preclude multiple controllers running at the same time. Hence, if we don't have the controller queue capacity config and simply make its capacity 1, network threads handling requests from different controllers will be blocked during those troublesome times, which is probably not what we want. On the other hand, adding the extra config with a default value, say 20, guards us from issues in those troublesome times, and IMO there isn't much downside to adding the extra config.

@Mayuresh
Good catch, this sentence is an obsolete statement based on a previous design. I've revised the wording in the KIP.

Thanks,
Lucas

On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:

Hi Lucas,

Thanks for the KIP. I am trying to understand why you think "The memory consumption can rise given the total number of queued requests can go up to 2x" in the impact section. Normally the requests from the controller to a broker are not high volume, right?

Thanks,

Mayuresh

On Tue, Jul 17, 2018 at 5:06 AM Becket Qin <becket....@gmail.com> wrote:

Thanks for the KIP, Lucas. Separating the control plane from the data plane makes a lot of sense.

In the KIP you mentioned that the controller request queue may have many requests in it. Will this be a common case? The controller requests still go through the SocketServer. The SocketServer will mute the channel once a request is read and put into the request channel. So assuming there is only one connection between the controller and each broker, on the broker side there should be only one controller request in the controller request queue at any given time. If that is the case, do we need a separate controller request queue capacity config? The default value 20 means that we expect 20 controller switches to happen in a short period of time. I am not sure whether someone should increase the controller request queue capacity to handle such a case, as it seems to indicate that something very wrong has happened.

Thanks,

Jiangjie (Becket) Qin

On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin <lindon...@gmail.com> wrote:

Thanks for the update Lucas.

I think the motivation section is intuitive. It will be good to learn more about the comments from other reviewers.
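The mute/unmute behavior Becket describes above can be modeled with a small sketch (illustrative only, not Kafka's SocketServer; the names are invented):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model: the socket layer reads at most one request per connection,
// mutes the channel, and only unmutes after the request is fully handled.
// With a single controller connection, at most one controller request can
// therefore be in the request queue at any given time.
public class MutingConnection {
    private final Queue<String> pending = new ArrayDeque<>(); // requests waiting on the socket
    private boolean muted = false;

    public MutingConnection(String... requests) {
        for (String r : requests) pending.add(r);
    }

    // Poll the socket: a request is read only while the channel is unmuted.
    public String poll() {
        if (muted || pending.isEmpty()) return null;
        muted = true; // mute immediately after reading one request
        return pending.remove();
    }

    // Called once the response is sent; only then can the next request be read.
    public void complete() {
        muted = false;
    }
}
```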
On Thu, Jul 12, 2018 at 9:48 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

I've updated the motivation section of the KIP by explaining the cases that would have user impacts. Please take a look and let me know your comments.

Thanks,
Lucas

On Mon, Jul 9, 2018 at 5:53 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

The simulation of the disk being slow is merely for me to easily construct a testing scenario with a backlog of produce requests. In production, other than the disk being slow, a backlog of produce requests may also be caused by high produce QPS. In that case, we may not want to kill the broker, and that's when this KIP can be useful, both for JBOD and non-JBOD setups.

Going back to your previous question about each ProduceRequest covering 20 partitions that are randomly distributed: let's say a LeaderAndIsr request is enqueued that tries to switch the current broker, say broker0, from leader to follower *for one of the partitions*, say *test-0*. For the sake of argument, let's also assume the other brokers, say broker1, have *stopped* fetching from the current broker, i.e. broker0.

1. If the enqueued produce requests have acks = -1 (ALL):
1.1 Without this KIP, the ProduceRequests ahead of the LeaderAndISR will be put into the purgatory, and since they'll never be replicated to other brokers (because of the assumption made above), they will be completed either when the LeaderAndISR request is processed or when the timeout happens.
1.2 With this KIP, broker0 will immediately transition the partition test-0 to become a follower; after the current broker sees the replication of the remaining 19 partitions, it can send a response indicating that it's no longer the leader for "test-0".

To see the latency difference between 1.1 and 1.2, let's say there are 24K produce requests ahead of the LeaderAndISR, and there are 8 io threads, so each io thread will process approximately 3000 produce requests. Now let's investigate the io thread that finally processes the LeaderAndISR. For the 3000 produce requests, if we model the times when their remaining 19 partitions catch up as t0, t1, ... t2999, the LeaderAndISR request is processed at time t3000. Without this KIP, the 1st produce request would have waited an extra t3000 - t0 time in the purgatory, the 2nd an extra t3000 - t1, etc. Roughly speaking, the latency difference is bigger for the earlier produce requests than for the later ones. For the same reason, the more ProduceRequests are queued before the LeaderAndISR, the bigger the benefit we get (capped by the produce timeout).

2. If the enqueued produce requests have acks=0 or acks=1, there will be no latency differences in this case, but:
2.1 Without this KIP, the records of partition test-0 in the ProduceRequests ahead of the LeaderAndISR will be appended to the local log, and eventually be truncated after processing the LeaderAndISR. This is what's referred to as "some unofficial definition of data loss in terms of messages beyond the high watermark".
2.2 With this KIP, we can mitigate the effect, since if the LeaderAndISR is immediately processed, the response to producers will have the NotLeaderForPartition error, causing producers to retry.

The explanation above covers the benefit of reducing the latency of a broker becoming a follower; closely related is reducing the latency of a broker becoming the leader. In this case, the benefit is even more obvious: if other brokers have resigned leadership and the current broker should take leadership, any delay in processing the LeaderAndISR will be perceived by clients as unavailability. In extreme cases, this can cause failed produce requests if the retries are exhausted.

Another two types of controller requests are UpdateMetadata and StopReplica, which I'll briefly discuss as follows. For UpdateMetadata requests, delayed processing means clients receiving stale metadata, e.g. with the wrong leadership info for certain partitions, and the effect is more retries or even fatal failure if the retries are exhausted. For StopReplica requests, a long queuing time may degrade the performance of topic deletion.

Regarding your last question about the delay for DescribeLogDirsRequest, you are right that this KIP cannot help with the latency in getting the log dirs info, and it's only relevant when controller requests are involved.

Regards,
Lucas

On Tue, Jul 3, 2018 at 5:11 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Jun,

Thanks much for the comments. It is a good point. So the feature may be useful for the JBOD use case. I have one question below.
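As an aside, the purgatory-wait arithmetic in Lucas's point 1 above (each request waiting an extra t3000 - t_i) can be made concrete with a small hypothetical calculation, assuming the remaining partitions of successive requests catch up at a fixed interval serviceMs:

```java
// Hypothetical numbers: request i becomes completable at t_i = i * serviceMs,
// but without the KIP it only completes when the LeaderAndIsr is processed at
// t_N = queuedPerThread * serviceMs, or when the produce timeout fires first.
public class PurgatoryWaitDemo {
    public static double avgExtraWaitMs(int queuedPerThread, double serviceMs, double timeoutMs) {
        double tLeaderAndIsr = queuedPerThread * serviceMs;
        double total = 0;
        for (int i = 0; i < queuedPerThread; i++) {
            double extra = tLeaderAndIsr - i * serviceMs; // t_N - t_i
            total += Math.min(extra, timeoutMs);          // capped by the produce timeout
        }
        return total / queuedPerThread;
    }
}
```

With the thread's example of 3000 requests per io thread, an assumed 5 ms per request, and a 30 s timeout, the average extra purgatory wait comes out to roughly 7.5 seconds, with the earliest requests waiting the longest, matching the observation above.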
> > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> Hey Lucas, > > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> Do you think this feature is also useful for > > > non-JBOD > > > > > > setup > > > > > > > > or > > > > > > > > >> it > > > > > > > > >> > is > > > > > > > > >> > > > > only > > > > > > > > >> > > > > >> useful for the JBOD setup? It may be useful to > > > > > understand > > > > > > > > this. > > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> When the broker is setup using JBOD, in order > to > > > move > > > > > > > leaders > > > > > > > > >> on > > > > > > > > >> > the > > > > > > > > >> > > > > >> failed > > > > > > > > >> > > > > >> disk to other disks, the system operator first > > > needs > > > > to > > > > > > get > > > > > > > > the > > > > > > > > >> > list > > > > > > > > >> > > > of > > > > > > > > >> > > > > >> partitions on the failed disk. This is > currently > > > > > achieved > > > > > > > > using > > > > > > > > >> > > > > >> AdminClient.describeLogDirs(), which sends > > > > > > > > >> DescribeLogDirsRequest > > > > > > > > >> > to > > > > > > > > >> > > > the > > > > > > > > >> > > > > >> broker. If we only prioritize the controller > > > > requests, > > > > > > then > > > > > > > > the > > > > > > > > >> > > > > >> DescribeLogDirsRequest > > > > > > > > >> > > > > >> may still take a long time to be processed by > the > > > > > broker. > > > > > > > So > > > > > > > > >> the > > > > > > > > >> > > > overall > > > > > > > > >> > > > > >> time to move leaders away from the failed disk > > may > > > > > still > > > > > > be > > > > > > > > >> long > > > > > > > > >> > > even > > > > > > > > >> > > > > with > > > > > > > > >> > > > > >> this KIP. What do you think? 
> > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> Thanks, > > > > > > > > >> > > > > >> Dong > > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> On Tue, Jul 3, 2018 at 4:38 PM, Lucas Wang < > > > > > > > > >> lucasatu...@gmail.com > > > > > > > > >> > > > > > > > > > > >> > > > > wrote: > > > > > > > > >> > > > > >> > > > > > > > > >> > > > > >> > Thanks for the insightful comment, Jun. > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > > >> > @Dong, > > > > > > > > >> > > > > >> > Since both of the two comments in your > previous > > > > email > > > > > > are > > > > > > > > >> about > > > > > > > > >> > > the > > > > > > > > >> > > > > >> > benefits of this KIP and whether it's useful, > > > > > > > > >> > > > > >> > in light of Jun's last comment, do you agree > > that > > > > > this > > > > > > > KIP > > > > > > > > >> can > > > > > > > > >> > be > > > > > > > > >> > > > > >> > beneficial in the case mentioned by Jun? > > > > > > > > >> > > > > >> > Please let me know, thanks! > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > > >> > Regards, > > > > > > > > >> > > > > >> > Lucas > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > > >> > On Tue, Jul 3, 2018 at 2:07 PM, Jun Rao < > > > > > > > j...@confluent.io> > > > > > > > > >> > wrote: > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > > >> > > Hi, Lucas, Dong, > > > > > > > > >> > > > > >> > > > > > > > > > > >> > > > > >> > > If all disks on a broker are slow, one > > probably > > > > > > should > > > > > > > > just > > > > > > > > >> > kill > > > > > > > > >> > > > the > > > > > > > > >> > > > > >> > > broker. In that case, this KIP may not > help. 
> > If > > > > > only > > > > > > > one > > > > > > > > of > > > > > > > > >> > the > > > > > > > > >> > > > > disks > > > > > > > > >> > > > > >> on > > > > > > > > >> > > > > >> > a > > > > > > > > >> > > > > >> > > broker is slow, one may want to fail that > > disk > > > > and > > > > > > move > > > > > > > > the > > > > > > > > >> > > > leaders > > > > > > > > >> > > > > on > > > > > > > > >> > > > > >> > that > > > > > > > > >> > > > > >> > > disk to other brokers. In that case, being > > able > > > > to > > > > > > > > process > > > > > > > > >> the > > > > > > > > >> > > > > >> > LeaderAndIsr > > > > > > > > >> > > > > >> > > requests faster will potentially help the > > > > producers > > > > > > > > recover > > > > > > > > >> > > > quicker. > > > > > > > > >> > > > > >> > > > > > > > > > > >> > > > > >> > > Thanks, > > > > > > > > >> > > > > >> > > > > > > > > > > >> > > > > >> > > Jun > > > > > > > > >> > > > > >> > > > > > > > > > > >> > > > > >> > > On Mon, Jul 2, 2018 at 7:56 PM, Dong Lin < > > > > > > > > >> lindon...@gmail.com > > > > > > > > >> > > > > > > > > > > >> > > > > wrote: > > > > > > > > >> > > > > >> > > > > > > > > > > >> > > > > >> > > > Hey Lucas, > > > > > > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > Thanks for the reply. Some follow up > > > questions > > > > > > below. 
> > > > > > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > Regarding 1, if each ProduceRequest > covers > > 20 > > > > > > > > partitions > > > > > > > > >> > that > > > > > > > > >> > > > are > > > > > > > > >> > > > > >> > > randomly > > > > > > > > >> > > > > >> > > > distributed across all partitions, then > > each > > > > > > > > >> ProduceRequest > > > > > > > > >> > > will > > > > > > > > >> > > > > >> likely > > > > > > > > >> > > > > >> > > > cover some partitions for which the > broker > > is > > > > > still > > > > > > > > >> leader > > > > > > > > >> > > after > > > > > > > > >> > > > > it > > > > > > > > >> > > > > >> > > quickly > > > > > > > > >> > > > > >> > > > processes the > > > > > > > > >> > > > > >> > > > LeaderAndIsrRequest. Then broker will > still > > > be > > > > > slow > > > > > > > in > > > > > > > > >> > > > processing > > > > > > > > >> > > > > >> these > > > > > > > > >> > > > > >> > > > ProduceRequest and request will still be > > very > > > > > high > > > > > > > with > > > > > > > > >> this > > > > > > > > >> > > > KIP. > > > > > > > > >> > > > > It > > > > > > > > >> > > > > >> > > seems > > > > > > > > >> > > > > >> > > > that most ProduceRequest will still > timeout > > > > after > > > > > > 30 > > > > > > > > >> > seconds. > > > > > > > > >> > > Is > > > > > > > > >> > > > > >> this > > > > > > > > >> > > > > >> > > > understanding correct? > > > > > > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > Regarding 2, if most ProduceRequest will > > > still > > > > > > > timeout > > > > > > > > >> after > > > > > > > > >> > > 30 > > > > > > > > >> > > > > >> > seconds, > > > > > > > > >> > > > > >> > > > then it is less clear how this KIP > reduces > > > > > average > > > > > > > > >> produce > > > > > > > > >> > > > > latency. > > > > > > > > >> > > > > >> Can > > > > > > > > >> > > > > >> > > you > > > > > > > > >> > > > > >> > > > clarify what metrics can be improved by > > this > > > > KIP? 
I am not sure why a system operator directly cares about the number of truncated messages. Do you mean this KIP can improve average throughput or reduce message duplication? It would be good to understand this.

Thanks,
Dong

On Tue, 3 Jul 2018 at 7:12 AM Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

Thanks for your valuable comments. Please see my reply below.

1. The Google doc showed only 1 partition. Now let's consider a more common scenario where broker0 is the leader of many partitions, and let's say for some reason its IO becomes slow. The number of leader partitions on broker0 is so large, say 10K, that the cluster is skewed, and the operator would like to shift the leadership for a lot of partitions, say 9K, to other brokers, either manually or through some service like Cruise Control. With this KIP, not only will the leadership transitions finish more quickly, helping the cluster itself become more balanced, but all existing producers corresponding to the 9K partitions will get the errors relatively quickly rather than relying on their timeout, thanks to the batched async ZK operations. To me it's a useful feature to have during such troublesome times.

2. The experiments in the Google Doc have shown that with this KIP many producers receive an explicit NotLeaderForPartition error, based on which they retry immediately. Therefore the latency (~14 seconds + quick retry) for their single message is much smaller compared with the case of timing out without the KIP (30 seconds for timing out + quick retry). One might argue that reducing the timeout on the producer side can achieve the same result, yet reducing the timeout has its own drawbacks [1].
Also *IF* there were a metric to show the number of truncated messages on brokers, then with the experiments done in the Google Doc it should be easy to see that a lot fewer messages need to be truncated on broker0, since the up-to-date metadata avoids appending of messages in subsequent PRODUCE requests. If we talk to a system operator and ask whether they prefer fewer wasteful IOs, I bet most likely the answer is yes.

3. To answer your question, I think it might be helpful to construct some formulas. To simplify the modeling, I'm going back to the case where there is only ONE partition involved. Following the experiments in the Google Doc, let's say broker0 becomes the follower at time t0, and after t0 there were still N produce requests in its request queue. With the up-to-date metadata brought by this KIP, broker0 can reply with a NotLeaderForPartition exception; let's use M1 to denote the average processing time of replying with such an error message. Without this KIP, the broker will need to append messages to segments, which may trigger a flush to disk; let's use M2 to denote the average processing time for such logic. Then the average extra latency incurred without this KIP is N * (M2 - M1) / 2.

In practice, M2 should always be larger than M1, which means as long as N is positive, we would see improvements on the average latency. There does not need to be a significant backlog of requests in the request queue, or severe degradation of disk performance, to have the improvement.

Regards,
Lucas

[1] For instance, reducing the timeout on the producer side can trigger unnecessary duplicate requests when the corresponding leader broker is overloaded, exacerbating the situation.
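The N * (M2 - M1) / 2 figure can be checked numerically (a sketch; the millisecond values below are made up, not measurements from the Google Doc): request i in the queue waits behind i predecessors, so averaging i * (M2 - M1) over the queue gives roughly N * (M2 - M1) / 2.

```python
# Toy check of the queueing model above; M1, M2, and N are hypothetical.
def avg_extra_latency(n: int, m1: float, m2: float) -> float:
    """Average extra wait without the KIP: request i sits behind i
    predecessors, each taking M2 (append/flush) instead of M1 (error reply)."""
    extra = [i * (m2 - m1) for i in range(n)]  # extra wait of each request
    return sum(extra) / n                      # mean over the whole queue

# 100 queued requests, 1 ms error reply vs 20 ms append:
n, m1, m2 = 100, 0.001, 0.020
print(avg_extra_latency(n, m1, m2))  # ~0.94 s
print(n * (m2 - m1) / 2)             # ~0.95 s, the formula from the mail
```

The small gap between the two numbers is just (N+1)/2 versus N/2; for any realistic queue depth the formula is an accurate summary.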
On Sun, Jul 1, 2018 at 9:18 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Lucas,

Thanks much for the detailed documentation of the experiment.

Initially I also thought having a separate queue for controller requests is useful because, as you mentioned in the summary section of the Google doc, controller requests are generally more important than data requests and we probably want controller requests to be processed sooner. But then Eno has two very good questions which I am not sure the Google doc has answered explicitly. Could you help with the following questions?

1) It is not very clear what the actual benefit of KIP-291 is to users. The experiment setup in the Google doc simulates the scenario that a broker is very slow handling ProduceRequests due to e.g. a slow disk. It currently assumes that there is only 1 partition. But in the common scenario, it is probably reasonable to assume that there are many other partitions that are also actively produced to, and ProduceRequests to these partitions also take e.g. 2 seconds to be processed. So even if broker0 can become follower for partition 0 soon, it probably still needs to slowly process the ProduceRequests in the queue because these ProduceRequests cover other partitions. Thus most ProduceRequests will still time out after 30 seconds and most clients will still likely time out after 30 seconds. Then it is not obvious what the benefit to the client is, since the client will time out after 30 seconds before possibly re-connecting to broker1, with or without KIP-291. Did I miss something here?
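This multi-partition concern can be sketched as a toy queue-drain calculation (all timings and counts hypothetical): even when some queued requests now get a fast NotLeaderForPartition reply, the remaining slow appends dominate the drain time.

```python
# Toy model of the scenario above; the numbers are made up for illustration.
SLOW_MS = 2000  # append for a partition the broker still slowly leads
FAST_MS = 1     # immediate NotLeaderForPartition error reply

def drain_time_ms(n_requests: int, slow_fraction: float) -> float:
    """Time to drain the queue when only (1 - slow_fraction) of the
    requests get the fast error reply enabled by the KIP."""
    n_slow = round(n_requests * slow_fraction)
    return n_slow * SLOW_MS + (n_requests - n_slow) * FAST_MS

# With 50 queued requests and 90% touching a still-slow partition, the
# drain time alone far exceeds a 30 s client request timeout:
print(drain_time_ms(50, 0.9) / 1000, "seconds")  # 90.005 seconds
```

Under these assumed numbers most queued producers would still hit the 30-second timeout, which is exactly the behavior the question asks about.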
2) I guess Eno is asking for the specific benefits of this KIP to the user or system administrator, e.g. whether this KIP decreases average latency, 999th percentile latency, the probability of exceptions exposed to clients, etc. It is probably useful to clarify this.

3) Does this KIP help improve user experience only when there is an issue with a broker, e.g. a significant backlog in the request queue due to a slow disk as described in the Google doc? Or is this KIP also useful when there is no ongoing issue in the cluster? It might be helpful to clarify this to understand the benefit of this KIP.

Thanks much,
Dong

On Fri, Jun 29, 2018 at 2:58 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Eno,

Sorry for the delay in getting the experiment results. Here is a link to the positive impact achieved by implementing the proposed change:
https://docs.google.com/document/d/1ge2jjp5aPTBber6zaIT9AdhWFWUENJ3JO6Zyu4f9tgQ/edit?usp=sharing
Please take a look when you have time and let me know your feedback.
Regards,
Lucas

On Tue, Jun 26, 2018 at 9:52 AM, Harsha <ka...@harsha.io> wrote:

Thanks for the pointer. Will take a look; it might suit our requirements better.

Thanks,
Harsha

On Mon, Jun 25th, 2018 at 2:52 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Harsha,

If I understand correctly, the replication quota mechanism proposed in KIP-73 can be helpful in that scenario. Have you tried it out?

Thanks,
Lucas

--
-Regards,
Mayuresh R. Gharat
(862) 250-7125