@Mayuresh - I like your idea. It appears to be a simpler, less invasive alternative, and it should work. Jun/Becket/others, do you see any pitfalls with this approach?
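For readers following along, Mayuresh's suggestion (quoted in full below) amounts to something like the following sketch. This is an illustration only, not Kafka's actual RequestChannel code; the PrioritizedRequestQueue and Request names are stand-ins:

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Sketch of a single bounded request queue that prioritizes controller
// requests by using the two ends of a deque.
class PrioritizedRequestQueue {
    private final LinkedBlockingDeque<Request> deque;

    PrioritizedRequestQueue(int capacity) {
        // LinkedBlockingDeque supports an optional capacity bound and
        // blocking insertion/removal at both ends.
        this.deque = new LinkedBlockingDeque<>(capacity);
    }

    void send(Request request) throws InterruptedException {
        if (request.isFromController()) {
            deque.putFirst(request);   // controller requests go to the head
        } else {
            deque.putLast(request);    // data requests (produce, fetch, ...) go to the tail
        }
    }

    Request receive(long timeoutMs) throws InterruptedException {
        // Handler threads always take from the head, so a queued controller
        // request is picked up before any data request. Per-connection
        // muting (discussed below in the thread) keeps requests from a
        // single connection ordered despite the queue jumping.
        return deque.pollFirst(timeoutMs, TimeUnit.MILLISECONDS);
    }
}

class Request {
    private final boolean fromController;
    Request(boolean fromController) { this.fromController = fromController; }
    boolean isFromController() { return fromController; }
}
```

Note that `java.util.concurrent.LinkedBlockingDeque` (rather than `LinkedBlockingQueue`) is the class that offers both a capacity bound and insertion at both ends.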
On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

@Mayuresh,
That's a very interesting idea that I hadn't thought of before. It seems to solve our problem at hand pretty well, and it also avoids the need for a new size metric and capacity config for the controller request queue. In fact, if we were to adopt this design, there is no public interface change, and we probably don't need a KIP. Implementation-wise, the Java class LinkedBlockingDeque readily satisfies the requirement by supporting a capacity and allowing insertion at both ends.

My only concern is that this design is tied to the coincidence that we have two request priorities and a deque has two ends. By using the proposed design, the network layer becomes more tightly coupled with the upper-layer logic: if we were to add an extra priority level in the future for some reason, we would probably need to go back to the design of separate queues, one for each priority level.

In summary, I'm ok with both designs and lean toward your suggested approach. Let's hear what others think.

@Becket,
In light of Mayuresh's suggested new design, I'm answering your question only in the context of the current KIP design: I think your suggestion makes sense, and I'm ok with removing the capacity config and just relying on the default value of 20 being sufficient.

Thanks,
Lucas

On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:

Hi Lucas,

The main intent here seems to be prioritizing controller requests over all other requests. In that case, we can change the request queue to a deque, where we always insert normal requests (produce, consume, etc.) at the tail of the deque, but insert controller requests at the head of the queue.
This ensures that controller requests are given higher priority than other requests.

Also, since we only read one request from a socket at a time, muting the channel and unmuting it only after handling the request, this ensures that we don't handle controller requests out of order.

With this approach we avoid the second queue and the additional config for its size.

What do you think?

Thanks,

Mayuresh

On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <becket....@gmail.com> wrote:

Hey Joel,

Thanks for the detailed explanation. I agree the current design makes sense. My confusion is about whether the new config for the controller queue capacity is necessary. I cannot think of a case in which users would change it.

Thanks,

Jiangjie (Becket) Qin

On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin <becket....@gmail.com> wrote:

Hi Lucas,

My question can be rephrased as: "do we expect users to ever change the controller request queue capacity"? If we agree that 20 is already a very generous default and we do not expect users to change it, is it still necessary to expose it as a config?

Thanks,

Jiangjie (Becket) Qin

On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang <lucasatu...@gmail.com> wrote:

@Becket
1. Thanks for the comment. You are right that normally there should be just one controller request in the queue because of muting, and I had NOT intended to say there would be many enqueued controller requests. I went through the KIP again, and I'm not sure which part conveys that impression. I'd be happy to revise if you point out the section.

2.
Though it should not happen under normal conditions, the current design does not preclude multiple controllers running at the same time. Hence, if we drop the controller queue capacity config and simply make its capacity 1, network threads handling requests from different controllers will be blocked during those troublesome times, which is probably not what we want. On the other hand, adding the extra config with a default value, say 20, guards us against issues in those troublesome times, and IMO there isn't much downside to adding it.

@Mayuresh
Good catch; that sentence is an obsolete statement based on a previous design. I've revised the wording in the KIP.

Thanks,
Lucas

On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:

Hi Lucas,

Thanks for the KIP. I am trying to understand why you think "The memory consumption can rise given the total number of queued requests can go up to 2x" in the impact section. Normally the requests from the controller to a broker are not high volume, right?

Thanks,

Mayuresh

On Tue, Jul 17, 2018 at 5:06 AM Becket Qin <becket....@gmail.com> wrote:

Thanks for the KIP, Lucas. Separating the control plane from the data plane makes a lot of sense.

In the KIP you mentioned that the controller request queue may have many requests in it. Will this be a common case? Controller requests still go through the SocketServer.
The SocketServer will mute the channel once a request is read and put into the request channel. So, assuming there is only one connection between the controller and each broker, on the broker side there should be only one controller request in the controller request queue at any given time. If that is the case, do we need a separate controller request queue capacity config? The default value of 20 means that we expect 20 controller switches to happen in a short period of time. I am not sure whether someone should increase the controller request queue capacity to handle such a case, as it seems to indicate that something very wrong has happened.

Thanks,

Jiangjie (Becket) Qin

On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin <lindon...@gmail.com> wrote:

Thanks for the update Lucas.

I think the motivation section is intuitive. It will be good to learn more about the comments from other reviewers.

On Thu, Jul 12, 2018 at 9:48 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

I've updated the motivation section of the KIP by explaining the cases that would have user impact. Please take a look and let me know your comments.
Thanks,
Lucas

On Mon, Jul 9, 2018 at 5:53 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

The simulation of a slow disk is merely a way for me to easily construct a testing scenario with a backlog of produce requests. In production, besides a slow disk, a backlog of produce requests may also be caused by high produce QPS. In that case, we may not want to kill the broker, and that's when this KIP can be useful, for both JBOD and non-JBOD setups.

Going back to your previous question about each ProduceRequest covering 20 randomly distributed partitions: let's say a LeaderAndIsr request is enqueued that tries to switch the current broker, say broker0, from leader to follower *for one of the partitions*, say *test-0*. For the sake of argument, let's also assume the other brokers, say broker1, have *stopped* fetching from the current broker, i.e. broker0.

1. If the enqueued produce requests have acks = -1 (ALL):
   1.1 Without this KIP, the ProduceRequests ahead of the LeaderAndISR will be put into the purgatory, and since they'll never be replicated to other brokers (because of the assumption made above), they will be completed either when the LeaderAndISR request is processed or when the timeout happens.
   1.2 With this KIP, broker0 will immediately transition the partition test-0 to follower; after the current broker sees the replication of the remaining 19 partitions, it can send a response indicating that it's no longer the leader for "test-0".

To see the latency difference between 1.1 and 1.2, let's say there are 24K produce requests ahead of the LeaderAndISR and there are 8 io threads, so each io thread will process approximately 3000 produce requests. Now let's look at the io thread that finally processes the LeaderAndISR. For its 3000 produce requests, model the times when their remaining 19 partitions catch up as t0, t1, ... t2999, and say the LeaderAndISR request is processed at time t3000. Without this KIP, the 1st produce request would have waited an extra t3000 - t0 in the purgatory, the 2nd an extra t3000 - t1, etc. Roughly speaking, the latency difference is bigger for the earlier produce requests than for the later ones. For the same reason, the more ProduceRequests queued before the LeaderAndISR, the bigger the benefit we get (capped by the produce timeout).

2. If the enqueued produce requests have acks = 0 or acks = 1, there will be no latency difference, but:
   2.1 Without this KIP, the records of partition test-0 in the ProduceRequests ahead of the LeaderAndISR will be appended to the local log and eventually truncated after processing the LeaderAndISR. This is what's referred to as "some unofficial definition of data loss in terms of messages beyond the high watermark".
   2.2 With this KIP, we can mitigate the effect: if the LeaderAndISR is processed immediately, the responses to producers will carry the NotLeaderForPartition error, causing producers to retry.

The explanation above covers the benefit of reducing the latency of a broker becoming a follower; closely related is reducing the latency of a broker becoming the leader. In that case the benefit is even more obvious: if other brokers have resigned leadership and the current broker should take leadership, any delay in processing the LeaderAndISR will be perceived by clients as unavailability. In extreme cases, this can cause failed produce requests if the retries are exhausted.

Two other types of controller requests are UpdateMetadata and StopReplica, which I'll briefly discuss as follows:
For UpdateMetadata requests, delayed processing means clients receive stale metadata, e.g. with the wrong leadership info for certain partitions, and the effect is more retries or even fatal failure if the retries are exhausted.
For StopReplica requests, a long queuing time may degrade the performance of topic deletion.

Regarding your last question about the delay for DescribeLogDirsRequest, you are right that this KIP cannot help with the latency of getting the log dirs info; it is only relevant when controller requests are involved.

Regards,
Lucas

On Tue, Jul 3, 2018 at 5:11 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Jun,

Thanks much for the comments. It is a good point. So the feature may be useful for the JBOD use-case. I have one question below.

Hey Lucas,

Do you think this feature is also useful for a non-JBOD setup, or is it only useful for the JBOD setup? It may be useful to understand this.
When a broker is set up using JBOD, in order to move leaders on a failed disk to other disks, the system operator first needs to get the list of partitions on the failed disk. This is currently achieved using AdminClient.describeLogDirs(), which sends a DescribeLogDirsRequest to the broker. If we only prioritize the controller requests, the DescribeLogDirsRequest may still take a long time to be processed by the broker. So the overall time to move leaders away from the failed disk may still be long, even with this KIP. What do you think?

Thanks,
Dong

On Tue, Jul 3, 2018 at 4:38 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Thanks for the insightful comment, Jun.

@Dong,
Since both of the comments in your previous email are about the benefits of this KIP and whether it's useful, in light of Jun's last comment, do you agree that this KIP can be beneficial in the case mentioned by Jun? Please let me know, thanks!

Regards,
Lucas

On Tue, Jul 3, 2018 at 2:07 PM, Jun Rao <j...@confluent.io> wrote:

Hi, Lucas, Dong,

If all disks on a broker are slow, one probably should just kill the broker, and in that case this KIP may not help. If only one of the disks on a broker is slow, one may want to fail that disk and move the leaders on that disk to other brokers. In that case, being able to process the LeaderAndIsr requests faster will potentially help the producers recover quicker.

Thanks,

Jun

On Mon, Jul 2, 2018 at 7:56 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Lucas,

Thanks for the reply. Some follow-up questions below.
Regarding 1, if each ProduceRequest covers 20 partitions that are randomly distributed across all partitions, then each ProduceRequest will likely cover some partitions for which the broker is still leader after it quickly processes the LeaderAndIsrRequest. The broker will then still be slow in processing these ProduceRequests, and request latency will still be very high with this KIP. It seems that most ProduceRequests will still time out after 30 seconds. Is this understanding correct?

Regarding 2, if most ProduceRequests will still time out after 30 seconds, then it is less clear how this KIP reduces average produce latency. Can you clarify what metrics can be improved by this KIP?

I'm not sure why a system operator directly cares about the number of truncated messages. Do you mean this KIP can improve average throughput or reduce message duplication? It would be good to understand this.

Thanks,
Dong

On Tue, 3 Jul 2018 at 7:12 AM Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Dong,

Thanks for your valuable comments. Please see my reply below.

1. The Google doc showed only 1 partition. Now let's consider a more common scenario where broker0 is the leader of many partitions, and let's say for some reason its IO becomes slow. The number of leader partitions on broker0 is so large, say 10K, that the cluster is skewed, and the operator would like to shift the leadership for a lot of partitions, say 9K, to other brokers, either manually or through some service like cruise control.
With this KIP, not only will the leadership transitions finish more quickly, helping the cluster become more balanced, but all existing producers corresponding to the 9K partitions will get the errors relatively quickly, rather than relying on their timeout, thanks to the batched async ZK operations. To me that's a useful feature to have during such troublesome times.

2. The experiments in the Google doc have shown that with this KIP many producers receive an explicit NotLeaderForPartition error, based on which they retry immediately. Therefore the latency (~14 seconds + quick retry) for their single message is much smaller compared with the case of timing out without the KIP (30 seconds for timing out + quick retry). One might argue that reducing the timeout on the producer side can achieve the same result, yet reducing the timeout has its own drawbacks[1].
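For context, the 30-second figure in point 2 corresponds to the producer's `request.timeout.ms` setting, whose default is 30000 ms. A minimal sketch of the relevant producer settings follows; the broker address is hypothetical and the values are the thread's examples, not recommendations:

```java
import java.util.Properties;

public class ProducerTimeoutConfig {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker0:9092"); // hypothetical broker address
        // The 30-second figure in the discussion is this setting's default:
        // a ProduceRequest stuck behind a backlog times out after request.timeout.ms.
        props.put("request.timeout.ms", "30000");
        // With an explicit NotLeaderForPartition error, the producer refreshes
        // metadata and retries quickly instead of waiting out the full timeout.
        props.put("retries", "3");
        return props;
    }
}
```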
Also *IF* there were a metric to show the number of truncated messages on brokers, then with the experiments done in the Google doc it should be easy to see that far fewer messages need to be truncated on broker0, since the up-to-date metadata avoids appending messages from subsequent PRODUCE requests. If we ask a system operator whether they prefer fewer wasteful IOs, I bet the answer is most likely yes.

3. To answer your question, I think it might be helpful to construct some formulas. To simplify the modeling, I'm going back to the case where there is only ONE partition involved. Following the experiments in the Google doc, let's say broker0 becomes the follower at time t0, and after t0 there were still N produce requests in its request queue. With the up-to-date metadata brought by this KIP, broker0 can reply with a NotLeaderForPartition error; let M1 denote the average processing time of replying with such an error message. Without this KIP, the broker will need to append messages to segments, which may trigger a flush to disk; let M2 denote the average processing time for that logic. Then the average extra latency incurred without this KIP is N * (M2 - M1) / 2.

In practice, M2 should always be larger than M1, which means that as long as N is positive, we would see improvements in average latency. There does not need to be a significant backlog of requests in the request queue, or severe degradation of disk performance, to get the improvement.
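Lucas's closed-form figure can be recovered with a short derivation, a sketch under his assumptions: the i-th of the N queued requests completes only after the i requests up to and including it are processed, each costing M2 without the KIP versus M1 with it, so its extra latency is i(M2 - M1). Averaging over all N requests:

```latex
\text{average extra latency}
  = \frac{1}{N}\sum_{i=1}^{N} i\,(M_2 - M_1)
  = \frac{(N+1)(M_2 - M_1)}{2}
  \approx \frac{N\,(M_2 - M_1)}{2} \quad \text{for large } N
```

which matches the N * (M2 - M1) / 2 expression above.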
Regards,
Lucas

[1] For instance, reducing the timeout on the producer side can trigger unnecessary duplicate requests when the corresponding leader broker is overloaded, exacerbating the situation.

On Sun, Jul 1, 2018 at 9:18 PM, Dong Lin <lindon...@gmail.com> wrote:

Hey Lucas,

Thanks much for the detailed documentation of the experiment.

Initially I also thought having a separate queue for controller requests was useful because, as you mentioned in the summary section of the Google doc, controller requests are generally more important than data requests and we probably want controller requests to be processed sooner. But then Eno had two very good questions which I am not sure the Google doc has answered explicitly. Could you help with the following questions?

1) It is not very clear what the actual benefit of KIP-291 is to users. The experiment setup in the Google doc simulates the scenario where the broker is very slow handling ProduceRequests due to e.g. a slow disk. It currently assumes there is only 1 partition. But in the common scenario, it is probably reasonable to assume there are many other partitions that are also actively produced to, and ProduceRequests to these partitions also take e.g. 2 seconds to be processed. So even if broker0 can become follower for partition 0 soon, it probably still needs to slowly process the ProduceRequests already in the queue, because these ProduceRequests cover other partitions. Thus most ProduceRequests will still time out after 30 seconds, and most clients will likely still time out after 30 seconds. Then it is not obvious what the benefit to the client is, since the client will time out after 30 seconds before possibly re-connecting to broker1, with or without KIP-291. Did I miss something here?

2) I guess Eno is asking for the specific benefits of this KIP to the user or system administrator, e.g. whether this KIP decreases average latency, 999th percentile latency, the probability of exceptions exposed to clients, etc. It is probably useful to clarify this.

3) Does this KIP help improve user experience only when there is an issue with the broker, e.g. a significant backlog in the request queue due to a slow disk as described in the Google doc? Or is this KIP also useful when there is no ongoing issue in the cluster?
It might be helpful to clarify this to understand the benefit of this KIP.

Thanks much,
Dong


On Fri, Jun 29, 2018 at 2:58 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Eno,

Sorry for the delay in getting the experiment results. Here is a link to the positive impact achieved by implementing the proposed change:
https://docs.google.com/document/d/1ge2jjp5aPTBber6zaIT9AdhWFWUENJ3JO6Zyu4f9tgQ/edit?usp=sharing
Please take a look when you have time and let me know your feedback.

Regards,
Lucas

On Tue, Jun 26, 2018 at 9:52 AM, Harsha <ka...@harsha.io> wrote:

Thanks for the pointer. Will take a look; it might suit our requirements better.

Thanks,
Harsha

On Mon, Jun 25th, 2018 at 2:52 PM, Lucas Wang <lucasatu...@gmail.com> wrote:

Hi Harsha,

If I understand correctly, the replication quota mechanism proposed in KIP-73 can be helpful in that scenario. Have you tried it out?

Thanks,
Lucas

On Sun, Jun 24, 2018 at 8:28 AM, Harsha <ka...@harsha.io> wrote:

Hi Lucas,

One more question: any thoughts on making this configurable and also allowing a subset of data requests to be prioritized? For example, we notice in our cluster that when we take out a broker and bring a new one in, it will try to become a follower and send a lot of fetch requests to other leaders in the cluster. This will negatively affect the application/client requests. We are also exploring a similar solution to de-prioritize fetch requests when a new replica comes in; we are ok with the replica taking time, but the leaders should prioritize the client requests.
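The deque idea raised at the top of this thread (insert controller requests at the head of a single bounded deque, everything else at the tail) can be sketched roughly as below. This is purely an illustrative sketch, not Kafka's actual RequestChannel code; the Request record and class names are hypothetical. It uses non-blocking offer/poll for brevity where the real broker would block.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

// Illustrative sketch of the deque idea (not Kafka's actual code): controller
// requests are inserted at the head of a bounded deque, all other requests at
// the tail, so handler threads taking from the head see controller requests first.
public class PrioritizedRequestQueue {
    public record Request(String name, boolean isControllerRequest) {}

    private final BlockingDeque<Request> deque;

    public PrioritizedRequestQueue(int capacity) {
        this.deque = new LinkedBlockingDeque<>(capacity);
    }

    // Controller requests jump ahead of queued data requests; data requests
    // keep normal FIFO order among themselves. Returns false if at capacity.
    public boolean send(Request r) {
        return r.isControllerRequest() ? deque.offerFirst(r) : deque.offerLast(r);
    }

    // Handler threads always take from the head; returns null if empty.
    public Request receive() {
        return deque.pollFirst();
    }
}
```

java.util.concurrent.LinkedBlockingDeque provides exactly the two properties the idea relies on: a bounded capacity and insertion at both ends.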
> > > >> > > > > >> > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > >> > > > > >> > > > > > > > > > Thanks, > > > >> > > > > >> > > > > > > > > > Harsha > > > >> > > > > >> > > > > > > > > > > > > >> > > > > >> > > > > > > > > > On Fri, Jun 22nd, 2018 at 11:35 AM > Lucas > > > Wang > > > >> > > wrote: > > > >> > > > > >> > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > Hi Eno, > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > Sorry for the delayed response. > > > >> > > > > >> > > > > > > > > > > - I haven't implemented the feature > > yet, > > > >> so no > > > >> > > > > >> > experimental > > > >> > > > > >> > > > > > results > > > >> > > > > >> > > > > > > > so > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > far. > > > >> > > > > >> > > > > > > > > > > And I plan to test in out in the > > > following > > > >> > days. > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > - You are absolutely right that the > > > >> priority > > > >> > > queue > > > >> > > > > >> does > > > >> > > > > >> > not > > > >> > > > > >> > > > > > > > completely > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > prevent > > > >> > > > > >> > > > > > > > > > > data requests being processed ahead > of > > > >> > > controller > > > >> > > > > >> > requests. > > > >> > > > > >> > > > > > > > > > > That being said, I expect it to > greatly > > > >> > mitigate > > > >> > > > the > > > >> > > > > >> > effect > > > >> > > > > >> > > > of > > > >> > > > > >> > > > > > > stable > > > >> > > > > >> > > > > > > > > > > metadata. > > > >> > > > > >> > > > > > > > > > > In any case, I'll try it out and post > > the > > > >> > > results > > > >> > > > > >> when I > > > >> > > > > >> > > have > > > >> > > > > >> > > > > it. 
> > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > Regards, > > > >> > > > > >> > > > > > > > > > > Lucas > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > On Wed, Jun 20, 2018 at 5:44 AM, Eno > > > >> Thereska > > > >> > < > > > >> > > > > >> > > > > > > > eno.there...@gmail.com > > > >> > > > > >> > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > Hi Lucas, > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > Sorry for the delay, just had a > look > > at > > > >> > this. > > > >> > > A > > > >> > > > > >> couple > > > >> > > > > >> > of > > > >> > > > > >> > > > > > > > questions: > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > - did you notice any positive > change > > > >> after > > > >> > > > > >> implementing > > > >> > > > > >> > > > this > > > >> > > > > >> > > > > > KIP? > > > >> > > > > >> > > > > > > > > I'm > > > >> > > > > >> > > > > > > > > > > > wondering if you have any > > experimental > > > >> > results > > > >> > > > > that > > > >> > > > > >> > show > > > >> > > > > >> > > > the > > > >> > > > > >> > > > > > > > benefit > > > >> > > > > >> > > > > > > > > of > > > >> > > > > >> > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > two queues. > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > - priority is usually not > sufficient > > in > > > >> > > > addressing > > > >> > > > > >> the > > > >> > > > > >> > > > > problem > > > >> > > > > >> > > > > > > the > > > >> > > > > >> > > > > > > > > KIP > > > >> > > > > >> > > > > > > > > > > > identifies. Even with priority > > queues, > > > >> you > > > >> > > will > > > >> > > > > >> > sometimes > > > >> > > > > >> > > > > > > (often?) 
> > > >> > > > > >> > > > > > > > > have > > > >> > > > > >> > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > case that data plane requests will > be > > > >> ahead > > > >> > of > > > >> > > > the > > > >> > > > > >> > > control > > > >> > > > > >> > > > > > plane > > > >> > > > > >> > > > > > > > > > > requests. > > > >> > > > > >> > > > > > > > > > > > This happens because the system > might > > > >> have > > > >> > > > already > > > >> > > > > >> > > started > > > >> > > > > >> > > > > > > > > processing > > > >> > > > > >> > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > data plane requests before the > > control > > > >> plane > > > >> > > > ones > > > >> > > > > >> > > arrived. > > > >> > > > > >> > > > So > > > >> > > > > >> > > > > > it > > > >> > > > > >> > > > > > > > > would > > > >> > > > > >> > > > > > > > > > > be > > > >> > > > > >> > > > > > > > > > > > good to know what % of the problem > > this > > > >> KIP > > > >> > > > > >> addresses. > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > Thanks > > > >> > > > > >> > > > > > > > > > > > Eno > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > On Fri, Jun 15, 2018 at 4:44 PM, > Ted > > > Yu < > > > >> > > > > >> > > > > yuzhih...@gmail.com > > > >> > > > > >> > > > > > > > > > >> > > > > >> > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > Change looks good. 
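The separate-queue design discussed in this KIP, as opposed to the single-deque variant, can be sketched as below. This is a hypothetical illustration (class name and capacities are invented; 20 echoes the default mentioned in this thread, 500 the existing queued.max.requests default), not the real broker code. It also makes Eno's caveat concrete: a data request that a handler already dequeued before the controller request arrived is not preempted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative two-queue sketch of the KIP-291 idea, not actual broker code.
// Controller and data requests live in separate bounded queues, and a handler
// always drains the controller queue first. A data request dequeued before a
// controller request arrived still runs to completion, which is why priority
// does not solve 100% of the problem.
public class TwoQueueChannel {
    // Capacities are hypothetical defaults for illustration only.
    private final BlockingQueue<String> controlQueue = new ArrayBlockingQueue<>(20);
    private final BlockingQueue<String> dataQueue = new ArrayBlockingQueue<>(500);

    public boolean sendControl(String request) {
        return controlQueue.offer(request);
    }

    public boolean sendData(String request) {
        return dataQueue.offer(request);
    }

    // Returns the next request to process: controller requests always win when
    // both queues are non-empty; null if both queues are empty.
    public String nextRequest() {
        String control = controlQueue.poll();
        return (control != null) ? control : dataQueue.poll();
    }
}
```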
> > > >> > > > > >> > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > Thanks > > > >> > > > > >> > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > On Fri, Jun 15, 2018 at 8:42 AM, > > > Lucas > > > >> > Wang > > > >> > > < > > > >> > > > > >> > > > > > > > lucasatu...@gmail.com > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > Hi Ted, > > > >> > > > > >> > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > Thanks for the suggestion. I've > > > >> updated > > > >> > > the > > > >> > > > > KIP. > > > >> > > > > >> > > Please > > > >> > > > > >> > > > > > take > > > >> > > > > >> > > > > > > > > > another > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > look. > > > >> > > > > >> > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > Lucas > > > >> > > > > >> > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > On Thu, Jun 14, 2018 at 6:34 > PM, > > > Ted > > > >> Yu > > > >> > < > > > >> > > > > >> > > > > > > yuzhih...@gmail.com > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > Currently in > KafkaConfig.scala > > : > > > >> > > > > >> > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > val QueuedMaxRequests = 500 > > > >> > > > > >> > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > It would be good if you can > > > include > > > >> > the > > > >> > > > > >> default > > > >> > > > > >> > > value > > > >> > > > > >> > > > > for > > > >> > > > > >> > > > > > > > this > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > new > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > 
> > > > > > > > config > > > >> > > > > >> > > > > > > > > > > > > > > in the KIP. > > > >> > > > > >> > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > Thanks > > > >> > > > > >> > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > On Thu, Jun 14, 2018 at 4:28 > > PM, > > > >> Lucas > > > >> > > > Wang > > > >> > > > > < > > > >> > > > > >> > > > > > > > > > lucasatu...@gmail.com > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > Hi Ted, Dong > > > >> > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > I've updated the KIP by > > adding > > > a > > > >> new > > > >> > > > > config, > > > >> > > > > >> > > > instead > > > >> > > > > >> > > > > of > > > >> > > > > >> > > > > > > > > reusing > > > >> > > > > >> > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > > > > > existing one. > > > >> > > > > >> > > > > > > > > > > > > > > > Please take another look > when > > > you > > > >> > have > > > >> > > > > time. > > > >> > > > > >> > > > Thanks a > > > >> > > > > >> > > > > > > lot! > > > >> > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > Lucas > > > >> > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > On Thu, Jun 14, 2018 at > 2:33 > > > PM, > > > >> Ted > > > >> > > Yu > > > >> > > > < > > > >> > > > > >> > > > > > > > yuzhih...@gmail.com > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > bq. 
that's a waste of > > > resource > > > >> if > > > >> > > > > control > > > >> > > > > >> > > request > > > >> > > > > >> > > > > > rate > > > >> > > > > >> > > > > > > is > > > >> > > > > >> > > > > > > > > low > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > I don't know if control > > > request > > > >> > rate > > > >> > > > can > > > >> > > > > >> get > > > >> > > > > >> > to > > > >> > > > > >> > > > > > > 100,000, > > > >> > > > > >> > > > > > > > > > > likely > > > >> > > > > >> > > > > > > > > > > > > not. > > > >> > > > > >> > > > > > > > > > > > > > > Then > > > >> > > > > >> > > > > > > > > > > > > > > > > using the same bound as > > that > > > >> for > > > >> > > data > > > >> > > > > >> > requests > > > >> > > > > >> > > > > seems > > > >> > > > > >> > > > > > > > high. > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > On Wed, Jun 13, 2018 at > > 10:13 > > > >> PM, > > > >> > > > Lucas > > > >> > > > > >> Wang > > > >> > > > > >> > < > > > >> > > > > >> > > > > > > > > > > > > lucasatu...@gmail.com > > > > >> > > > > >> > > > > > > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > Hi Ted, > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > Thanks for taking a > look > > at > > > >> this > > > >> > > > KIP. 
> > > >> > > > > >> > > > > > > > > > > > > > > > > > Let's say today the > > setting > > > >> of > > > >> > > > > >> > > > > > "queued.max.requests" > > > >> > > > > >> > > > > > > in > > > >> > > > > >> > > > > > > > > > > > cluster A > > > >> > > > > >> > > > > > > > > > > > > > is > > > >> > > > > >> > > > > > > > > > > > > > > > > 1000, > > > >> > > > > >> > > > > > > > > > > > > > > > > > while the setting in > > > cluster > > > >> B > > > >> > is > > > >> > > > > >> 100,000. > > > >> > > > > >> > > > > > > > > > > > > > > > > > The 100 times > difference > > > >> might > > > >> > > have > > > >> > > > > >> > indicated > > > >> > > > > >> > > > > that > > > >> > > > > >> > > > > > > > > machines > > > >> > > > > >> > > > > > > > > > > in > > > >> > > > > >> > > > > > > > > > > > > > > cluster > > > >> > > > > >> > > > > > > > > > > > > > > > B > > > >> > > > > >> > > > > > > > > > > > > > > > > > have larger memory. > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > By reusing the > > > >> > > > "queued.max.requests", > > > >> > > > > >> the > > > >> > > > > >> > > > > > > > > > > controlRequestQueue > > > >> > > > > >> > > > > > > > > > > > in > > > >> > > > > >> > > > > > > > > > > > > > > > cluster > > > >> > > > > >> > > > > > > > > > > > > > > > > B > > > >> > > > > >> > > > > > > > > > > > > > > > > > automatically > > > >> > > > > >> > > > > > > > > > > > > > > > > > gets a 100x capacity > > > without > > > >> > > > > explicitly > > > >> > > > > >> > > > bothering > > > >> > > > > >> > > > > > the > > > >> > > > > >> > > > > > > > > > > > operators. 
> > > >> > > > > >> > > > > > > > > > > > > > > > > > I understand the > counter > > > >> > argument > > > >> > > > can > > > >> > > > > be > > > >> > > > > >> > that > > > >> > > > > >> > > > > maybe > > > >> > > > > >> > > > > > > > > that's > > > >> > > > > >> > > > > > > > > > a > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > waste > > > >> > > > > >> > > > > > > > > > > > > > of > > > >> > > > > >> > > > > > > > > > > > > > > > > > resource if control > > request > > > >> > > > > >> > > > > > > > > > > > > > > > > > rate is low and > operators > > > may > > > >> > want > > > >> > > > to > > > >> > > > > >> fine > > > >> > > > > >> > > tune > > > >> > > > > >> > > > > the > > > >> > > > > >> > > > > > > > > > capacity > > > >> > > > > >> > > > > > > > > > > of > > > >> > > > > >> > > > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > > > > > > > controlRequestQueue. > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > I'm ok with either > > > approach, > > > >> and > > > >> > > can > > > >> > > > > >> change > > > >> > > > > >> > > it > > > >> > > > > >> > > > if > > > >> > > > > >> > > > > > you > > > >> > > > > >> > > > > > > > or > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > anyone > > > >> > > > > >> > > > > > > > > > > > > > else > > > >> > > > > >> > > > > > > > > > > > > > > > > feels > > > >> > > > > >> > > > > > > > > > > > > > > > > > strong about adding the > > > extra > > > >> > > > config. 
> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > Thanks, > > > >> > > > > >> > > > > > > > > > > > > > > > > > Lucas > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > On Wed, Jun 13, 2018 at > > > 3:11 > > > >> PM, > > > >> > > Ted > > > >> > > > > Yu > > > >> > > > > >> < > > > >> > > > > >> > > > > > > > > > yuzhih...@gmail.com > > > >> > > > > >> > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > Lucas: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > Under Rejected > > > >> Alternatives, > > > >> > #2, > > > >> > > > can > > > >> > > > > >> you > > > >> > > > > >> > > > > > elaborate > > > >> > > > > >> > > > > > > a > > > >> > > > > >> > > > > > > > > bit > > > >> > > > > >> > > > > > > > > > > more > > > >> > > > > >> > > > > > > > > > > > > on > > > >> > > > > >> > > > > > > > > > > > > > > why > > > >> > > > > >> > > > > > > > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > > > > > > > > separate config has > > > bigger > > > >> > > impact > > > >> > > > ? 
> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > Thanks > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > On Wed, Jun 13, 2018 > at > > > >> 2:00 > > > >> > PM, > > > >> > > > > Dong > > > >> > > > > >> > Lin < > > > >> > > > > >> > > > > > > > > > > > lindon...@gmail.com > > > >> > > > > >> > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > Hey Luca, > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > Thanks for the KIP. > > > Looks > > > >> > good > > > >> > > > > >> overall. > > > >> > > > > >> > > > Some > > > >> > > > > >> > > > > > > > > comments > > > >> > > > > >> > > > > > > > > > > > below: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > - We usually > specify > > > the > > > >> > full > > > >> > > > > mbean > > > >> > > > > >> for > > > >> > > > > >> > > the > > > >> > > > > >> > > > > new > > > >> > > > > >> > > > > > > > > metrics > > > >> > > > > >> > > > > > > > > > > in > > > >> > > > > >> > > > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > > > > KIP. 
> > > >> > > > > >> > > > > > > > > > > > > > > > > Can > > > >> > > > > >> > > > > > > > > > > > > > > > > > > you > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > specify it in the > > > Public > > > >> > > > Interface > > > >> > > > > >> > > section > > > >> > > > > >> > > > > > > similar > > > >> > > > > >> > > > > > > > > to > > > >> > > > > >> > > > > > > > > > > > KIP-237 > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > < > > > >> https://cwiki.apache.org/ > > > >> > > > > >> > > > > > > > > > confluence/display/KAFKA/KIP- > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > 237%3A+More+Controller+Health+ > > > >> > > > > >> Metrics> > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > ? > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > - Maybe we could > > follow > > > >> the > > > >> > > same > > > >> > > > > >> > pattern > > > >> > > > > >> > > as > > > >> > > > > >> > > > > > > KIP-153 > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > < > > > >> https://cwiki.apache.org/ > > > >> > > > > >> > > > > > > > > > confluence/display/KAFKA/KIP- > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > >> > > 153%3A+Include+only+client+traffic+in+BytesOutPerSec+ > > > >> > > > > >> > > > > > > > > > > > > metric>, > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > where we keep the > > > >> existing > > > >> > > > sensor > > > >> > > > > >> name > > > >> > > > > >> > > > > > > > > "BytesInPerSec" > > > >> > > > > >> > > > > > > > > > > and > > > >> > > > > >> > > > > > > > > > > > > add > > > >> > > > > >> > > > > > > > > > > > > > a > > > >> > > > > >> > > > > > > > > > > > > > > > new > > > >> > > > > >> > > > > > > > > > > > > > > > > > > sensor > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> 
"ReplicationBytesInPerSec", > > > >> > > > rather > > > >> > > > > >> than > > > >> > > > > >> > > > > > replacing > > > >> > > > > >> > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > sensor > > > >> > > > > >> > > > > > > > > > > > > > > name " > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > BytesInPerSec" with > > > e.g. > > > >> > > > > >> > > > > "ClientBytesInPerSec". > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > - It seems that the > > KIP > > > >> > > changes > > > >> > > > > the > > > >> > > > > >> > > > semantics > > > >> > > > > >> > > > > > of > > > >> > > > > >> > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > >> > > > > >> > > > > > > > > > > broker > > > >> > > > > >> > > > > > > > > > > > > > > config > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > "queued.max.requests" > > > >> > because > > > >> > > > the > > > >> > > > > >> > number > > > >> > > > > >> > > of > > > >> > > > > >> > > > > > total > > > >> > > > > >> > > > > > > > > > > requests > > > >> > > > > >> > > > > > > > > > > > > > queued > > > >> > > > > >> > > > > > > > > > > > > > > > in > > > >> > > > > >> > > > > > > > > > > > > > > > > > the > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > broker will be no > > > longer > > > >> > > bounded > > > >> > > > > by > > > >> > > > > >> > > > > > > > > > > "queued.max.requests". > > > >> > > > > >> > > > > > > > > > > > > This > > > >> > > > > >> > > > > > > > > > > > > > > > > > probably > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > needs to be > specified > > > in > > > >> the > > > >> > > > > Public > > > >> > > > > >> > > > > Interfaces > > > >> > > > > >> > > > > > > > > section > > > >> > > > > >> > > > > > > > > > > for > > > >> > > > > >> > > > > > > > > > > > > > > > > discussion. 
> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > Thanks, > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > Dong > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > On Wed, Jun 13, > 2018 > > at > > > >> > 12:45 > > > >> > > > PM, > > > >> > > > > >> Lucas > > > >> > > > > >> > > > Wang > > > >> > > > > >> > > > > < > > > >> > > > > >> > > > > > > > > > > > > > > > lucasatu...@gmail.com > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > wrote: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > Hi Kafka experts, > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > I created KIP-291 > > to > > > >> add a > > > >> > > > > >> separate > > > >> > > > > >> > > queue > > > >> > > > > >> > > > > for > > > >> > > > > >> > > > > > > > > > > controller > > > >> > > > > >> > > > > > > > > > > > > > > > requests: > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> https://cwiki.apache.org/ > > > >> > > > > >> > > > > > > > > > confluence/display/KAFKA/KIP- > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > 291% > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > 3A+Have+separate+queues+for+ > > > >> > > > > >> > > > > > > > > > control+requests+and+data+ > > > >> > > > > >> > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > requests > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > Can you please > > take a > > > >> look > > > >> > > and > > > >> > > > > >> let me > > > >> > > > > >> > > > know > > > >> > > > > >> > > > > > your > > > >> > 
> > > >> > > > > > > > > > > feedback? > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > -- > > -Regards, > > Mayuresh R. Gharat > > (862) 250-7125 > > >