I have updated the KIP to use request rates instead of request processing time.

I have removed all requests that require ClusterAction permission
(LeaderAndIsr and UpdateMetadata, in addition to stop/shutdown), but I have
left the Metadata request in. Quota windows, which limit the maximum delay,
tend to be small (1 second by default) compared to the request timeout or
max.block.ms, and even the existing byte-rate quotas can impact the time
taken to fetch metadata if the metadata request is queued behind a produce
request (for instance). So I don't think clients will need any additional
exception handling code for request rate quotas beyond what they already
need for byte-rate quotas. Clients can flood the broker with metadata
requests (e.g. a producer with retry.backoff.ms=0 sending a message to a
non-existent topic), so it makes sense to throttle metadata requests.
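
As a rough illustration of that flooding scenario (the broker address,
class name and topic below are placeholders, not from the KIP), a producer
configured like this can flood the broker with MetadataRequests for as long
as send() blocks waiting for the topic to appear:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class MetadataFloodExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                      StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                      StringSerializer.class.getName());
            // No pause between metadata retries: metadata is re-requested as
            // fast as the broker responds while the topic remains unknown.
            props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "0");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // send() blocks for up to max.block.ms waiting for metadata
                // for this topic, issuing MetadataRequests the whole time.
                producer.send(new ProducerRecord<>("topic-that-does-not-exist",
                                                   "key", "value"));
            }
        }
    }

A request rate quota lets the broker simply delay its responses to such a
client, so no new client-side error handling should be needed.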


Thanks,

Rajini

On Mon, Feb 20, 2017 at 11:55 AM, Dong Lin <lindon...@gmail.com> wrote:

> Hey Rajini,
>
> Thanks for the explanation. I have some follow-up questions regarding the
> types of requests that will be covered by this quota. Since this KIP
> focuses only on throttling the traffic between client and broker, and a
> client never sends LeaderAndIsrRequest to the broker, should we exclude
> LeaderAndIsrRequest from this KIP?
>
> Besides, I am still not sure we should throttle MetadataRequest. The
> benefit of throttling MetadataRequest seems small since it doesn't
> increase with user traffic. A client only sends MetadataRequest when there
> is a partition leadership change or when the client's metadata has
> expired. On the other hand, if we throttle MetadataRequest, there is a
> chance that metadata doesn't get updated in time and the user may receive
> an exception. This seems like a big interface change because users will
> have to change application code to handle such exceptions. Note that the
> current rate-based quota will reduce traffic without throwing any
> exception to the user.
>
> Anyway, I am looking forward to the updated KIP:)
>
> Thanks,
> Dong
>
> On Mon, Feb 20, 2017 at 2:43 AM, Rajini Sivaram <rajinisiva...@gmail.com>
> wrote:
>
> > Dong, Onur & Becket,
> >
> > Thank you all for the very useful feedback.
> >
> > The choice of request handling time as opposed to request rate was based
> > on the observation in KAFKA-4195
> > <https://issues.apache.org/jira/browse/KAFKA-4195> that request rates may
> > be less intuitive to configure than percentage utilization. But since the
> > KIP is measuring time rather than request pool utilization as suggested
> > in the JIRA, I agree that request rate would probably work better than
> > percentage. So I am inclined to change the KIP to throttle on request
> > rates (e.g. 100 requests per second) rather than percentage. Average
> > request rates are exposed as metrics, so admins can configure quotas
> > based on those. And the values are more meaningful from the client
> > application's point of view. I am still interested in feedback regarding
> > the second rejected alternative that throttles based on percentage
> > utilization of the resource handler pool. That was the suggestion from
> > Jun/Ismael in KAFKA-4195, but I couldn't see how that would help in the
> > case where a small number of connections push a continuous stream of
> > short requests. Suggestions welcome.
> >
> > Responses to other questions above:
> >
> > - (Dong): The KIP proposes to throttle most requests (and not just
> > Produce/Fetch) since the goal is to control usage of broker resources. So
> > LeaderAndIsrRequest and MetadataRequest will also be throttled. The few
> > requests not being throttled are timing-sensitive.
> >
> > - (Dong): The KIP does not propose to throttle inter-broker traffic based
> > on request rates. The most frequent requests in inter-broker traffic are
> > fetch requests, and a well-configured broker would use reasonably good
> > values of min.bytes and max.wait that avoid overloading the broker
> > unnecessarily with fetch requests. The existing byte-rate based quotas
> > should be sufficient in this case.
> >
> > - (Onur): Quota window configuration - this is the existing configuration
> > quota.window.size.seconds (also used for byte-rate quotas)
> >
> > - (Becket): The main issue that the KIP is addressing is clients flooding
> > the broker with small requests (e.g. fetch with max.wait.ms=0), which can
> > overload the broker and delay requests from other clients/users even
> > though the byte rate is quite small. A CPU quota reflects the resource
> > usage on the broker that the KIP is attempting to limit. Since this is
> > the time on the local broker, it shouldn't vary much depending on
> > acks=-1 etc., but I do agree on the unpredictability of time-based
> > quotas. Switching from request processing time to request rates will
> > address this. Would you still be concerned that "Users do not have direct
> > control over the request rate, i.e. users do not know when a request will
> > be sent by the clients"?
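
As a sketch of that fetch-flooding case (broker address, group id and topic
are placeholders), a consumer configured with fetch.max.wait.ms=0 gets
empty fetch responses back immediately instead of letting the broker wait
for data, so an idle poll loop becomes a continuous stream of small fetch
requests:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class FetchFloodExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "fetch-flood-example");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                      StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                      StringDeserializer.class.getName());
            // The broker answers each fetch immediately rather than waiting
            // up to fetch.max.wait.ms for fetch.min.bytes of data to arrive.
            props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "0");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("some-topic"));
                while (true) {
                    // On an idle topic every poll sends a fetch that returns
                    // empty straight away, so this loop floods the broker.
                    consumer.poll(100);
                }
            }
        }
    }

If the request rate quota throttles by delaying responses, as the existing
byte-rate quotas do, the broker can slow this loop down without the client
needing any new exception handling.
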
> >
> > Jun/Ismael,
> >
> > I am interested in your views on request rate based quotas and whether we
> > should still consider utilization of the resource handler pool.
> >
> >
> > Many thanks,
> >
> > Rajini
> >
> >
> > On Sun, Feb 19, 2017 at 11:54 PM, Becket Qin <becket....@gmail.com> wrote:
> >
> > > Thanks for the KIP, Rajini,
> > >
> > > If I understand correctly, the proposal is essentially trying to quota
> > > CPU usage (that is probably why a time slice is used instead of request
> > > rate), while the existing quota we have is for network bandwidth.
> > >
> > > Given we are trying to throttle both CPU and network, that implies the
> > > following patterns for the clients:
> > > 1. High CPU usage, high network usage.
> > > 2. High CPU usage, low network usage.
> > > 3. Low CPU usage, high network usage.
> > > 4. Low CPU usage, low network usage.
> > >
> > > Theoretically the existing quota addresses cases 3 & 4, and this KIP
> > > seems to be trying to address cases 1 & 2. However, it might be helpful
> > > to understand what we want to achieve with CPU and network quotas.
> > >
> > > People mainly use quota for two different purposes:
> > > a) protecting the broker from misbehaving clients, and
> > > b) resource distribution for multi-tenancy.
> > >
> > > I agree that, generally speaking, CPU time is a suitable metric to
> > > quota on for CPU usage and would work for a). However, as Dong and Onur
> > > noticed, it is not easy to quantify the impact on end users at the
> > > application level with throttled CPU time. If the purpose of the CPU
> > > quota is only for protection, maybe we don't need a user-facing CPU
> > > quota.
> > >
> > > That said, a user-facing CPU quota could be useful for virtualization,
> > > which may be related to multi-tenancy but is a little different. Imagine
> > > there are 10 services sharing the same physical Kafka cluster. With a
> > > CPU time quota and a network bandwidth quota, each service can provision
> > > a logical Kafka cluster with some reserved CPU time and network
> > > bandwidth. And in this case the quota will be per logical cluster. Not
> > > sure if this is what the KIP intends for the future, though. It would be
> > > good if the KIP could be clearer on what exact scenarios the CPU quota
> > > is trying to address.
> > >
> > > As for the request rate quota, while it seems easy to enforce and
> > > intuitive, there are some caveats.
> > > 1. Users do not have direct control over the request rate, i.e. users do
> > > not know when a request will be sent by the clients.
> > > 2. Each request may require a different amount of CPU resources to
> > > handle. That may depend on many things, e.g. whether acks = 1 or
> > > acks = -1, whether a request is addressing 1000 partitions or 1
> > > partition, whether a fetch request requires message format
> > > down-conversion or not, etc.
> > > So the result of using a request rate quota could be quite unpredictable.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Sat, Feb 18, 2017 at 9:35 PM, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > I realized the main concern with this proposal is how users can
> > > > interpret this CPU-percentage based quota. Since this quota is exposed
> > > > to users, we need to explain to them how this quota is going to impact
> > > > their application performance and convince them that the quota is not
> > > > too low for their application. We are able to do this with the
> > > > byte-rate based quota. But I am not sure how we can do this with a
> > > > CPU-percentage based quota. For example, how is a user going to
> > > > understand whether 1% CPU is OK?
> > > >
> > > > On Fri, Feb 17, 2017 at 10:11 AM, Onur Karaman <onurkaraman.apa...@gmail.com> wrote:
> > > >
> > > > > Overall a big fan of the KIP.
> > > > >
> > > > > I'd have to agree with Dong. I'm not sure about the decision of
> > > > > using the percentage over the window as opposed to request rate.
> > > > > It's pretty hard to reason about. I just spoke to one of our SREs
> > > > > and he agrees.
> > > > >
> > > > > Also, I may have missed it, but I couldn't find information in the
> > > > > KIP on where this window would be configured.
> > > > >
> > > > > On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin <lindon...@gmail.com> wrote:
> > > > >
> > > > > > To correct the typo above: it seems to me that determination of
> > > > > > request rate is not any more difficult than determination of
> > > > > > *byte* rate, as both metrics are commonly used to measure
> > > > > > performance and provide guarantees to users.
> > > > > >
> > > > > > On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin <lindon...@gmail.com> wrote:
> > > > > >
> > > > > > > Hey Rajini,
> > > > > > >
> > > > > > > Thanks for the KIP. I have some questions:
> > > > > > >
> > > > > > > - I am wondering why throttling based on request rate is listed
> > > > > > > as a rejected alternative. Can you provide a more specific reason
> > > > > > > why it is difficult for administrators to decide request rates to
> > > > > > > allocate? It seems to me that determination of request rate is
> > > > > > > not any more difficult than determination of request rate as both
> > > > > > > metrics are commonly used to measure performance and provide
> > > > > > > guarantees to users. On the other hand, the percentage of
> > > > > > > processing time provides a vague guarantee to the user. For
> > > > > > > example, what performance can a user expect if you provide a 1%
> > > > > > > processing time quota to this user? How is the administrator
> > > > > > > going to decide this quota? Should the Kafka administrator
> > > > > > > continue to reduce this percentage quota as the number of users
> > > > > > > grows?
> > > > > > >
> > > > > > > - The KIP suggests that LeaderAndIsrRequest and MetadataRequest
> > > > > > > will also be throttled by this quota. What is the motivation for
> > > > > > > throttling these requests? It is also inconsistent with the
> > > > > > > existing rate-based quota, which is only applied to
> > > > > > > ProduceRequest and FetchRequest. IMO it will be simpler to only
> > > > > > > throttle ProduceRequest and FetchRequest.
> > > > > > >
> > > > > > > - Do you think we should also throttle the inter-broker traffic
> > > > > > > using this quota as well, similar to KIP-73?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Dong
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <rajinisiva...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Hi all,
> > > > > > >>
> > > > > > >> I have just created KIP-124 to introduce request rate quotas
> > > > > > >> to Kafka:
> > > > > > >>
> > > > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > > > > >>
> > > > > > >> The proposal is for a simple percentage request handling time
> > > > > > >> quota that can be allocated to <client-id>, <user> or
> > > > > > >> <user, client-id>. There are a few other suggestions also under
> > > > > > >> "Rejected alternatives". Feedback and suggestions are welcome.
> > > > > > >>
> > > > > > >> Thank you...
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >>
> > > > > > >> Rajini
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
