Thanks for the KIP, Rajini,

If I understand correctly, the proposal is essentially trying to put a quota
on CPU usage (which is probably why a time slice is used instead of request
rate), while the existing quota covers network bandwidth.

Given we are trying to throttle both CPU and Network, that implies the
following patterns for the clients:
1. High CPU usage, high network usage.
2. High CPU usage, low network usage.
3. Low CPU usage, high network usage.
4. Low CPU usage, low network usage.

In theory, the existing quota addresses cases 3 and 4, and this KIP seems to
be trying to address cases 1 and 2. However, it might be helpful to
understand what we want to achieve with CPU and network quotas.

People mainly use quota for two different purposes:
a) protecting the broker from misbehaving clients, and
b) resource distribution for multi-tenancy.

I agree that, generally speaking, CPU time is a suitable metric for a CPU
usage quota and would work for a). However, as Dong and Onur noticed, it is
not easy to quantify the impact of throttled CPU time on end users at the
application level. If the purpose of the CPU quota is only protection,
maybe we don't need a user-facing CPU quota.

That said, a user-facing CPU quota could be useful for virtualization,
which may be related to multi-tenancy but is a little different. Imagine
there are 10 services sharing the same physical Kafka cluster. With a CPU
time quota and a network bandwidth quota, each service can provision a
logical Kafka cluster with some reserved CPU time and network bandwidth.
In this case the quota would be per logical cluster. I am not sure whether
this is what the KIP intends for the future, though. It would be good if
the KIP could be clearer about exactly which scenarios the CPU quota is
trying to address.
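
To make the logical-cluster idea concrete, here is a rough sketch of how
reservations for the 10 services could be checked against the capacity of the
physical cluster. The class name, field names, capacity defaults and the
per-service numbers are all hypothetical; none of them come from the KIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalCluster:
    """Hypothetical reservation for one service (not a Kafka API)."""
    name: str
    cpu_time_percent: float      # reserved share of request handling time
    bandwidth_mb_per_sec: float  # reserved network bandwidth


def fits(clusters, cpu_capacity_percent=100.0,
         bandwidth_capacity_mb_per_sec=1000.0):
    """Return True if the combined reservations fit within both capacities."""
    total_cpu = sum(c.cpu_time_percent for c in clusters)
    total_bw = sum(c.bandwidth_mb_per_sec for c in clusters)
    return (total_cpu <= cpu_capacity_percent
            and total_bw <= bandwidth_capacity_mb_per_sec)


# Ten services, each reserving an assumed 8% CPU time and 50 MB/s.
services = [LogicalCluster(f"service-{i}", 8.0, 50.0) for i in range(10)]
```

With these made-up numbers the ten reservations fit, and an eleventh service
asking for 30% CPU time would not, which is the kind of provisioning decision
a per-logical-cluster quota would need to support.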

As for the request rate quota, while it seems easy to enforce and intuitive,
there are some caveats.
1. Users do not have direct control over the request rate, i.e. users do
not know when a request will be sent by the clients.
2. Each request may require a different amount of CPU resources to handle.
That may depend on many things, e.g. whether acks = 1 or acks = -1,
whether a request is addressing 1000 partitions or 1 partition, whether a
fetch request requires message format down-conversion or not, etc.
So the result of using a request rate quota could be quite unpredictable.
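
As a rough illustration of caveat 2, the sketch below compares two clients
that send the same number of requests per second but whose requests cost
different amounts of handler time. The per-request costs and the handler
thread count are invented for illustration and are not measurements from a
real broker.

```python
def cpu_time_fraction(requests_per_sec, cost_ms_per_request, num_threads=8):
    """Fraction of total request handler thread time used in one second.

    All inputs are hypothetical: cost_ms_per_request is an assumed average
    handling time, and num_threads is an assumed handler pool size.
    """
    busy_ms = requests_per_sec * cost_ms_per_request
    return busy_ms / (num_threads * 1000.0)


# Same request rate, very different assumed per-request cost.
cheap = cpu_time_fraction(100, 0.5)       # e.g. acks = 1, 1 partition
expensive = cpu_time_fraction(100, 20.0)  # e.g. 1000 partitions, down-conversion
ratio = expensive / cheap
```

Under these assumed costs, both clients would look identical to a request
rate quota while one of them uses 40 times as much handler time as the other.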

Thanks,

Jiangjie (Becket) Qin

On Sat, Feb 18, 2017 at 9:35 PM, Dong Lin <lindon...@gmail.com> wrote:

> I realized the main concern with this proposal is how user can interpret
> this CPU-percentage based quota. Since this quota is exposed to user, we
> need to explain to user how this quota is going to impact their application
> performance and convince them that the quota is not too low for their
> application. We are able to do this with byte-rate based quota. But I am
> not sure how we can do this with CPU-percentage based quota. For example,
> how is user going to understand whether 1% CPU is OK?
>
> On Fri, Feb 17, 2017 at 10:11 AM, Onur Karaman <
> onurkaraman.apa...@gmail.com
> > wrote:
>
> > Overall a big fan of the KIP.
> >
> > I'd have to agree with Dong. I'm not sure about the decision of using the
> > percentage over the window as opposed to request rate. It's pretty hard
> to
> > reason about. I just spoke to one of our SRE's and he agrees.
> >
> > Also I may have missed it, but I couldn't find information in the KIP on
> > where this window would be configured.
> >
> > On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > To correct the typo above: It seems to me that determination of request
> > > rate is not any more difficult than determination of *byte* rate as
> both
> > > metrics are commonly used to measure performance and provide guarantee
> to
> > > user.
> > >
> > > On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > Hey Rajini,
> > > >
> > > > Thanks for the KIP. I have some questions:
> > > >
> > > > - I am wondering why throttling based on request rate is listed as a
> > > > rejected alternative. Can you provide more specific reason why it is
> > > > difficult for administrators to decide request rates to allocate? It
> > > seems
> > > > to me that determination of request rate is not any more difficult
> than
> > > > determination of request rate as both metrics are commonly used to
> > > measure
> > > > performance and provide guarantee to user. On the other hand, the
> > > > percentage of processing time provides a vague guarantee to user. For
> > > > example, what performance can user expect if you provide 1%
> processing
> > > time
> > > > quota to this user? How is administrator going to decide this quota?
> > > Should
> > > > > Kafka administrator continue to reduce this percentage quota as
> number
> > > of
> > > > users grow?
> > > >
> > > > - The KIP suggests that LeaderAndIsrRequest and MetadataRequest will
> > also
> > > > be throttled by this quota. What is the motivation for throttling
> these
> > > > requests? It is also inconsistent with rate-based quota which is only
> > > > applied to ProduceRequest and FetchRequest. IMO it will be simpler to
> > > only
> > > > throttle ProduceRequest and FetchRequest.
> > > >
> > > > - Do you think we should also throttle the inter-broker traffic using
> > > this
> > > > quota as well similar to KIP-73?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > >
> > > >
> > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I have just created KIP-124 to introduce request rate quotas to
> Kafka:
> > > >>
> > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > >>
> > > >> The proposal is for a simple percentage request handling time quota
> > that
> > > >> can be allocated to *<client-id>*, *<user>* or *<user, client-id>*.
> > > There
> > > >> are a few other suggestions also under "Rejected alternatives".
> > Feedback
> > > >> and suggestions are welcome.
> > > >>
> > > >> Thank you...
> > > >>
> > > >> Regards,
> > > >>
> > > >> Rajini
> > > >>
> > > >
> > > >
> > >
> >
>
