Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
clients from using these requests and that unauthorized requests are
counted towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas could return a separate
throttle time, and all utilization-based quotas could use the same field
(we won't add another one for network thread utilization, for instance).
But perhaps it makes sense to keep the byte-rate throttle time separate in
produce/fetch responses to provide separate metrics? I agree with Ismael
that the name of the existing field should be changed if we have two.
Happy to switch to a single combined throttle time if that is sufficient.
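
To illustrate the trade-off with a small Java sketch (purely hypothetical,
since today's responses carry a single throttle_time_ms field):

final class ThrottleTimes {
    final long byteRateThrottleMs;     // delay from the produce/fetch byte-rate quota
    final long requestTimeThrottleMs;  // delay from the utilization-based quota

    ThrottleTimes(long byteRateThrottleMs, long requestTimeThrottleMs) {
        this.byteRateThrottleMs = byteRateThrottleMs;
        this.requestTimeThrottleMs = requestTimeThrottleMs;
    }

    // Two fields would let the client expose two throttle-time metrics;
    // a single combined field only ever exposes this total.
    long combinedThrottleMs() {
        return byteRateThrottleMs + requestTimeThrottleMs;
    }
}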

Ismael #4, #5, #6: Will update the KIP. Will use a dot-separated name for
the new property. Replication quotas use dot-separated names, so it will be
consistent with all properties except the byte-rate quotas.

Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests, as
mentioned in the discussion and the KIP.
#2 Two separate quotas for heartbeats/regular requests would mean more
configuration and more metrics. Since most users would set quotas higher
than their expected usage and quotas are more of a safety net, a single
quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active
connections, since only one request per connection is throttled at a time
(a sketch of the delay-queue mechanism follows this list).
#4 As with byte-rate quotas, to make full use of their allocated quota,
clients/users would need partitions that are distributed across the
cluster. The alternative of using cluster-wide quotas instead of per-broker
quotas would be far too complex to implement.
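
To make #3 concrete, here is a minimal Java sketch of the delay-queue idea
Jun describes further down in this thread (a parked response completed by a
separate expiration thread). It is not the actual purgatory code, and it
does not model the per-connection bound, which comes from a connection
having at most one throttled response outstanding at a time:

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// A throttled response parked until its delay expires.
final class ThrottledResponse implements Delayed {
    private final long dueNanos;
    private final Runnable sendResponse;  // callback that actually sends the response

    ThrottledResponse(long delayMs, Runnable sendResponse) {
        this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        this.sendResponse = sendResponse;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }

    void complete() {
        sendResponse.run();
    }
}

// A single expiration thread blocks on the queue and sends responses as
// their delays expire, so no request handler thread is tied up.
final class ThrottleReaper implements Runnable {
    private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

    void throttle(long delayMs, Runnable sendResponse) {
        queue.add(new ThrottledResponse(delayMs, sendResponse));
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take().complete();  // blocks until the next delay expires
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}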

Dong: We currently have two ClientQuotaManagers, for the quota types Fetch
and Produce. A new one will be added for IOThread, which manages quotas for
I/O thread utilization. It will not update the Fetch or Produce queue-size,
but will have its own queue-size metric. I wasn't planning to add any
additional metrics beyond the equivalents of those for the existing quotas
as part of this KIP. The ratio of byte-rate to I/O thread utilization could
be slightly misleading since it depends on the sequence of requests, but we
can look into more metrics after the KIP is implemented if required.
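
For illustration only (the names below are my guesses at the convention,
not final), the new quota type would simply sit alongside the existing
ones, each with its own sensors and its own queue-size metric:

enum QuotaType { PRODUCE, FETCH, LEADER_REPLICATION, FOLLOWER_REPLICATION, IO_THREAD }

final class QuotaMetricNames {
    // Each quota type has its own ClientQuotaManager, so the IO_THREAD
    // sensors and queue-size metric are independent of Fetch/Produce.
    static String throttleTimeSensor(QuotaType type, String clientId) {
        return type + "ThrottleTime-" + clientId;
    }

    static String delayQueueSizeMetric(QuotaType type) {
        return type + "-delayQueueSize";
    }
}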

I think we need to limit the maximum delay, since all requests are
throttled. If a client has a quota of 0.001 units and a single request used
50ms, we don't want to delay all requests from that client by 50 seconds,
throwing the client out of all its consumer groups. The issue arises only
if a user is allocated a quota that is insufficient to process even one
large request. The expectation is that the units allocated per user will be
much higher than the time taken to process one request, so the limit should
seldom be applied. Agree this needs proper documentation.
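
A rough worked version of that arithmetic, assuming the delay is roughly
proportional to usage/quota and the cap is the quota window size (this is a
sketch, not the exact broker formula):

final class ThrottleMath {
    // The uncapped delay grows with usedTime/quota, so 50ms of handler time
    // against a 0.001-unit quota would otherwise delay the client ~50 seconds.
    static long throttleDelayMs(double usedTimeMs, double quotaUnits, long maxDelayMs) {
        double uncappedMs = usedTimeMs / quotaUnits - usedTimeMs;
        return (long) Math.min(uncappedMs, maxDelayMs);  // cap at the window size
    }

    public static void main(String[] args) {
        // 50ms used, quota 0.001, capped at a 1-second window: prints 1000, not ~50000.
        System.out.println(throttleDelayMs(50.0, 0.001, 1000));
    }
}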

Regards,

Rajini


On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenbl...@gmail.com> wrote:

> @jun: i wasnt concerned about tying up a request processing thread, but
> IIUC the code does still read the entire request out, which might add-up to
> a non-negligible amount of memory.
>
> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hey Rajini,
> >
> > The current KIP says that the maximum delay will be reduced to window
> size
> > if it is larger than the window size. I have a concern with this:
> >
> > 1) This essentially means that the user is allowed to exceed their quota
> > over a long period of time. Can you provide an upper bound on this
> > deviation?
> >
> > 2) What is the motivation for capping the maximum delay by the window size?
> > I am wondering if there is a better alternative to address the problem.
> >
> > 3) It means that the existing metric-related config will have a more
> > direct impact on the mechanism of this io-thread-unit-based quota. This
> > may be an important change depending on the answer to 1) above. We
> probably
> > need to document this more explicitly.
> >
> > Dong
> >
> >
> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hey Jun,
> > >
> > > Yeah you are right. I thought it wasn't because at LinkedIn it will be
> > too
> > > much pressure on inGraph to expose those per-clientId metrics so we
> ended
> > > up printing them periodically to local log. Never mind if it is not a
> > > general problem.
> > >
> > > Hey Rajini,
> > >
> > > - I agree with Jay that we probably don't want to add a new field for
> > > every quota ProduceResponse or FetchResponse. Is there any use-case for
> > > having separate throttle-time fields for byte-rate-quota and
> > > io-thread-unit-quota? You probably need to document this as interface
> > > change if you plan to add new field in any request.
> > >
> > > - I don't think IOThread belongs to quotaType. The existing quota types
> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify
> the
> > > type of request that are throttled, not the quota mechanism that is
> > applied.
> > >
> > > - If a request is throttled due to this io-thread-unit-based quota, is
> > the
> > > existing queue-size metric in ClientQuotaManager incremented?
> > >
> > > - In the interest of providing guide line for admin to decide
> > > io-thread-unit-based quota and for user to understand its impact on
> their
> > > traffic, would it be useful to have a metric that shows the overall
> > > byte-rate per io-thread-unit? Can we also show this a per-clientId
> > metric?
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <j...@confluent.io> wrote:
> > >
> > >> Hi, Ismael,
> > >>
> > >> For #3, typically, an admin won't configure more io threads than CPU
> > >> cores,
> > >> but it's possible for an admin to start with fewer io threads than
> cores
> > >> and grow that later on.
> > >>
> > >> Hi, Dong,
> > >>
> > >> I think the throttleTime sensor on the broker tells the admin whether
> a
> > >> user/clientId is throttled or not.
> > >>
> > >> Hi, Radai,
> > >>
> > >> The reasoning for delaying the throttled requests on the broker
> instead
> > of
> > >> returning an error immediately is that the latter has no way to
> prevent
> > >> the
> > >> client from retrying immediately, which will make things worse. The
> > >> delaying logic is based off a delay queue. A separate expiration
> thread
> > >> just waits on the next to be expired request. So, it doesn't tie up a
> > >> request handler thread.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ism...@juma.me.uk>
> wrote:
> > >>
> > >> > Hi Jay,
> > >> >
> > >> > Regarding 1, I definitely like the simplicity of keeping a single
> > >> throttle
> > >> > time field in the response. The downside is that the client metrics
> > >> will be
> > >> > more coarse grained.
> > >> >
> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> > >> > `log.cleaner.min.cleanable.ratio`.
> > >> >
> > >> > Ismael
> > >> >
> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <j...@confluent.io>
> wrote:
> > >> >
> > >> > > A few minor comments:
> > >> > >
> > >> > >    1. Isn't it the case that the throttling time response field
> > should
> > >> > have
> > >> > >    the total time your request was throttled irrespective of the
> > >> quotas
> > >> > > that
> > >> > >    caused that. Limiting it to byte rate quota doesn't make sense,
> > >> but I
> > >> > > also
> > >> > >    I don't think we want to end up adding new fields in the
> response
> > >> for
> > >> > > every
> > >> > >    single thing we quota, right?
> > >> > >    2. I don't think we should make this quota specifically about
> io
> > >> > >    threads. Once we introduce these quotas people set them and
> > expect
> > >> > them
> > >> > > to
> > >> > >    be enforced (and if they aren't it may cause an outage). As a
> > >> result
> > >> > > they
> > >> > >    are a bit more sensitive than normal configs, I think. The
> > current
> > >> > > thread
> > >> > >    pools seem like something of an implementation detail and not
> the
> > >> > level
> > >> > > the
> > >> > >    user-facing quotas should be involved with. I think it might be
> > >> better
> > >> > > to
> > >> > >    make this a general request-time throttle with no mention in
> the
> > >> > naming
> > >> > >    about I/O threads and simply acknowledge the current limitation
> > >> (which
> > >> > > we
> > >> > >    may someday fix) in the docs that this covers only the time
> after
> > >> the
> > >> > >    thread is read off the network.
> > >> > >    3. As such I think the right interface to the user would be
> > >> something
> > >> > >    like percent_request_time and be in {0,...100} or
> > >> request_time_ratio
> > >> > > and be
> > >> > >    in {0.0,...,1.0} (I think "ratio" is the terminology we used if
> > the
> > >> > > scale
> > >> > >    is between 0 and 1 in the other metrics, right?)
> > >> > >
> > >> > > -Jay
> > >> > >
> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > >> rajinisiva...@gmail.com
> > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Guozhang/Dong,
> > >> > > >
> > >> > > > Thank you for the feedback.
> > >> > > >
> > >> > > > Guozhang : I have updated the section on co-existence of byte
> rate
> > >> and
> > >> > > > request time quotas.
> > >> > > >
> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
> since
> > >> they
> > >> > > are
> > >> > > > going to be very similar to the existing metrics and sensors. To
> > >> avoid
> > >> > > > confusion, I have now added more detail. All metrics are in the
> > >> group
> > >> > > > "quotaType" and all sensors have names starting with "quotaType"
> > >> (where
> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
> > >> > > > FollowerReplication/*IOThread*).
> > >> > > > So there will be no reuse of existing metrics/sensors. The new
> > ones
> > >> for
> > >> > > > request processing time based throttling will be completely
> > >> independent
> > >> > > of
> > >> > > > existing metrics/sensors, but will be consistent in format.
> > >> > > >
> > >> > > > The existing throttle_time_ms field in produce/fetch responses
> > will
> > >> not
> > >> > > be
> > >> > > > impacted by this KIP. That will continue to return byte-rate
> based
> > >> > > > throttling times. In addition, a new field
> > request_throttle_time_ms
> > >> > will
> > >> > > be
> > >> > > > added to return request quota based throttling times. These will
> > be
> > >> > > exposed
> > >> > > > as new metrics on the client-side.
> > >> > > >
> > >> > > > Since all metrics and sensors are different for each type of
> > quota,
> > >> I
> > >> > > > believe there is already sufficient metrics to monitor
> throttling
> > on
> > >> > both
> > >> > > > client and broker side for each type of throttling.
> > >> > > >
> > >> > > > Regards,
> > >> > > >
> > >> > > > Rajini
> > >> > > >
> > >> > > >
> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindon...@gmail.com>
> > >> wrote:
> > >> > > >
> > >> > > > > Hey Rajini,
> > >> > > > >
> > >> > > > > I think it makes a lot of sense to use io_thread_units as
> metric
> > >> to
> > >> > > quota
> > >> > > > > user's traffic here. LGTM overall. I have some questions
> > regarding
> > >> > > > sensors.
> > >> > > > >
> > >> > > > > - Can you be more specific in the KIP what sensors will be
> > added?
> > >> For
> > >> > > > > example, it will be useful to specify the name and attributes
> of
> > >> > these
> > >> > > > new
> > >> > > > > sensors.
> > >> > > > >
> > >> > > > > - We currently have throttle-time and queue-size for byte-rate
> > >> based
> > >> > > > quota.
> > >> > > > > Are you going to have separate throttle-time and queue-size
> for
> > >> > > requests
> > >> > > > > throttled by io_thread_unit-based quota, or will they share
> the
> > >> same
> > >> > > > > sensor?
> > >> > > > >
> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > FetchResponse
> > >> > > > contains
> > >> > > > > time due to io_thread_unit-based quota?
> > >> > > > >
> > >> > > > > - Currently the Kafka server doesn't provide any log or
> metrics
> > >> that
> > >> > > > tells
> > >> > > > > whether any given clientId (or user) is throttled. This is not
> > too
> > >> > bad
> > >> > > > > because we can still check the client-side byte-rate metric to
> > >> > validate
> > >> > > > > whether a given client is throttled. But with this
> > io_thread_unit,
> > >> > > there
> > >> > > > > will be no way to validate whether a given client is slow
> > because
> > >> it
> > >> > > has
> > >> > > > > exceeded its io_thread_unit limit. It is necessary for user to
> > be
> > >> > able
> > >> > > to
> > >> > > > > know this information to figure out whether they have reached
> > >> their
> > >> > > quota
> > >> > > > > limit. How about we add log4j log on the server side to
> > >> periodically
> > >> > > > print
> > >> > > > > the (client_id, byte-rate-throttle-time,
> > >> > io-thread-unit-throttle-time)
> > >> > > so
> > >> > > > > that kafka administrator can figure those users that have
> > reached
> > >> > their
> > >> > > > > limit and act accordingly?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Dong
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> > >> wangg...@gmail.com>
> > >> > > > wrote:
> > >> > > > >
> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> comment
> > on
> > >> > the
> > >> > > > > > throttling implementation:
> > >> > > > > >
> > >> > > > > > Stated as "Request processing time throttling will be
> applied
> > on
> > >> > top
> > >> > > if
> > >> > > > > > necessary." I thought that it meant the request processing
> > time
> > >> > > > > throttling
> > >> > > > > > is applied first, but continue reading I found it actually
> > >> meant to
> > >> > > > apply
> > >> > > > > > produce / fetch byte rate throttling first.
> > >> > > > > >
> > >> > > > > > Also the last sentence "The remaining delay if any is
> applied
> > to
> > >> > the
> > >> > > > > > response." is a bit confusing to me. Maybe rewording it a
> bit?
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > Guozhang
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <j...@confluent.io>
> > >> wrote:
> > >> > > > > >
> > >> > > > > > > Hi, Rajini,
> > >> > > > > > >
> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks good
> > to
> > >> me.
> > >> > > > > > >
> > >> > > > > > > Jun
> > >> > > > > > >
> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > >> > > > > rajinisiva...@gmail.com
> > >> > > > > > >
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Jun/Roger,
> > >> > > > > > > >
> > >> > > > > > > > Thank you for the feedback.
> > >> > > > > > > >
> > >> > > > > > > > 1. I have updated the KIP to use absolute units instead
> of
> > >> > > > > percentage.
> > >> > > > > > > The
> > >> > > > > > > > property is called* io_thread_units* to align with the
> > >> thread
> > >> > > count
> > >> > > > > > > > property *num.io.threads*. When we implement network
> > thread
> > >> > > > > utilization
> > >> > > > > > > > quotas, we can add another property
> > *network_thread_units.*
> > >> > > > > > > >
> > >> > > > > > > > 2. ControlledShutdown is already listed under the exempt
> > >> > > requests.
> > >> > > > > Jun,
> > >> > > > > > > did
> > >> > > > > > > > you mean a different request that needs to be added? The
> > >> four
> > >> > > > > requests
> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> > >> > ControlledShutdown,
> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled
> > using
> > >> > > > > > ClusterAction
> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
> > >> > unauthorized.
> > >> > > I
> > >> > > > > > wasn't
> > >> > > > > > > > sure if there are other requests used only for
> > inter-broker
> > >> > that
> > >> > > > > needed
> > >> > > > > > > to
> > >> > > > > > > > be excluded.
> > >> > > > > > > >
> > >> > > > > > > > 3. I was thinking the smallest change would be to
> replace
> > >> all
> > >> > > > > > references
> > >> > > > > > > to
> > >> > > > > > > > *requestChannel.sendResponse()* with a local method
> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the throttling
> if
> > >> any
> > >> > > plus
> > >> > > > > send
> > >> > > > > > > > response. If we throttle first in *KafkaApis.handle()*,
> > the
> > >> > time
> > >> > > > > spent
> > >> > > > > > > > within the method handling the request will not be
> > recorded
> > >> or
> > >> > > used
> > >> > > > > in
> > >> > > > > > > > throttling. We can look into this again when the PR is
> > ready
> > >> > for
> > >> > > > > > review.
> > >> > > > > > > >
> > >> > > > > > > > Regards,
> > >> > > > > > > >
> > >> > > > > > > > Rajini
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > >> > > > > roger.hoo...@gmail.com>
> > >> > > > > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
> > >> > > > > > > > >
> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
> application
> > is
> > >> > > > > allocated
> > >> > > > > > 1
> > >> > > > > > > > > request handler unit, then it's as if I have a Kafka
> > >> broker
> > >> > > with
> > >> > > > a
> > >> > > > > > > single
> > >> > > > > > > > > request handler thread dedicated to me.  That's the
> > most I
> > >> > can
> > >> > > > use,
> > >> > > > > > at
> > >> > > > > > > > > least.  That allocation doesn't change even if an
> admin
> > >> later
> > >> > > > > > increases
> > >> > > > > > > > the
> > >> > > > > > > > > size of the request thread pool on the broker.  It's
> > >> similar
> > >> > to
> > >> > > > the
> > >> > > > > > CPU
> > >> > > > > > > > > abstraction that VMs and containers get from
> hypervisors
> > >> or
> > >> > OS
> > >> > > > > > > > schedulers.
> > >> > > > > > > > > While different client access patterns can use wildly
> > >> > different
> > >> > > > > > amounts
> > >> > > > > > > > of
> > >> > > > > > > > > request thread resources per request, a given
> > application
> > >> > will
> > >> > > > > > > generally
> > >> > > > > > > > > have a stable access pattern and can figure out
> > >> empirically
> > >> > how
> > >> > > > > many
> > >> > > > > > > > > "request thread units" it needs to meet it's
> > >> > throughput/latency
> > >> > > > > > goals.
> > >> > > > > > > > >
> > >> > > > > > > > > Cheers,
> > >> > > > > > > > >
> > >> > > > > > > > > Roger
> > >> > > > > > > > >
> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
> > >> j...@confluent.io>
> > >> > > > wrote:
> > >> > > > > > > > >
> > >> > > > > > > > > > Hi, Rajini,
> > >> > > > > > > > > >
> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 1. A concern of request_time_percent is that it's
> not
> > an
> > >> > > > absolute
> > >> > > > > > > > value.
> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the admin
> > >> doubles
> > >> > > the
> > >> > > > > > > number
> > >> > > > > > > > of
> > >> > > > > > > > > > request handler threads, that user now actually has
> > >> twice
> > >> > the
> > >> > > > > > > absolute
> > >> > > > > > > > > > capacity. This may confuse people a bit. So, perhaps
> > >> > setting
> > >> > > > the
> > >> > > > > > > quota
> > >> > > > > > > > > > based on an absolute request thread unit is better.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
> > >> > request
> > >> > > > and
> > >> > > > > > > needs
> > >> > > > > > > > to
> > >> > > > > > > > > > be excluded from throttling.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
> simpler
> > >> to
> > >> > > apply
> > >> > > > > the
> > >> > > > > > > > > request
> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
> > Otherwise,
> > >> we
> > >> > > will
> > >> > > > > > need
> > >> > > > > > > to
> > >> > > > > > > > > add
> > >> > > > > > > > > > the throttling logic in each type of request.
> > >> > > > > > > > > >
> > >> > > > > > > > > > Thanks,
> > >> > > > > > > > > >
> > >> > > > > > > > > > Jun
> > >> > > > > > > > > >
> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > >> > > > > > > > rajinisiva...@gmail.com
> > >> > > > > > > > > >
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > >
> > >> > > > > > > > > > > Jun,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Thank you for the review.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > I have reverted to the original KIP that throttles
> > >> based
> > >> > on
> > >> > > > > > request
> > >> > > > > > > > > > handler
> > >> > > > > > > > > > > utilization. At the moment, it uses percentage,
> but
> > I
> > >> am
> > >> > > > happy
> > >> > > > > to
> > >> > > > > > > > > change
> > >> > > > > > > > > > to
> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if required.
> I
> > >> have
> > >> > > > added
> > >> > > > > > the
> > >> > > > > > > > > > examples
> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
> > "Future
> > >> > Work"
> > >> > > > > > section
> > >> > > > > > > > to
> > >> > > > > > > > > > > address network thread utilization. The
> > configuration
> > >> is
> > >> > > > named
> > >> > > > > > > > > > > "request_time_percent" with the expectation that
> it
> > >> can
> > >> > > also
> > >> > > > be
> > >> > > > > > > used
> > >> > > > > > > > as
> > >> > > > > > > > > > the
> > >> > > > > > > > > > > limit for network thread utilization when that is
> > >> > > > implemented,
> > >> > > > > so
> > >> > > > > > > > that
> > >> > > > > > > > > > > users have to set only one config for the two and
> > not
> > >> > have
> > >> > > to
> > >> > > > > > worry
> > >> > > > > > > > > about
> > >> > > > > > > > > > > the internal distribution of the work between the
> > two
> > >> > > thread
> > >> > > > > > pools
> > >> > > > > > > in
> > >> > > > > > > > > > > Kafka.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Regards,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Rajini
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > >> > > j...@confluent.io>
> > >> > > > > > > wrote:
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > > Hi, Rajini,
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Thanks for the proposal.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > The benefit of using the request processing time
> > >> over
> > >> > the
> > >> > > > > > request
> > >> > > > > > > > > rate
> > >> > > > > > > > > > is
> > >> > > > > > > > > > > > exactly what people have said. I will just
> expand
> > >> that
> > >> > a
> > >> > > > bit.
> > >> > > > > > > > > Consider
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > following case. The producer sends a produce
> > request
> > >> > > with a
> > >> > > > > > 10MB
> > >> > > > > > > > > > message
> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
> > >> decompression of
> > >> > > the
> > >> > > > > > > message
> > >> > > > > > > > > on
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > broker could take 10-15 seconds, during which
> > time,
> > >> a
> > >> > > > request
> > >> > > > > > > > handler
> > >> > > > > > > > > > > > thread is completely blocked. In this case,
> > neither
> > >> the
> > >> > > > > byte-in
> > >> > > > > > > > quota
> > >> > > > > > > > > > nor
> > >> > > > > > > > > > > > the request rate quota may be effective in
> > >> protecting
> > >> > the
> > >> > > > > > broker.
> > >> > > > > > > > > > > Consider
> > >> > > > > > > > > > > > another case. A consumer group starts with 10
> > >> instances
> > >> > > and
> > >> > > > > > later
> > >> > > > > > > > on
> > >> > > > > > > > > > > > switches to 20 instances. The request rate will
> > >> likely
> > >> > > > > double,
> > >> > > > > > > but
> > >> > > > > > > > > the
> > >> > > > > > > > > > > > actual load on the broker may not double since
> > >> each
> > >> > > fetch
> > >> > > > > > > request
> > >> > > > > > > > > > only
> > >> > > > > > > > > > > > contains half of the partitions. Request rate
> > quota
> > >> may
> > >> > > not
> > >> > > > > be
> > >> > > > > > > easy
> > >> > > > > > > > > to
> > >> > > > > > > > > > > > configure in this case.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > What we really want is to be able to prevent a
> > >> client
> > >> > > from
> > >> > > > > > using
> > >> > > > > > > > too
> > >> > > > > > > > > > much
> > >> > > > > > > > > > > > of the server side resources. In this particular
> > >> KIP,
> > >> > > this
> > >> > > > > > > resource
> > >> > > > > > > > > is
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > capacity of the request handler threads. I agree
> > >> that
> > >> > it
> > >> > > > may
> > >> > > > > > not
> > >> > > > > > > be
> > >> > > > > > > > > > > > intuitive for the users to determine how to set
> > the
> > >> > right
> > >> > > > > > limit.
> > >> > > > > > > > > > However,
> > >> > > > > > > > > > > > this is not completely new and has been done in
> > the
> > >> > > > container
> > >> > > > > > > world
> > >> > > > > > > > > > > > already. For example, Linux cgroup (
> > >> > > > > https://access.redhat.com/
> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
> > >> terprise_Linux/6/html/
> > >> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the
> > >> > concept
> > >> > > of
> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > >> > > > > > > > > > > > which specifies the total amount of time in
> > >> > microseconds
> > >> > > > for
> > >> > > > > > > which
> > >> > > > > > > > > all
> > >> > > > > > > > > > > > tasks in a cgroup can run during a one second
> > >> period.
> > >> > We
> > >> > > > can
> > >> > > > > > > > > > potentially
> > >> > > > > > > > > > > > model the request handler threads in a similar
> > way.
> > >> For
> > >> > > > > > example,
> > >> > > > > > > > each
> > >> > > > > > > > > > > > request handler thread can be 1 request handler
> > unit
> > >> > and
> > >> > > > the
> > >> > > > > > > admin
> > >> > > > > > > > > can
> > >> > > > > > > > > > > > configure a limit on how many units (say 0.01) a
> > >> client
> > >> > > can
> > >> > > > > > have.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Regarding not throttling the internal broker to
> > >> broker
> > >> > > > > > requests.
> > >> > > > > > > We
> > >> > > > > > > > > > could
> > >> > > > > > > > > > > > do that. Alternatively, we could just let the
> > admin
> > >> > > > > configure a
> > >> > > > > > > > high
> > >> > > > > > > > > > > limit
> > >> > > > > > > > > > > > for the kafka user (it may not be able to do
> that
> > >> > easily
> > >> > > > > based
> > >> > > > > > on
> > >> > > > > > > > > > > clientId
> > >> > > > > > > > > > > > though).
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> > >> utilization
> > >> > of
> > >> > > > the
> > >> > > > > > > > network
> > >> > > > > > > > > > > thread
> > >> > > > > > > > > > > > pool too. The difficult is mostly what Rajini
> > said:
> > >> (1)
> > >> > > The
> > >> > > > > > > > mechanism
> > >> > > > > > > > > > for
> > >> > > > > > > > > > > > throttling the requests is through Purgatory and
> > we
> > >> > will
> > >> > > > have
> > >> > > > > > to
> > >> > > > > > > > > think
> > >> > > > > > > > > > > > through how to integrate that into the network
> > >> layer.
> > >> > > (2)
> > >> > > > In
> > >> > > > > > the
> > >> > > > > > > > > > network
> > >> > > > > > > > > > > > layer, currently we know the user, but not the
> > >> clientId
> > >> > > of
> > >> > > > > the
> > >> > > > > > > > > request.
> > >> > > > > > > > > > > So,
> > >> > > > > > > > > > > > it's a bit tricky to throttle based on clientId
> > >> there.
> > >> > > > Plus,
> > >> > > > > > the
> > >> > > > > > > > > > byteOut
> > >> > > > > > > > > > > > quota can already protect the network thread
> > >> > utilization
> > >> > > > for
> > >> > > > > > > fetch
> > >> > > > > > > > > > > > requests. So, if we can't figure out this part
> > right
> > >> > now,
> > >> > > > > just
> > >> > > > > > > > > focusing
> > >> > > > > > > > > > > on
> > >> > > > > > > > > > > > the request handling threads for this KIP is
> > still a
> > >> > > useful
> > >> > > > > > > > feature.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Thanks,
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Jun
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> <
> > >> > > > > > > > > > rajinisiva...@gmail.com
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > wrote:
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Jay: I have removed exemption for consumer
> > >> heartbeat
> > >> > > etc.
> > >> > > > > > Agree
> > >> > > > > > > > > that
> > >> > > > > > > > > > > > > protecting the cluster is more important than
> > >> > > protecting
> > >> > > > > > > > individual
> > >> > > > > > > > > > > apps.
> > >> > > > > > > > > > > > > Have retained the exemption for
> > >> > > StopReplica/LeaderAndIsr
> > >> > > > > > etc,
> > >> > > > > > > > > these
> > >> > > > > > > > > > > are
> > >> > > > > > > > > > > > > throttled only if authorization fails (so
> can't
> > be
> > >> > used
> > >> > > > for
> > >> > > > > > DoS
> > >> > > > > > > > > > attacks
> > >> > > > > > > > > > > > in
> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
> > >> requests to
> > >> > > > > > complete
> > >> > > > > > > > > > without
> > >> > > > > > > > > > > > > delays).
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > I will wait another day to see if there is any
> > >> > > objection
> > >> > > > to
> > >> > > > > > > > quotas
> > >> > > > > > > > > > > based
> > >> > > > > > > > > > > > on
> > >> > > > > > > > > > > > > request processing time (as opposed to request
> > >> rate)
> > >> > > and
> > >> > > > if
> > >> > > > > > > there
> > >> > > > > > > > > are
> > >> > > > > > > > > > > no
> > >> > > > > > > > > > > > > objections, I will revert to the original
> > proposal
> > >> > with
> > >> > > > > some
> > >> > > > > > > > > changes.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > The original proposal was only including the
> > time
> > >> > used
> > >> > > by
> > >> > > > > the
> > >> > > > > > > > > request
> > >> > > > > > > > > > > > > handler threads (that made calculation easy).
> I
> > >> think
> > >> > > the
> > >> > > > > > > > > suggestion
> > >> > > > > > > > > > is
> > >> > > > > > > > > > > > to
> > >> > > > > > > > > > > > > include the time spent in the network threads
> as
> > >> well
> > >> > > > since
> > >> > > > > > > that
> > >> > > > > > > > > may
> > >> > > > > > > > > > be
> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is more
> > >> > complicated
> > >> > > > to
> > >> > > > > > > > > calculate
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > > total available CPU time and convert to a
> ratio
> > >> when
> > >> > > > there
> > >> > > > > > *m*
> > >> > > > > > > > I/O
> > >> > > > > > > > > > > > threads
> > >> > > > > > > > > > > > > and *n* network threads.
> > >> > ThreadMXBean#getThreadCPUTime(
> > >> > > )
> > >> > > > > may
> > >> > > > > > > > give
> > >> > > > > > > > > us
> > >> > > > > > > > > > > > what
> > >> > > > > > > > > > > > > we want, but it can be very expensive on some
> > >> > > platforms.
> > >> > > > As
> > >> > > > > > > > Becket
> > >> > > > > > > > > > and
> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have several
> > time
> > >> > > > > > measurements
> > >> > > > > > > > > > already
> > >> > > > > > > > > > > > for
> > >> > > > > > > > > > > > > generating metrics that we could use, though
> we
> > >> might
> > >> > > > want
> > >> > > > > to
> > >> > > > > > > > > switch
> > >> > > > > > > > > > to
> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
> since
> > >> some
> > >> > of
> > >> > > > the
> > >> > > > > > > > values
> > >> > > > > > > > > > for
> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather than
> add
> > >> up
> > >> > the
> > >> > > > > time
> > >> > > > > > > > spent
> > >> > > > > > > > > in
> > >> > > > > > > > > > > I/O
> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
> better
> > >> to
> > >> > > > convert
> > >> > > > > > the
> > >> > > > > > > > > time
> > >> > > > > > > > > > > > spent
> > >> > > > > > > > > > > > > on each thread into a separate ratio? UserA
> has
> > a
> > >> > > request
> > >> > > > > > quota
> > >> > > > > > > > of
> > >> > > > > > > > > > 5%.
> > >> > > > > > > > > > > > Can
> > >> > > > > > > > > > > > > we take that to mean that UserA can use 5% of
> > the
> > >> > time
> > >> > > on
> > >> > > > > > > network
> > >> > > > > > > > > > > threads
> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If either
> is
> > >> > > exceeded,
> > >> > > > > the
> > >> > > > > > > > > > response
> > >> > > > > > > > > > > is
> > >> > > > > > > > > > > > > throttled - it would mean maintaining two sets
> > of
> > >> > > metrics
> > >> > > > > for
> > >> > > > > > > the
> > >> > > > > > > > > two
> > >> > > > > > > > > > > > > durations, but would result in more meaningful
> > >> > ratios.
> > >> > > We
> > >> > > > > > could
> > >> > > > > > > > > > define
> > >> > > > > > > > > > > > two
> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request threads
> > and
> > >> 10%
> > >> > > of
> > >> > > > > > > network
> > >> > > > > > > > > > > > threads),
> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
> explain
> > >> to
> > >> > > > users.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
> > network
> > >> > > thread
> > >> > > > > > > > > utilization:
> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent in
> the
> > >> > network
> > >> > > > > > thread
> > >> > > > > > > > may
> > >> > > > > > > > > be
> > >> > > > > > > > > > > > > significant and I can see the need to include
> > >> this.
> > >> > Are
> > >> > > > > there
> > >> > > > > > > > other
> > >> > > > > > > > > > > > > requests where the network thread utilization
> is
> > >> > > > > significant?
> > >> > > > > > > In
> > >> > > > > > > > > the
> > >> > > > > > > > > > > case
> > >> > > > > > > > > > > > > of fetch, request handler thread utilization
> > would
> > >> > > > throttle
> > >> > > > > > > > clients
> > >> > > > > > > > > > > with
> > >> > > > > > > > > > > > > high request rate, low data volume and fetch
> > byte
> > >> > rate
> > >> > > > > quota
> > >> > > > > > > will
> > >> > > > > > > > > > > > throttle
> > >> > > > > > > > > > > > > clients with high data volume. Network thread
> > >> > > utilization
> > >> > > > > is
> > >> > > > > > > > > perhaps
> > >> > > > > > > > > > > > > proportional to the data volume. I am
> wondering
> > >> if we
> > >> > > > even
> > >> > > > > > need
> > >> > > > > > > > to
> > >> > > > > > > > > > > > throttle
> > >> > > > > > > > > > > > > based on network thread utilization or whether
> > the
> > >> > data
> > >> > > > > > volume
> > >> > > > > > > > > quota
> > >> > > > > > > > > > > > covers
> > >> > > > > > > > > > > > > this case.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > b) At the moment, we record and check for
> quota
> > >> > > violation
> > >> > > > > at
> > >> > > > > > > the
> > >> > > > > > > > > same
> > >> > > > > > > > > > > > time.
> > >> > > > > > > > > > > > > If a quota is violated, the response is
> delayed.
> > >> > Using
> > >> > > > > Jay's
> > >> > > > > > > > > example
> > >> > > > > > > > > > of
> > >> > > > > > > > > > > > > disk reads for fetches happening in the
> network
> > >> > thread,
> > >> > > > We
> > >> > > > > > > can't
> > >> > > > > > > > > > record
> > >> > > > > > > > > > > > and
> > >> > > > > > > > > > > > > delay a response after the disk reads. We
> could
> > >> > record
> > >> > > > the
> > >> > > > > > time
> > >> > > > > > > > > spent
> > >> > > > > > > > > > > on
> > >> > > > > > > > > > > > > the network thread when the response is
> complete
> > >> and
> > >> > > > > > introduce
> > >> > > > > > > a
> > >> > > > > > > > > > delay
> > >> > > > > > > > > > > > for
> > >> > > > > > > > > > > > > handling a subsequent request (separate out
> > >> recording
> > >> > > and
> > >> > > > > > quota
> > >> > > > > > > > > > > violation
> > >> > > > > > > > > > > > > handling in the case of network thread
> > overload).
> > >> > Does
> > >> > > > that
> > >> > > > > > > make
> > >> > > > > > > > > > sense?
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Regards,
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Rajini
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > >> > > > > > > > becket....@gmail.com>
> > >> > > > > > > > > > > > wrote:
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Hey Jay,
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time
> is a
> > >> > little
> > >> > > > > > > tricky. I
> > >> > > > > > > > > am
> > >> > > > > > > > > > > > > thinking
> > >> > > > > > > > > > > > > > that maybe we can use the existing request
> > >> > > statistics.
> > >> > > > > They
> > >> > > > > > > are
> > >> > > > > > > > > > > already
> > >> > > > > > > > > > > > > > very detailed so we can probably see the
> > >> > approximate
> > >> > > > CPU
> > >> > > > > > time
> > >> > > > > > > > > from
> > >> > > > > > > > > > > it,
> > >> > > > > > > > > > > > > e.g.
> > >> > > > > > > > > > > > > > something like (total_time -
> > >> > > > request/response_queue_time
> > >> > > > > -
> > >> > > > > > > > > > > > remote_time).
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is
> > >> throttled
> > >> > > it
> > >> > > > is
> > >> > > > > > > > likely
> > >> > > > > > > > > > that
> > >> > > > > > > > > > > > we
> > >> > > > > > > > > > > > > > need to see if anything has went wrong
> first,
> > >> and
> > >> > if
> > >> > > > the
> > >> > > > > > > users
> > >> > > > > > > > > are
> > >> > > > > > > > > > > well
> > >> > > > > > > > > > > > > > behaving and just need more resources, we
> will
> > >> have
> > >> > > to
> > >> > > > > bump
> > >> > > > > > > up
> > >> > > > > > > > > the
> > >> > > > > > > > > > > > quota
> > >> > > > > > > > > > > > > > for them. It is true that pre-allocating CPU
> > >> time
> > >> > > quota
> > >> > > > > > > > precisely
> > >> > > > > > > > > > for
> > >> > > > > > > > > > > > the
> > >> > > > > > > > > > > > > > users is difficult. So in practice it would
> > >> > probably
> > >> > > be
> > >> > > > > > more
> > >> > > > > > > > like
> > >> > > > > > > > > > > first
> > >> > > > > > > > > > > > > set
> > >> > > > > > > > > > > > > > a relative high protective CPU time quota
> for
> > >> > > everyone
> > >> > > > > and
> > >> > > > > > > > > increase
> > >> > > > > > > > > > > > that
> > >> > > > > > > > > > > > > > for some individual clients on demand.
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Thanks,
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang
> > Wang <
> > >> > > > > > > > > wangg...@gmail.com
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > > > wrote:
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see it
> > >> > happening.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or
> more
> > >> > > > > specifically
> > >> > > > > > > > > > > processing
> > >> > > > > > > > > > > > > time
> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> throttling
> > >> as
> > >> > > well.
> > >> > > > > > > Becket
> > >> > > > > > > > > has
> > >> > > > > > > > > > > very
> > >> > > > > > > > > > > > > > well
> > >> > > > > > > > > > > > > > > summed my rationales above, and one thing
> to
> > >> add
> > >> > > here
> > >> > > > > is
> > >> > > > > > > that
> > >> > > > > > > > > the
> > >> > > > > > > > > > > > > former
> > >> > > > > > > > > > > > > > > has a good support for both "protecting
> > >> against
> > >> > > rogue
> > >> > > > > > > > clients"
> > >> > > > > > > > > as
> > >> > > > > > > > > > > > well
> > >> > > > > > > > > > > > > as
> > >> > > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy
> > usage":
> > >> > when
> > >> > > > > > > thinking
> > >> > > > > > > > > > about
> > >> > > > > > > > > > > > how
> > >> > > > > > > > > > > > > to
> > >> > > > > > > > > > > > > > > explain this to the end users, I find it
> > >> actually
> > >> > > > more
> > >> > > > > > > > natural
> > >> > > > > > > > > > than
> > >> > > > > > > > > > > > the
> > >> > > > > > > > > > > > > > > request rate since as mentioned above,
> > >> different
> > >> > > > > requests
> > >> > > > > > > > will
> > >> > > > > > > > > > have
> > >> > > > > > > > > > > > > quite
> > >> > > > > > > > > > > > > > > different "cost", and Kafka today already
> > have
> > >> > > > various
> > >> > > > > > > > request
> > >> > > > > > > > > > > types
> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc),
> > >> because
> > >> > of
> > >> > > > that
> > >> > > > > > the
> > >> > > > > > > > > > request
> > >> > > > > > > > > > > > > rate
> > >> > > > > > > > > > > > > > > throttling may not be as effective unless
> it
> > >> is
> > >> > set
> > >> > > > > very
> > >> > > > > > > > > > > > > conservatively.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Regarding to user reactions when they are
> > >> > > throttled,
> > >> > > > I
> > >> > > > > > > think
> > >> > > > > > > > it
> > >> > > > > > > > > > may
> > >> > > > > > > > > > > > > > differ
> > >> > > > > > > > > > > > > > > case-by-case, and need to be discovered /
> > >> guided
> > >> > by
> > >> > > > > > looking
> > >> > > > > > > > at
> > >> > > > > > > > > > > > relative
> > >> > > > > > > > > > > > > > > metrics. So in other words users would not
> > >> expect
> > >> > > to
> > >> > > > > get
> > >> > > > > > > > > > additional
> > >> > > > > > > > > > > > > > > information by simply being told "hey, you
> > are
> > >> > > > > > throttled",
> > >> > > > > > > > > which
> > >> > > > > > > > > > is
> > >> > > > > > > > > > > > all
> > >> > > > > > > > > > > > > > > what throttling does; they need to take a
> > >> > follow-up
> > >> > > > > step
> > >> > > > > > > and
> > >> > > > > > > > > see
> > >> > > > > > > > > > > > "hmm,
> > >> > > > > > > > > > > > > > I'm
> > >> > > > > > > > > > > > > > > throttled probably because of ..", which
> is
> > by
> > >> > > > looking
> > >> > > > > at
> > >> > > > > > > > other
> > >> > > > > > > > > > > > metric
> > >> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding the
> > >> brokers
> > >> > > with
> > >> > > > > > > metadata
> > >> > > > > > > > > > > > request,
> > >> > > > > > > > > > > > > > > which are usually cheap to handle but I'm
> > >> sending
> > >> > > > > > thousands
> > >> > > > > > > > per
> > >> > > > > > > > > > > > second;
> > >> > > > > > > > > > > > > > or
> > >> > > > > > > > > > > > > > > is it because I'm catching up and hence
> > >> sending
> > >> > > very
> > >> > > > > > heavy
> > >> > > > > > > > > > fetching
> > >> > > > > > > > > > > > > > request
> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Regarding to the implementation, as once
> > >> > discussed
> > >> > > > with
> > >> > > > > > > Jun,
> > >> > > > > > > > > this
> > >> > > > > > > > > > > > seems
> > >> > > > > > > > > > > > > > not
> > >> > > > > > > > > > > > > > > very difficult since today we are already
> > >> > > collecting
> > >> > > > > the
> > >> > > > > > > > > "thread
> > >> > > > > > > > > > > pool
> > >> > > > > > > > > > > > > > > utilization" metrics, which is a single
> > >> > percentage
> > >> > > > > > > > > > > > "aggregateIdleMeter"
> > >> > > > > > > > > > > > > > > value; but we are already effectively
> > >> aggregating
> > >> > > it
> > >> > > > > for
> > >> > > > > > > each
> > >> > > > > > > > > > > > requests
> > >> > > > > > > > > > > > > in
> > >> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just
> extend
> > >> it by
> > >> > > > > > recording
> > >> > > > > > > > the
> > >> > > > > > > > > > > > source
> > >> > > > > > > > > > > > > > > client id when handling them and
> aggregating
> > >> by
> > >> > > > > clientId
> > >> > > > > > as
> > >> > > > > > > > > well
> > >> > > > > > > > > > as
> > >> > > > > > > > > > > > the
> > >> > > > > > > > > > > > > > > total aggregate.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Guozhang
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
> Kreps <
> > >> > > > > > > j...@confluent.io
> > >> > > > > > > > >
> > >> > > > > > > > > > > wrote:
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > When I thought about it more deeply I
> came
> > >> > around
> > >> > > > to
> > >> > > > > > the
> > >> > > > > > > > > > "percent
> > >> > > > > > > > > > > > of
> > >> > > > > > > > > > > > > > > > processing time" metric too. It seems a
> > lot
> > >> > > closer
> > >> > > > to
> > >> > > > > > the
> > >> > > > > > > > > thing
> > >> > > > > > > > > > > we
> > >> > > > > > > > > > > > > > > actually
> > >> > > > > > > > > > > > > > > > care about and need to protect. I also
> > think
> > >> > this
> > >> > > > > would
> > >> > > > > > > be
> > >> > > > > > > > a
> > >> > > > > > > > > > very
> > >> > > > > > > > > > > > > > useful
> > >> > > > > > > > > > > > > > > > metric even in the absence of throttling
> > >> just
> > >> > to
> > >> > > > > debug
> > >> > > > > > > > whose
> > >> > > > > > > > > > > using
> > >> > > > > > > > > > > > > > > > capacity.
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > Two problems to consider:
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it is
> > >> > > > understandable
> > >> > > > > > what
> > >> > > > > > > > > lead
> > >> > > > > > > > > > to
> > >> > > > > > > > > > > > > their
> > >> > > > > > > > > > > > > > > >    being throttled, but it is a bit hard
> > to
> > >> > > figure
> > >> > > > > out
> > >> > > > > > > the
> > >> > > > > > > > > safe
> > >> > > > > > > > > > > > range
> > >> > > > > > > > > > > > > > for
> > >> > > > > > > > > > > > > > > >    them. i.e. if I have a new app that
> > will
> > >> > send
> > >> > > > 200
> > >> > > > > > > > > > > messages/sec I
> > >> > > > > > > > > > > > > can
> > >> > > > > > > > > > > > > > > >    probably reason that I'll be under
> the
> > >> > > > throttling
> > >> > > > > > > limit
> > >> > > > > > > > of
> > >> > > > > > > > > > 300
> > >> > > > > > > > > > > > > > > req/sec.
> > >> > > > > > > > > > > > > > > >    However if I need to be under a 10%
> CPU
> > >> > > > resources
> > >> > > > > > > limit
> > >> > > > > > > > it
> > >> > > > > > > > > > may
> > >> > > > > > > > > > > > be
> > >> > > > > > > > > > > > > a
> > >> > > > > > > > > > > > > > > bit
> > >> > > > > > > > > > > > > > > >    harder for me to know a priori if i
> > will
> > >> or
> > >> > > > won't.
> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU time
> > is
> > >> a
> > >> > bit
> > >> > > > > > > difficult
> > >> > > > > > > > > > since
> > >> > > > > > > > > > > > > there
> > >> > > > > > > > > > > > > > > are
> > >> > > > > > > > > > > > > > > >    actually two thread pools--the I/O
> > >> threads
> > >> > and
> > >> > > > the
> > >> > > > > > > > network
> > >> > > > > > > > > > > > > threads.
> > >> > > > > > > > > > > > > > I
> > >> > > > > > > > > > > > > > > > think
> > >> > > > > > > > > > > > > > > >    it might be workable to count just
> the
> > >> I/O
> > >> > > > thread
> > >> > > > > > time
> > >> > > > > > > > as
> > >> > > > > > > > > in
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > > > > > proposal,
> > >> > > > > > > > > > > > > > > >    but the network thread work is
> actually
> > >> > > > > non-trivial
> > >> > > > > > > > (e.g.
> > >> > > > > > > > > > all
> > >> > > > > > > > > > > > the
> > >> > > > > > > > > > > > > > disk
> > >> > > > > > > > > > > > > > > >    reads for fetches happen in that
> > >> thread). If
> > >> > > you
> > >> > > > > > count
> > >> > > > > > > > > both
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > > > > network
> > >> > > > > > > > > > > > > > > > and
> > >> > > > > > > > > > > > > > > >    I/O threads it can skew things a bit.
> > >> E.g.
> > >> > say
> > >> > > > you
> > >> > > > > > > have
> > >> > > > > > > > 50
> > >> > > > > > > > > > > > network
> > >> > > > > > > > > > > > > > > > threads,
> > >> > > > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what is
> > the
> > >> > > > available
> > >> > > > > > cpu
> > >> > > > > > > > > time
> > >> > > > > > > > > > > > > > available
> > >> > > > > > > > > > > > > > > > in a
> > >> > > > > > > > > > > > > > > >    second? I suppose this is a problem
> > >> whenever
> > >> > > you
> > >> > > > > > have
> > >> > > > > > > a
> > >> > > > > > > > > > > > bottleneck
> > >> > > > > > > > > > > > > > > > between
> > >> > > > > > > > > > > > > > > >    I/O and network threads or if you end
> > up
> > >> > > > > > significantly
> > >> > > > > > > > > > > > > > > over-provisioning
> > >> > > > > > > > > > > > > > > >    one pool (both of which are hard to
> > >> avoid).
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling would
> be
> > >> to
> > >> > use
> > >> > > > > this
> > >> > > > > > > api:
> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/
> > >> > > > > > 1.5.0/docs/api/java/lang/
> > >> > > > > > > > > > > > > > > > management/ThreadMXBean.html#
> > >> > > > getThreadCpuTime(long)
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > That would let you track actual CPU
> usage
> > >> > across
> > >> > > > the
> > >> > > > > > > > network,
> > >> > > > > > > > > > I/O
> > >> > > > > > > > > > > > > > > threads,
> > >> > > > > > > > > > > > > > > > and purgatory threads and look at it as
> a
> > >> > > > percentage
> > >> > > > > of
> > >> > > > > > > > total
> > >> > > > > > > > > > > > cores.
> > >> > > > > > > > > > > > > I
> > >> > > > > > > > > > > > > > > > think this fixes many problems in the
> > >> > reliability
> > >> > > > of
> > >> > > > > > the
> > >> > > > > > > > > > metric.
> > >> > > > > > > > > > > > It's
> > >> > > > > > > > > > > > > > > > meaning is slightly different as it is
> > just
> > >> CPU
> > >> > > > (you
> > >> > > > > > > don't
> > >> > > > > > > > > get
> > >> > > > > > > > > > > > > charged
> > >> > > > > > > > > > > > > > > for
> > >> > > > > > > > > > > > > > > > time blocking on I/O) but that may be
> okay
> > >> > > because
> > >> > > > we
> > >> > > > > > > > already
> > >> > > > > > > > > > > have
> > >> > > > > > > > > > > > a
> > >> > > > > > > > > > > > > > > > throttle on I/O. The downside is I think
> > it
> > >> is
> > >> > > > > possible
> > >> > > > > > > > this
> > >> > > > > > > > > > api
> > >> > > > > > > > > > > > can
> > >> > > > > > > > > > > > > be
> > >> > > > > > > > > > > > > > > > disabled or isn't always available and
> it
> > >> may
> > >> > > also
> > >> > > > be
> > >> > > > > > > > > expensive
> > >> > > > > > > > > > > > (also
> > >> > > > > > > > > > > > > > > I've
> > >> > > > > > > > > > > > > > > > never used it so not sure if it really
> > works
> > >> > the
> > >> > > > way
> > >> > > > > i
> > >> > > > > > > > > think).
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > -Jay
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket
> > Qin
> > >> <
> > >> > > > > > > > > > > becket....@gmail.com>
> > >> > > > > > > > > > > > > > > wrote:
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to
> > >> protect
> > >> > > the
> > >> > > > > > > cluster
> > >> > > > > > > > > from
> > >> > > > > > > > > > > > being
> > >> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is
> not
> > >> > > intended
> > >> > > > to
> > >> > > > > > > > address
> > >> > > > > > > > > > > > > resource
> > >> > > > > > > > > > > > > > > > > allocation problem among the clients,
> I
> > am
> > >> > > > > wondering
> > >> > > > > > if
> > >> > > > > > > > > using
> > >> > > > > > > > > > > > > request
> > >> > > > > > > > > > > > > > > > > handling time quota (CPU time quota)
> is
> > a
> > >> > > better
> > >> > > > > > > option.
> > >> > > > > > > > > Here
> > >> > > > > > > > > > > are
> > >> > > > > > > > > > > > > the
> > >> > > > > > > > > > > > > > > > > reasons:
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > 1. request handling time quota has
> > better
> > >> > > > > protection.
> > >> > > > > > > Say
> > >> > > > > > > > > we
> > >> > > > > > > > > > > have
> > >> > > > > > > > > > > > > > > request
> > >> > > > > > > > > > > > > > > > > rate quota and set that to some value
> > like
> > >> > 100
> > >> > > > > > > > > requests/sec,
> > >> > > > > > > > > > it
> > >> > > > > > > > > > > > is
> > >> > > > > > > > > > > > > > > > possible
> > >> > > > > > > > > > > > > > > > > that some of the requests are very
> > >> expensive
> > >> > > > > actually
> > >> > > > > > > > take
> > >> > > > > > > > > a
> > >> > > > > > > > > > > lot
> > >> > > > > > > > > > > > of
> > >> > > > > > > > > > > > > > > time
> > >> > > > > > > > > > > > > > > > to
> > >> > > > > > > > > > > > > > > > > handle. In that case a few clients may
> > >> still
> > >> > > > > occupy a
> > >> > > > > > > lot
> > >> > > > > > > > > of
> > >> > > > > > > > > > > CPU
> > >> > > > > > > > > > > > > time
> > >> > > > > > > > > > > > > > > > even
> > >> > > > > > > > > > > > > > > > > the request rate is low. Arguably we
> can
> > >> > > > carefully
> > >> > > > > > set
> > >> > > > > > > > > > request
> > >> > > > > > > > > > > > rate
> > >> > > > > > > > > > > > > > > quota
> > >> > > > > > > > > > > > > > > > > for each request and client id
> > >> combination,
> > >> > but
> > >> > > > it
> > >> > > > > > > could
> > >> > > > > > > > > > still
> > >> > > > > > > > > > > be
> > >> > > > > > > > > > > > > > > tricky
> > >> > > > > > > > > > > > > > > > to
> > >> > > > > > > > > > > > > > > > > get it right for everyone.
> > >> > > > > > > > > > > > > > > > >
> > If we use the request handling time quota, we can simply say that no
> > client can take up more than 30% of the total request handling capacity
> > (measured by time), regardless of the differences among requests or what
> > the client is doing. In this case we could even quota all requests if we
> > want to.
> >
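For illustration only, here is a minimal sketch of how such a handling-time
quota could be tracked per client with a sliding window; the class and method
names are hypothetical and not taken from the KIP:

    import java.util.ArrayDeque;
    import java.util.Deque;

    /**
     * Hypothetical sketch: tracks request handling time per client over a
     * sliding window and reports whether the client exceeds its share of
     * total handler capacity (e.g. 30% of numHandlerThreads * windowMs).
     */
    public class RequestTimeQuota {
        private static final class Sample {
            final long timestampMs;
            final long handleTimeMs;
            Sample(long timestampMs, long handleTimeMs) {
                this.timestampMs = timestampMs;
                this.handleTimeMs = handleTimeMs;
            }
        }

        private final Deque<Sample> samples = new ArrayDeque<>();
        private final long windowMs;
        private final double maxFraction;      // e.g. 0.30 for "30% of capacity"
        private final int numHandlerThreads;

        public RequestTimeQuota(long windowMs, double maxFraction, int numHandlerThreads) {
            this.windowMs = windowMs;
            this.maxFraction = maxFraction;
            this.numHandlerThreads = numHandlerThreads;
        }

        /** Record the measured handling time of one completed request. */
        public synchronized void record(long nowMs, long handleTimeMs) {
            samples.addLast(new Sample(nowMs, handleTimeMs));
            expire(nowMs);
        }

        /** True if the client's share of handler time in the window exceeds its quota. */
        public synchronized boolean exceeded(long nowMs) {
            expire(nowMs);
            long usedMs = samples.stream().mapToLong(s -> s.handleTimeMs).sum();
            long capacityMs = windowMs * numHandlerThreads;  // total handler time available
            return usedMs > maxFraction * capacityMs;
        }

        private void expire(long nowMs) {
            while (!samples.isEmpty() && samples.peekFirst().timestampMs < nowMs - windowMs) {
                samples.removeFirst();
            }
        }
    }

The point of the sketch is that the quota is expressed as a fraction of total
handler time, so an expensive request consumes more of the budget than a cheap
one, which a pure request-rate quota cannot capture.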
> > 2. The main benefit of using a request rate limit is that it seems more
> > intuitive, and it is probably easier to explain to the user what it
> > means. However, in practice the impact of a request rate quota is no
> > more quantifiable than that of a request handling time quota. Unlike the
> > byte rate quota, it is still difficult to give a number for the impact
> > on throughput or latency when a request rate quota is hit, so it is not
> > better than the request handling time quota. In fact, I feel it is
> > clearer to tell the user "you are limited because you have taken 30% of
> > the CPU time on the broker" than something like "your request rate quota
> > on metadata requests has been reached".
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <j...@confluent.io> wrote:
> >
> > > I think this proposal makes a lot of sense (especially now that it is
> > > oriented around request rate) and fills the biggest remaining gap in
> > > the multi-tenancy story.
> > >
> > > I think for intra-cluster communication (StopReplica, etc.) we could
> > > avoid throttling entirely. You can secure or otherwise lock down the
> > > cluster communication to prevent any unauthorized external party from
> > > initiating these requests. As a result, we are as likely to cause
> > > problems as to solve them by throttling these, right?
> > >
> > > I'm not so sure that we should exempt consumer requests such as
> > > heartbeats. It's true that if we throttle an app's heartbeat requests
> > > it may fall out of its consumer group. However, if we don't throttle
> > > them, they may DDoS the cluster if the heartbeat interval is set
> > > incorrectly or if some client in some language has a bug. I think the
> > > policy with this kind of throttling is to protect the cluster above
> > > any individual app, right? In general this should be okay since for
> > > most deployments this setting is meant as more of a safety valve:
> > > rather than setting something very close to what you expect to need
> > > (say 2 req/sec or whatever), you would set something quite high (like
> > > 100 req/sec) meant to catch a client gone crazy. When used this way,
> > > allowing heartbeats to be throttled would actually provide meaningful
> > > protection.
> > >
> > > -Jay
> > >
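As a rough illustration of the safety-valve behaviour described above, here is
a sketch of how a throttle delay could be derived once the observed rate
exceeds the configured quota; the formula and names are assumptions made for
illustration, not the broker's actual implementation:

    /**
     * Hypothetical sketch (not from the KIP): given an observed request rate
     * and a configured quota, compute how long to delay the next response so
     * that the average rate falls back under the quota.
     */
    public final class ThrottleTime {
        private ThrottleTime() { }

        public static long delayMs(double observedRatePerSec, double quotaPerSec, long windowMs) {
            if (observedRatePerSec <= quotaPerSec) {
                return 0L;                      // under quota: no throttling
            }
            // Delay in proportion to the overshoot, measured over the quota window.
            double overshoot = (observedRatePerSec - quotaPerSec) / quotaPerSec;
            return (long) (overshoot * windowMs);
        }

        public static void main(String[] args) {
            // Safety-valve quota of 100 req/sec over a 10s window; client bursting at 250 req/sec.
            System.out.println(delayMs(250.0, 100.0, 10_000));  // prints 15000 (a 15 second delay)
        }
    }

With a high safety-valve quota, the delay stays at zero for well-behaved
clients and only kicks in for a runaway client.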
> > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <rajinisiva...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I have just created KIP-124 to introduce request rate quotas to Kafka:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > >
> > > > The proposal is for a simple percentage request handling time quota
> > > > that can be allocated to *<client-id>*, *<user>* or *<user, client-id>*.
> > > > There are a few other suggestions also under "Rejected alternatives".
> > > > Feedback and suggestions are welcome.
> > > >
> > > > Thank you...
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
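To make the allocation granularity concrete, here is a hypothetical sketch of
how a percentage quota could be resolved for a request, preferring the most
specific override; the lookup keys and ordering below are illustrative only,
not the configuration model defined in the KIP:

    import java.util.Map;

    /**
     * Hypothetical sketch (names and lookup order are illustrative only):
     * resolve the percentage quota that applies to a request, preferring the
     * most specific override: <user, client-id>, then <user>, then
     * <client-id>, then the default.
     */
    public class QuotaResolver {
        private final Map<String, Double> userClientQuotas;  // key: user + "/" + clientId
        private final Map<String, Double> userQuotas;        // key: user
        private final Map<String, Double> clientQuotas;      // key: clientId
        private final double defaultPercentage;

        public QuotaResolver(Map<String, Double> userClientQuotas,
                             Map<String, Double> userQuotas,
                             Map<String, Double> clientQuotas,
                             double defaultPercentage) {
            this.userClientQuotas = userClientQuotas;
            this.userQuotas = userQuotas;
            this.clientQuotas = clientQuotas;
            this.defaultPercentage = defaultPercentage;
        }

        public double percentageFor(String user, String clientId) {
            Double quota = userClientQuotas.get(user + "/" + clientId);
            if (quota == null) quota = userQuotas.get(user);
            if (quota == null) quota = clientQuotas.get(clientId);
            return quota != null ? quota : defaultPercentage;
        }
    }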
