1. Cool

2. Yeah I just wanted to flag the dependency/interaction.

3. Cool, I think we are in agreement then that a pluggable system might be
nice, but we can get to know the system operationally before deciding to
expose such a thing.

4. Yeah, I agree, let's do it as a separate discussion. We actually had a
full discussion and vote back when we started down the path with metrics,
but I think there were some concerns so let's talk about it a bit more and
see.

5. Yeah, I think my concern was just the resulting API. Because the logic
for each quota is different--at the very least a different metric to check
and a different request type to compute the value from--it seems that the
seemingly generic API just masks the fact that we handle each case
separately. I.e., the implementation of the method internally would be

  def check(request: T) {
    if (request.isInstanceOf[ProduceRequest])
      // check produce request
    if (request.isInstanceOf[FetchRequest])
      // check fetch request
    ...
  }

So basically we have logic specific to each request, but rather than
putting that logic into the method for handling that request we put it
into a big case statement. So it seems like this doesn't really abstract
things, since any time you add a new thing to quota you have to jump
inside the big case statement and add a new case, right? I think I may be
misunderstanding, though...in any case I'm not arguing that we want to
just shove this into the existing methods; I just want to make sure that
if we introduce an abstraction it's a good one.

6. Yes, I think it is preferable not to have the seesaw effect in the delay
time. So if you need to impose 20 seconds of delay, it is better to delay
all 200 requests by 100 ms each rather than 199 requests by 0 ms and one
request by 20 seconds. Several reasons for this:
a. gives predictable latency to the producer.
b. avoids hitting the request timeout on the one slow request
c. there is a trade-off between window size and delay time. If the window
is too small the estimate will be inaccurate and you will accidentally
penalize an okay client (e.g. imagine a 100 ms window, one big request
could overflow it). If the window is too large you will allow the system to
be brought to its knees for a long period of time prior to the throttling.

The other important question here is the details of the windowing policy.
If the window resets every 30 seconds, the client exhausts it in 10
seconds, then is throttled for 20, then it resets and the client starts
blitzing again. The result is basically a 10-second outage every 30
seconds as the throttling expires and the client goes full tilt, crushing
the server. So the quotas don't really do their job very well.

-Jay


On Mon, Mar 9, 2015 at 6:22 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Thanks for the comments Jay and Todd. Firstly, I'd like to address some of
> Jay's points.
>
> 1. You are right that we don't need to disable quotas on a per-client
> basis.
>
> 2. Configuration management: I must admit, I haven't read Joe's proposal
> in detail. My thinking was that we can keep configuration management
> separate from this discussion since this is already quite a meaty topic.
> Let me spend some time reading that KIP and then I can add more detail to
> the quota KIP.
>
> 3. Custom Quota implementations: I don't think it is necessarily a bad
> idea to have an interface called QuotaManager (RequestThrottler). This
> doesn't necessarily mean exposing the interface as a public API. It is a
> mechanism to limit code changes to 1-2 specific classes. It prevents quota
> logic from bleeding into multiple places in the code, as happens in any big
> piece of code. I fully agree that we should not expose this as a public API
> unless there is a very strong reason to. This seems to be more of an
> implementation detail.
>
> 4. Metrics Package: I'll add a section on the wiki about using things from
> the metrics package. Currently, the quota stuff is located in
> "clients/common/metrics". This means that we will have to migrate all that
> functionality into core. Does this also mean that we will need to replace the
> existing metrics code in "core" with the newly imported package as a part
> of this project? If so, that's a relatively large undertaking and it needs
> to be discussed separately IMO.
>
> 5. Request Throttler vs QuotaManager -
> I wanted my quota manager to do something similar to what you proposed.
> Inside KafkaApis, I could do:
>
> if(quotaManager.check())
>   // process request
> else
>   return
>
> Internally, QuotaManager.check() could do exactly what you suggested:
>
> try {
>   quotaMetric.record(newVal)
> } catch (QuotaException e) {
>   // logic to calculate delay
>   requestThrottler.add(new DelayedResponse(...), ...)
>   return
> }
>
> This approach gives us the flexibility of deciding what metric we want to
> record inside QuotaManager. This brings us back to the same discussion of
> pluggable quota policies. It's a bit hard to articulate, but for example:
> this allows us to quota on bytes in/out vs total number of requests in/out.
> It also encapsulates the logic to calculate delay. Recording the metrics in
> KafkaApis would make it look more complex.
>
> try {
>   bytesPerSecondMetrics.record(newVal);
>   produceRequestsPerSecondMetrics.record(newVal);
> } catch (QuotaException e) {
>   // logic to calculate delay
>   requestThrottler.add(new DelayedResponse(...), ...)
> }
>
> Stepping back a bit, I see that this is all internal to KafkaApis and we
> have the flexibility of refactoring this as we see fit at a later date. It
> should be OK for RequestThrottler and QuotaManager to be concrete
> implementations rather than interfaces.
>
> 6. As for assigning exact delays to requests, I agree that having a 30
> second delay is rather bad. I was thinking that we could have 5 second
> windows instead. Even if a client exhausts their quota in the first 2
> seconds, they will only be delayed for another 3 seconds. Is there a reason
> this isn't good enough? I'm also going to dig into the existing quota
> package and see what we can leverage.
>
> 7. Exposing quota usage: We do plan to expose the quota metrics via JMX.
> Adding an API to query quota status is absolutely feasible and also a good
> idea.
>
> Thanks,
> Aditya
>
> ________________________________________
> From: Jay Kreps [jay.kr...@gmail.com]
> Sent: Monday, March 09, 2015 5:01 PM
> To: dev@kafka.apache.org
> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
>
> Hey Todd,
>
> Nice points, let me try to respond:
>
> Plugins
>
> Yeah let me explain what I mean about plugins. The contracts that matter
> for us are public contracts, i.e. the protocol, the binary format, stuff in
> zk, and the various plug-in apis we expose. Adding an internal interface
> later will not be hard--the quota check is going to be done in 2-6 places
> in the code which would need to be updated, all internal to the broker.
>
> The challenge with making things pluggable up front is that the policy is
> usually fairly trivial to plug in but each policy requires different
> inputs--the number of partitions, different metrics, etc. Once we give a
> public api it is hard to add to it without breaking the original contract
> and hence breaking everyone's plugins. So if we do this we want to get it
> right early if possible. In any case I think whether we want to design a
> pluggable api or just improve a single implementation, the work we need to
> do is the same: brainstorm the set of use cases the feature has and then
> figure out the gap in our proposed implementation that leaves some use case
> uncovered. Once we have these specific cases we can try to figure out if
> that could be solved with a plugin or by improving our default proposal.
>
> Enforcement
>
> I started out arguing your side (immediate error), but I have switched to
> preferring delay. Here is what convinced me, let's see if it moves you.
>
> First, the delay quota need not hold onto any request data. The produce
> response can be delayed after the request is completed and the fetch can be
> delayed prior to the fetch being executed. So no state needs to be
> maintained in memory, other than a single per-connection token. This is a
> really important implementation detail for large scale usage that I didn't
> realize at first. I would agree that maintaining a request per connection
> in memory is a non-starter for an environment with 10s of thousands of
> connections.
>
> The second argument is that I think this really expands the use cases where
> the quotas can be applicable.
>
> The use case I have heard people talk about is event collection from apps.
> In this use case the app is directly sending data at a more or less steady
> state and never really has a performance spike unless the app has a bug or
> the application itself experiences more traffic. So in this case you should
> actually never hit the quota, and if you do, the data is going to be
> dropped, whether it is dropped by the server with an error or by the client.
> These use cases will never block the app (which would be dangerous) since
> the client is always non-blocking and drops data when its buffer is full
> rather than blocking. I agree that for this use case either server-side
> delay or client side delay are both reasonable--the pro of a server-side
> delay is that it doesn't require special client handling, the pro of the
> server-side error is that it is more transparent.
>
> But now consider non-steady-state use cases. Here I am thinking of:
> 1. Data load from Hadoop
> 2. MM load into a cluster with live usage
> 3. Database changelog capture
> 4. Samza
>
> Each of these has a case where it is "catching up" or otherwise slammed by
> load from the source system:
> 1. An M/R job dumps a ton of data all at once
> 2. MM when catching up after some downtime
> 3. Database changelog will have a load phase when populating data for the
> first time
> 4. Samza when restoring state or catching up after fail-over
>
> In each of these cases you want the consumer or producer to go as fast as
> possible but not impact the other users of the cluster. In these cases you
> are actually using the quotas totally differently. In the app event capture
> use case the quota was more like a safety valve that you expected to never
> hit. However in these cases I just listed you fully expect to hit and
> remain at the quota for extended periods of time and that will be totally
> normal.
>
> These are the cases where throttling throughput is better than sending an
> error. If we send an error then any producer who doesn't configure enough
> retries is going to start losing data. Further even if you set infinite
> retries the retry itself is going to mean resending the data over and over
> until you get a non-error. This is bad because in a period of high load you
> are then going to be incurring more network load as lots of producers start
> retrying (this isn't a problem on the consumer because the fetch request is
> small but is an issue on the producer).
>
> I take your point about the potential danger of slowing down a producing
> app that is configured to block. But actually this danger is no different
> than what will happen if it exceeds the node capacity now--when that
> happens requests will start getting slow and the app will block. The only
> difference is that that limit is now lower than when the node's capacity is
> totally exhausted. So I don't think that is a new danger.
>
> Exposing quota usage
>
> I agree we should make this available, good use of this feature obviously
> means knowing how close you are to your quota before you hit it.
>
> Cheers,
>
> -Jay
>
> On Mon, Mar 9, 2015 at 10:34 AM, Todd Palino <tpal...@gmail.com> wrote:
>
> > First, a couple notes on this...
> >
> > 3 - I generally agree with the direction of not pre-optimizing. However, in
> > this case I'm concerned about the calculation of the cost of doing plugins
> > now vs. trying to refactor the code to do it later. It would seem to me
> > that doing it up front will have less friction. If we wait to do plugins
> > later, it will probably mean changing a lot of related code, which will be
> > significantly more work. We've spent a lot of time talking about various
> > implementations, and I think it is not unreasonable to believe that what
> > one group wants initially is not going to solve even most cases, as it
> > will vary by use case.
> >
> > 4 - I really disagree with this. Slowing down a request means that you're
> > going to hold onto it in the broker. This takes up resources and time, and
> > is generally not the way other services handle quota violations. In
> > addition you are causing potential problems with the clients by taking a
> > call that's supposed to return as quickly as possible and making it take a
> > long time. This increases latency and deprives the client of the ability
> > to make good decisions about what to do. By sending an error back to the
> > client you inform them of what the problem is, and you allow the client to
> > make an intelligent decision, such as queuing to send later, sending to
> > another resource, or handling anything from their upstreams differently.
> >
> > You're absolutely right that throwing back an immediate error has the
> > potential to turn a quota violation into a different problem for a badly
> > behaved client. But OS and upstream networking tools can see a problem
> > based on a layer 4 issue (rapidly reconnecting client) rather than layers
> > above. Out of the options provided, I think A is the correct choice. B
> > seems to be the most work (you have the delay, and the client still has to
> > handle errors and backoff), and C is what I disagree with doing.
> >
> > I would also like to see a provision for allowing the client to query its
> > quota status within the protocol. I think we should allow for a request (or
> > information within an existing response) where the client can ask what its
> > current quota status is. This will allow for the clients to manage their
> > quotas, and it will allow for emitting metrics on the client side for quota
> > status (rather than relying on the server-side metrics, which tends to put
> > the responsibility in the wrong place).
> >
> >
> > -Todd
> >
> >
> > On Sun, Mar 8, 2015 at 10:52 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > Hey Adi,
> > >
> > > Great write-up. Here are some comments:
> > >
> > > 1. I don't think you need a way to disable quotas on a per-client basis,
> > > that is just the equivalent of setting the quota to be infinite, right?
> > >
> > > 2. I agree that the configuration problem is a general part of doing
> > > dynamic configuration, and it is right to separate that into the config
> > > KIP. But Joe's proposal currently doesn't provide nearly what you need in
> > > its current form--it doesn't even handle client-id based configuration,
> > > let alone the notification mechanism you would need to update your
> > > quota--so we really need to spell out explicitly how that KIP is going to
> > > solve this problem.
> > >
> > > 3. Custom quota implementations: let's do this later. Pluggability comes
> > > with a high cost and we want to try really hard to avoid it. So in the
> > > future, if we have a really solid case for an alternative quota approach,
> > > let's see if we can't improve the current approach and stick with one
> > > good implementation. If we really can't, then let's add a plugin system.
> > > I think doing it now is premature.
> > >
> > > 4. I think the ideal quota action from the user's point of view is just
> > > to slow down the writer or reader transparently to match their capacity
> > > allocation. Let's try to see if we can make that work.
> > >
> > > I think immediate error can be ruled out entirely because it depends on
> > > the client properly backing off. In cases where they don't, we may
> > > actually make things worse. Given the diversity of clients, I think this
> > > is probably not going to happen.
> > >
> > > The only downside to just delaying the request that was pointed out was
> > > that if the delay exceeded the request timeout the user might retry. This
> > > is true, but it applies to any approach that delays requests (both B and
> > > C). I think with any sane request timeout and quota, the per-request
> > > delay we induce will be way lower (otherwise you would be hitting the
> > > timeout all the time just due to Linux I/O variance, in which case you
> > > can't really complain).
> > >
> > > 5. We need to explain the relationship between the quota stuff in the
> > > metrics package and this. We need to either remove that stuff or use it.
> > > We can't have two quota things. Since quotas fundamentally apply to
> > > windowed metrics, I would suggest doing whatever improvements to that to
> > > make it usable for quotas.
> > >
> > > 6. I don't think the quota manager interface is really what we need if
> > > I'm understanding it correctly. You give a method
> > >   <T extends RequestOrResponse> boolean check(T request);
> > > But how would you implement this method? It seems like it would basically
> > > internally just be a switch statement with a different check for each
> > > request type. So this is a pretty terrible object-oriented api, right? It
> > > seems like what we will be doing is taking code that would otherwise just
> > > be in the request handling flow, and moving it into this method, with a
> > > bunch of instanceof checks?
> > >
> > > I think what we need is just a delayqueue and a background thread that
> > > sends the delayed responses (we were calling it a purgatory but it isn't,
> > > it is just a timeout-based delay--there are no watchers or keys or any of
> > > that).
> > >
> > > Let's rename the QuotaManager RequestThrottler and have it just have a
> > > single method:
> > >   class RequestThrottler {
> > >     sendDelayedResponse(response, delay, timeunit)
> > >   }
> > > internally it will put the response into the delay queue and there will
> > > be a background thread that sends out those responses after the delay
> > > elapses.
> > >
> > > So usage in KafkaApis would look like:
> > >   try {
> > >     quotaMetric.record(newVal)
> > >   } catch (QuotaException e) {
> > >     requestThrottler.add(new DelayedResponse(...), ...)
> > >     return
> > >   }
> > >
> > > The advantage of this is that the logic of what metric is being checked
> > > and the logic of how to appropriately correct the response, both of which
> > > will be specific to each request, now remain in KafkaApis where they
> > > belong. The throttler just delays the sending of the response for the
> > > appropriate time and has no per-request logic whatsoever.
> > >
> > > 7. We need to think through and state the exact algorithm for how we
> > > will assign delays to requests for a use case that is over its quota.
> > > That is closely tied to how we calculate the metric used. Here would be a
> > > bad approach we should not use:
> > > a. measure in a 30 second window.
> > > b. when we have hit the cap in that window, delay for the remainder of
> > > the 30 seconds
> > > As you can imagine, with this bad algorithm you might then use all server
> > > resources for 5 seconds, then suddenly assign a 25 second delay to the
> > > next request from that client, then the window would reset and this would
> > > repeat.
> > > The quota package is already doing a good job of the windowed metrics,
> > > but we'll want to integrate the backoff calculation with that algorithm
> > > (assuming that is what we are using).
> > >
> > > Cheers,
> > >
> > > -Jay
> > >
> > > On Wed, Mar 4, 2015 at 3:51 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Posted a KIP for quotas in kafka.
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > >
> > > > Appreciate any feedback.
> > > >
> > > > Aditya
> > > >
> > >
> >
>
