First, a couple notes on this...

3 - I generally agree with the direction of not pre-optimizing. However, in
this case I'm concerned about the calculation of the cost of doing plugins
now vs. trying to refactor the code to do it later. It would seem to me
that doing it up front will have less friction. If we wait to do plugins
later, it will probably mean changing a lot of related code which will be
significantly more work. We've spent a lot of time talking about various
implementations, and I think it not unreasonable to believe that what one
group wants initially is not going to solve even most cases, as it will
vary by use case.

4 - I really disagree with this. Slowing down a request means that you're
going to hold onto it in the broker. This takes up resources and time, and
is generally not the way other services handle quota violations. In
addition you are causing potential problems with the clients by taking a
call that's supposed to return as quickly as possible and making it take a
long time. This increases latency and deprives the client of the ability to
make good decisions about what to do. By sending an error back to the
client you inform them of what the problem is, and you allow the client to
make an intelligent decision, such as queuing to send later, sending to
another resource, or handling anything from their upstreams differently.

You're absolutely right that throwing back an immediate error has the
potential to turn a quota violation into a different problem for a badly
behaved client. But OS and upstream networking tools can see a problem
based on a layer 4 issue (rapidly reconnecting client) rather than layers
above. Out of the options provided, I think A is the correct choice. B
seems to be the most work (you have the delay, and the client still has to
handle errors and backoff), and C is what I disagree with doing.

I would also like to see a provision for allowing the client to query its
quota status within the protocol. I think we should allow for a request (or
information within an existing response) where the client can ask what its
current quota status is. This will allow for the clients to manage their
quotas, and it will allow for emitting metrics on the client side for quota
status (rather than relying on the server-side metrics, which tends to put
the responsibility in the wrong place).


-Todd


On Sun, Mar 8, 2015 at 10:52 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Adi,
>
> Great write-up. Here are some comments:
>
> 1. I don't think you need a way to disable quotas on a per-client basis,
> that is just the equivalent of setting the quota to be infinite, right?
>
> 2. I agree that the configuration problem is a general part of doing
> dynamic configuration, and it is right to separate that into the config
> KIP. But Joe's proposal currently doesn't provide nearly what you need in
> its current form--it doesn't even handle client-id based configuration, let
> alone the notification mechanism you would need to update your quota--so we
> really need to give completely explicitly how that KIP is going to solve
> this problem.
>
> 3. Custom quota implementations: let's do this later. Pluggability comes
> with a high cost and we want to try really hard to avoid it. So in the
> future if we have a really solid case for an alternative quota approach
> let's see if we can't improve the current approach and stick with one good
> implementation. If we really can't then let's add a plugin system. I think
> doing it now is premature.
>
> 4. I think the ideal quota action from the users point of view is just to
> slow down the writer or reader transparently to match their capacity
> allocation. Let's try to see if we can make that work.
>
> I think immediate error can be ruled out entirely because it depends on the
> client properly backing off. In cases where they don't we may actually make
> things worse. Given the diversity of clients I think this is probably not
> going to happen.
>
> The only downside to just delaying the request that was pointed out was
> that if the delay exceeded the request timeout the user might retry. This
> is true but it applies to any approach that delays requests (both B and C).
> I think with any sane request timeout and quota the per request delay we
> induce will be way lower (otherwise you would be hitting the timeout all
> the time just due to linux I/O variance, in which case you can't really
> complain).
>
> 5. We need to explain the relationship between the quota stuff in the
> metrics package and this. We need to either remove that stuff or use it. We
> can't have two quota things. Since quota fundamentally apply to windowed
> metrics, I would suggest doing whatever improvements to that to make it
> usable for quotas.
>
> 6. I don't think the quota manager interface is really what we need if I'm
> understanding it correctly. You give a method
>   <T extends RequestOrResponse> boolean check(T request);
> But how would you implement this method? It seems like it would basically
> internally just be a switch statement with a different check for each
> request type. So this is a pretty terrible object oriented api, right? It
> seems like what we will be doing is taking code that would otherwise just
> be in the request handling flow, and moving it into this method, with a
> bunch of instanceof checks?
>
> I think what we need is just a delayqueue and a background thread that
> sends the delayed responses (we were calling it a purgatory but it isn't,
> it is just a timeout based delay--there are no watchers or keys or any of
> that).
>
> Let's rename the QuotaManager RequestThrottler and have it just have a
> single method:
> class RequestThrottler {
>   sendDelayedResponse(response, delay, timeunit)
> }
> internally it will put the response into the delay queue and there will be
> a background thread that sends out those responses after the delay elapses.
>
> So usage in KafkaApis would look like:
>    try {
>      quotaMetric.record(newVal)
>    } catch (QuotaException e) {
>      requestThrottler.add(new DelayedResponse(...), ...)
>  return
>    }
>
> The advantage of this is that the logic of what metric is being checked and
> the logic of how to appropriately correct the response, both of which will
> be specific to each request, now remain in KafkaApis where they belong. The
> throttler just delays the sending of the response for the appropriate time
> and has no per-request logic whatsoever.
>
> 7. We need to think through and state the exact algorithm for how we will
> assign delays to requests for a use case that is over its quota. That is
> closely tied to how we calculate the metric used. Here would be a bad
> approach we should not use:
> a. measure in a 30 second window.
> b. when we have hit the cap in that window, delay for the remainder of the
> 30 seconds
> As you can imagine with this bad algorithm you might then use all server
> resources for 5 seconds, then suddenly assign a 25 second delay to the next
> request from that client, then the window would reset and this would
> repeat.
> The quota package is already doing a good job of the windowed metrics, but
> we'll want to integrate the backoff calculation with that algorithm
> (assuming that is what we are using).
>
> Cheers,
>
> -Jay
>
> On Wed, Mar 4, 2015 at 3:51 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Posted a KIP for quotas in kafka.
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> >
> > Appreciate any feedback.
> >
> > Aditya
> >
>

Reply via email to