Hi David, One quick question about the implementation (I don't think it's spelled out in the KIP): Presumably if you make, for example, a create topics request with validate only it will check for quota violation, but not count towards quota violation, right?
Many thanks, Tom On Wed, Jun 3, 2020 at 3:51 PM Rajini Sivaram <rajinisiva...@gmail.com> wrote: > Hi David, > > 2) sorry, that was my mistake. > > Regards, > > Rajini > > > On Wed, Jun 3, 2020 at 3:08 PM David Jacot <dja...@confluent.io> wrote: > > > Hi Rajini, > > > > Thanks for your prompt response. > > 1) Good catch, fixed. > > 2) The retry mechanism will be in the client so a new field is not > > required in the requests. > > > > Regards, > > David > > > > On Wed, Jun 3, 2020 at 2:43 PM Rajini Sivaram <rajinisiva...@gmail.com> > > wrote: > > > > > Hi David, > > > > > > Thanks for the updates, looks good. Just a couple of minor comments: > > > 1) There is a typo in "*The channel will be mutated as well when > > > `throttle_time_ms > 0`." * Should be *muted*? > > > 2) Since the three requests will need a new field for ` > > > *retryQuotaViolatedException*`, we should perhaps add that change to > the > > > KIP. > > > > > > Regards, > > > > > > Rajini > > > > > > On Wed, Jun 3, 2020 at 1:02 PM David Jacot <dja...@confluent.io> > wrote: > > > > > > > Hi all, > > > > > > > > I have updated the KIP based on our recent discussions. I have mainly > > > > changed the > > > > following points: > > > > * I have renamed the quota as suggested by Jun. > > > > * I have changed the metrics to be "token bucket" agnostic. The idea > is > > > to > > > > report the > > > > burst and the rate per principal/clientid. > > > > * I have removed the `retry.quota.violation.exception` configuration > > and > > > > replaced it > > > > with options in the respective methods' options. > > > > > > > > Now, the public interfaces are not tight to the algorithm that we use > > > > internally to throttle > > > > the requests but keep the notion of burst. I hope that this will help > > to > > > > alleviate the > > > > tokens bucket vs rate discussions. > > > > > > > > Please, have a look and let me know your thoughts. > > > > > > > > Bests, > > > > David > > > > > > > > > > > > On Wed, Jun 3, 2020 at 10:17 AM David Jacot <dja...@confluent.io> > > wrote: > > > > > > > > > Hi Rajini, > > > > > > > > > > Thanks for your feedback. Please find my answers below: > > > > > > > > > > 1) Our main goal is to protect the controller from the extreme > users > > > > > (DDoS). We want > > > > > to protect it from large requests or repetitive requests coming > from > > a > > > > > single user. > > > > > That user could be used by multiple apps as you pointed out which > > makes > > > > it > > > > > even > > > > > worst. For instance, a buggy application could continuously create > > and > > > > > delete > > > > > topics in a tight loop. > > > > > > > > > > The idea of the burst is to still allow creating or deleting topics > > in > > > > > batch because > > > > > this is how applications tend to do it. However, we want to keep > the > > > > batch > > > > > size > > > > > under control with the burst. The burst does not allow requests of > > any > > > > > size. Topics > > > > > are accepted until the burst is passed. All the others are > rejected. > > > > > Ideally, regular > > > > > and well behaving applications should never or rarely be throttled. > > > > > > > > > > 2) No, there is no explicit bound on the maximum throttle time. > > Having > > > a > > > > > maximum > > > > > is straightforward here because the throttling time depends on the > > > actual > > > > > size of > > > > > the request. > > > > > > > > > > 3) That's a very good question that I haven't thought of. I was > > > thinking > > > > > about doing > > > > > it for the new quota only. I think that your suggestion of having a > > per > > > > > method argument > > > > > makes sense. I will update the KIP. > > > > > > > > > > 4) Indeed, it is outdated text. Let me update that. > > > > > > > > > > Regards, > > > > > David > > > > > > > > > > On Wed, Jun 3, 2020 at 12:01 AM Rajini Sivaram < > > > rajinisiva...@gmail.com> > > > > > wrote: > > > > > > > > > >> Hi David, > > > > >> > > > > >> Thanks for the KIP. A few questions below: > > > > >> > > > > >> 1) The KIP says: *`Typically, applications tend to send one > request > > to > > > > >> create all the topics that they need`*. What would the point of > > > > throttling > > > > >> be in this case? If there was a user quota for the principal used > by > > > > that > > > > >> application, wouldn't we just allow the request due to the burst > > > value? > > > > Is > > > > >> the KIP specifically for the case where multiple apps with the > same > > > user > > > > >> principal are started all at once? > > > > >> > > > > >> 2) Will there be a bound on the maximum throttle time? > > > > >> > > > > >> 3) Will *retry.quota.violation.exception* have any impact on > request > > > > quota > > > > >> throttling or is it limited to the three requests where the new > > quota > > > is > > > > >> applied? If it is only for the new quota, perhaps it would be > better > > > to > > > > >> add > > > > >> an option to the relevant requests rather than use an admin client > > > > config. > > > > >> > > > > >> 4) Is this outdated text, wasn't sure what this refers to : *While > > the > > > > >> configuration could be set by broker, we envision to have it set > > only > > > > has > > > > >> a > > > > >> cluster wide default for two reasons.* > > > > >> > > > > >> Regards, > > > > >> > > > > >> Rajini > > > > >> > > > > >> > > > > >> On Tue, Jun 2, 2020 at 2:55 PM David Jacot <dja...@confluent.io> > > > wrote: > > > > >> > > > > >> > Hi Jun, > > > > >> > > > > > >> > Thanks for your reply. > > > > >> > > > > > >> > 10. I think that both options are likely equivalent from an > > accuracy > > > > >> point > > > > >> > of > > > > >> > view. If we put the implementation aside, conceptually, I am not > > > > >> convinced > > > > >> > by the used based approach because resources don't have a clear > > > owner > > > > >> > in AK at the moment. A topic can be created by (Principal A, no > > > client > > > > >> id), > > > > >> > then can be extended by (no principal, Client B), and finally > > > deleted > > > > by > > > > >> > (Principal C, Client C). This does not sound right to me and I > > fear > > > > >> that it > > > > >> > is not going to be easy to grasp for our users. > > > > >> > > > > > >> > Regarding the naming, I do agree that we can make it more future > > > > proof. > > > > >> > I propose `controller_mutations_rate`. I think that > > differentiating > > > > the > > > > >> > mutations > > > > >> > from the reads is still a good thing for the future. > > > > >> > > > > > >> > 11. I am not convinced by your proposal for the following > reasons: > > > > >> > > > > > >> > First, in my toy example, I used 101 windows and 7 * 80 > requests. > > We > > > > >> could > > > > >> > effectively allocate 5 * 100 requests to the previous windows > > > assuming > > > > >> that > > > > >> > they are empty. What shall we do with the remaining 60 requests? > > > Shall > > > > >> we > > > > >> > allocate them to the current window or shall we divide it among > > all > > > > the > > > > >> > windows? > > > > >> > > > > > >> > Second, I don't think that we can safely change the behavior of > > all > > > > the > > > > >> > existing > > > > >> > rates used because it actually changes the computation of the > rate > > > as > > > > >> > values > > > > >> > allocated to past windows would expire before they would today. > > > > >> > > > > > >> > Overall, while trying to fit in the current rate, we are going > to > > > > build > > > > >> a > > > > >> > slightly > > > > >> > different version of the rate which will be even more confusing > > for > > > > >> users. > > > > >> > > > > > >> > Instead, I think that we should embrace the notion of burst as > it > > > > could > > > > >> > also > > > > >> > be applied to other quotas in the future. Users don't have to > know > > > > that > > > > >> we > > > > >> > use the Token Bucket or a special rate inside at the end of the > > day. > > > > It > > > > >> is > > > > >> > an > > > > >> > implementation detail. > > > > >> > > > > > >> > Users would be able to define: > > > > >> > - a rate R; and > > > > >> > - a maximum burst B. > > > > >> > > > > > >> > If we change the metrics to be as follow: > > > > >> > - the actual rate; > > > > >> > - the burst balance in %, 0 meaning that the user is throttled; > > > > >> > It remains disattach from the algorithm. > > > > >> > > > > > >> > I personally prefer this over having to define a rate and a > number > > > of > > > > >> > windows > > > > >> > while having to understand that the number of windows implicitly > > > > defines > > > > >> > the > > > > >> > allowed burst. I think that it is clearer and easier to grasp. > > Don't > > > > >> you? > > > > >> > > > > > >> > Best, > > > > >> > David > > > > >> > > > > > >> > On Fri, May 29, 2020 at 6:38 PM Jun Rao <j...@confluent.io> > wrote: > > > > >> > > > > > >> > > Hi, David, Anna, > > > > >> > > > > > > >> > > Thanks for the response. Sorry for the late reply. > > > > >> > > > > > > >> > > 10. Regarding exposing rate or usage as quota. Your argument > is > > > that > > > > >> > usage > > > > >> > > is not very accurate anyway and is harder to implement. So, > > let's > > > > just > > > > >> > be a > > > > >> > > bit loose and expose rate. I am sort of neutral on that. (1) > It > > > > seems > > > > >> to > > > > >> > me > > > > >> > > that overall usage will be relatively more accurate than rate. > > All > > > > the > > > > >> > > issues that Anna brought up seem to exist for rate too. Rate > has > > > the > > > > >> > > additional problem that the cost of each request may not be > > > uniform. > > > > >> (2) > > > > >> > In > > > > >> > > terms of implementation, a usage based approach requires > > tracking > > > > the > > > > >> > user > > > > >> > > info through the life cycle of an operation. However, as you > > > > >> mentioned, > > > > >> > > things like topic creation can generate additional > > > > >> > > LeaderAndIsr/UpdateMetadata requests. Longer term, we probably > > > want > > > > to > > > > >> > > associate those cost to the user who initiated the operation. > If > > > we > > > > do > > > > >> > > that, we sort of need to track the user for the full life > cycle > > of > > > > the > > > > >> > > processing of an operation anyway. (3) If you prefer rate > > > strongly, > > > > I > > > > >> > don't > > > > >> > > have strong objections. However, I do feel that the new quota > > name > > > > >> should > > > > >> > > be able to cover all controller related cost longer term. This > > KIP > > > > >> > > currently only covers topic creation/deletion. It would not be > > > ideal > > > > >> if > > > > >> > in > > > > >> > > the future, we have to add yet another type of quota for some > > > other > > > > >> > > controller related operations. The quota name in the KIP has > > > > partition > > > > >> > > mutation. In the future, if we allow things like topic > renaming, > > > it > > > > >> may > > > > >> > not > > > > >> > > be related to partition mutation directly and it would be > > trickier > > > > to > > > > >> fit > > > > >> > > those operations in the current quota. So, maybe sth more > > general > > > > like > > > > >> > > controller_operations_quota will be more future proof. > > > > >> > > > > > > >> > > 11. Regarding the difference between the token bucket > algorithm > > > and > > > > >> our > > > > >> > > current quota mechanism. That's a good observation. It seems > > that > > > we > > > > >> can > > > > >> > > make a slight change to our current quota mechanism to > achieve a > > > > >> similar > > > > >> > > thing. As you said, the issue was that we allocate all 7 * 80 > > > > >> requests in > > > > >> > > the last 1 sec window in our current mechanism. This is a bit > > > > >> unintuitive > > > > >> > > since each sec only has a quota capacity of 5. An alternative > > way > > > is > > > > >> to > > > > >> > > allocate the 7 * 80 requests to all previous windows, each up > to > > > the > > > > >> > > remaining quota capacity left. So, you will fill the current 1 > > sec > > > > >> window > > > > >> > > and all previous ones, each with 5. Then, it seems this will > > give > > > > the > > > > >> > same > > > > >> > > behavior as token bucket? The reason that I keep asking this > is > > > that > > > > >> from > > > > >> > > an operational perspective, it's simpler if all types of > quotas > > > work > > > > >> in > > > > >> > the > > > > >> > > same way. > > > > >> > > > > > > >> > > Jun > > > > >> > > > > > > >> > > On Tue, May 26, 2020 at 9:52 AM David Jacot < > > dja...@confluent.io> > > > > >> wrote: > > > > >> > > > > > > >> > > > Hi folks, > > > > >> > > > > > > > >> > > > I have updated the KIP. As mentioned by Jun, I have made the > > > > >> > > > quota per principal/clientid similarly to the other quotas. > I > > > have > > > > >> > > > also explained how this will work in conjunction with the > auto > > > > >> > > > topics creation. > > > > >> > > > > > > > >> > > > Please, take a look at it. I plan to call a vote for it in > the > > > > next > > > > >> few > > > > >> > > > days if there are no comments in this thread. > > > > >> > > > > > > > >> > > > Best, > > > > >> > > > David > > > > >> > > > > > > > >> > > > On Wed, May 13, 2020 at 10:57 AM Tom Bentley < > > > tbent...@redhat.com > > > > > > > > > >> > > wrote: > > > > >> > > > > > > > >> > > > > Hi David, > > > > >> > > > > > > > > >> > > > > Thanks for the explanation and confirmation that evolving > > the > > > > >> APIs is > > > > >> > > not > > > > >> > > > > off the table in the longer term. > > > > >> > > > > > > > > >> > > > > Kind regards, > > > > >> > > > > > > > > >> > > > > Tom > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > > > > > >