[ 
https://issues.apache.org/jira/browse/HBASE-28453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831061#comment-17831061
 ] 

Hudson commented on HBASE-28453:
--------------------------------

Results for branch branch-3
        [build #172 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/172/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/172/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/172/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/172/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Support a middle ground between the Average and Fixed interval rate limiters
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-28453
>                 URL: https://issues.apache.org/jira/browse/HBASE-28453
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Ray Mattingly
>            Assignee: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.6.0, 3.0.0-beta-2
>
>         Attachments: Screenshot 2024-03-21 at 2.08.51 PM.png, Screenshot 
> 2024-03-21 at 2.30.01 PM.png
>
>
> h3. Background
> HBase quotas support two rate limiters: a "fixed" and an "average" interval 
> rate limiter.
> h4. FixedIntervalRateLimiter
> The fixed interval rate limiter is simpler: it has a TimeUnit, say 1 second, 
> and it refills a resource allotment on the recurring interval. So you may get 
> 10 resources every second, and if you exhaust all 10 resources in the first 
> millisecond of an interval then you will need to wait 999ms to acquire even 1 
> more resource.
> h4. AverageIntervalRateLimiter
> The average interval rate limiter, HBase's default, allows for more flexibly 
> timed refilling of the resource allotment. Extending our previous example, 
> say you have a 10 reads/sec quota and you have exhausted all 10 resources 
> within 1ms of the last full refill. If you request 1 more read then, rather 
> than returning a 999ms wait interval indicating the next full refill time, 
> the rate limiter will recognize that you only need to wait 99ms before 1 read 
> can be available. Once 100ms have passed in aggregate since the last full 
> refill, it refills 1/10th of the limit, which is enough to serve the 
> request for 1/10th of the resources.
> h3. The Problems with Current RateLimiters
> The problem with the fixed interval rate limiter is that it is too strict 
> from a latency perspective. It results in quota limits to which we cannot 
> fully subscribe with any consistency.
> The problem with the average interval rate limiter is that, in practice, it 
> is far too optimistic. For example, a real rate limiter might limit to 
> 100MB/sec of read IO per machine. Any multigets that come in will require 
> only a tiny fraction of this limit; for example, a 64KB block is only 0.06% 
> of the total. As a result, the vast majority of wait intervals end up being 
> tiny, often <5ms. This can actually cause the inverse of your intention, where 
> setting up a throttle causes a DDoS of your RPC layer via continuous 
> throttling and near-immediate retrying. I've discussed this problem in 
> https://issues.apache.org/jira/browse/HBASE-28429 and proposed a minimum wait 
> interval as the solution there; after some more thinking, I believe this new 
> rate limiter would be a less hacky solution to this deficiency, so I'd like to 
> close that Jira in favor of this one.
> See the attached chart where I put in place a 10k req/sec/machine throttle 
> for this user at 10:43 to try to curb this high traffic, and it resulted in a 
> huge spike of req/sec due to the throttle/retry loop created by the 
> AverageIntervalRateLimiter.
> h3. Original Proposal: PartialIntervalRateLimiter as a Solution
> I've implemented a RateLimiter which allows for partial chunks of the overall 
> interval to be refilled. By default these chunks are 10% (or 100ms of a 1s 
> interval). I've deployed this to a test cluster at my day job and have seen 
> this really help our ability to fully subscribe to a quota limit without 
> executing superfluous retries. See the other attached chart which shows a 
> cluster undergoing a rolling restart from using FixedIntervalRateLimiter to 
> my new PartialIntervalRateLimiter and how it is then able to fully subscribe 
> to its allotted 25MB/sec/machine read IO quota.
> h3. Updated Proposal: Improving FixedIntervalRateLimiter
> Rather than implement a new rate limiter, we can make a lower touch change 
> which just adds support for a refill interval that is less than the time unit 
> on a FixedIntervalRateLimiter. This can be a no-op change for those who have 
> not opted into the feature by having the refill interval default to the time 
> unit. For clarity, see [my branch 
> here|https://github.com/apache/hbase/compare/master...HubSpot:hbase:HBASE-28453],
> which I will PR soon.

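To make the quoted fixed-interval behavior and the updated proposal concrete, here is a minimal Java sketch. It is not HBase's FixedIntervalRateLimiter: the class name SketchFixedIntervalLimiter, its constructor arguments, and the explicit nowMs clock parameter are all hypothetical, chosen only for illustration. When the refill interval equals the time unit it behaves like the classic fixed limiter; when the refill interval is shorter, a proportional slice of the limit becomes available at each refill.

```java
/** Hypothetical sketch for illustration; not HBase's FixedIntervalRateLimiter. */
final class SketchFixedIntervalLimiter {
    private final long limit;            // resources granted per full time unit
    private final long timeUnitMs;       // e.g. 1000 for a per-second quota
    private final long refillIntervalMs; // how often a slice of the limit is refilled
    private long available;              // resources currently available
    private long lastRefillMs;           // timestamp of the last applied refill boundary

    SketchFixedIntervalLimiter(long limit, long timeUnitMs, long refillIntervalMs, long nowMs) {
        if (limit <= 0 || timeUnitMs <= 0) {
            throw new IllegalArgumentException("limit and timeUnitMs must be positive");
        }
        if (refillIntervalMs <= 0 || refillIntervalMs > timeUnitMs) {
            throw new IllegalArgumentException("refillIntervalMs must be in (0, timeUnitMs]");
        }
        this.limit = limit;
        this.timeUnitMs = timeUnitMs;
        this.refillIntervalMs = refillIntervalMs;
        this.available = limit;
        this.lastRefillMs = nowMs;
    }

    /** Refill interval defaults to the time unit, i.e. the classic fixed behavior. */
    SketchFixedIntervalLimiter(long limit, long timeUnitMs, long nowMs) {
        this(limit, timeUnitMs, timeUnitMs, nowMs);
    }

    /** Resources added at each refill boundary (at least 1 so progress is possible). */
    private long perInterval() {
        return Math.max(1L, limit * refillIntervalMs / timeUnitMs);
    }

    /** Applies every whole refill interval that has elapsed, capped at the limit. */
    private void refill(long nowMs) {
        long elapsedIntervals = (nowMs - lastRefillMs) / refillIntervalMs;
        if (elapsedIntervals <= 0) {
            return;
        }
        available = Math.min(limit, available + elapsedIntervals * perInterval());
        lastRefillMs += elapsedIntervals * refillIntervalMs;
    }

    /** Consumes amount if it is available now; returns false otherwise. */
    boolean tryAcquire(long amount, long nowMs) {
        refill(nowMs);
        if (amount > available) {
            return false;
        }
        available -= amount;
        return true;
    }

    /** Milliseconds until enough refill boundaries have passed to cover amount. */
    long waitIntervalMs(long amount, long nowMs) {
        refill(nowMs);
        if (amount <= available) {
            return 0;
        }
        long deficit = amount - available;
        long intervalsNeeded = (deficit + perInterval() - 1) / perInterval(); // ceiling
        return lastRefillMs + intervalsNeeded * refillIntervalMs - nowMs;
    }
}
```

With a limit of 10 per 1000ms, draining all 10 resources at t=1ms and then calling waitIntervalMs(1, 1) yields 999 when the refill interval defaults to the time unit, and 99 when the refill interval is 100ms.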

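The average-interval wait computation from the quoted Background section, and the reason its waits become tiny under large limits, can be sketched in the same spirit. This is an illustrative formula under stated assumptions, not HBase's AverageIntervalRateLimiter code: the class AverageIntervalWaitSketch and the method waitMs are hypothetical, resources are assumed to accrue linearly at limit/timeUnit, and 100MB and 64KB are taken as binary units.

```java
/** Hypothetical sketch for illustration; not HBase's AverageIntervalRateLimiter. */
final class AverageIntervalWaitSketch {
    private AverageIntervalWaitSketch() {}

    /**
     * Estimates how long a caller must wait for amount resources, assuming
     * resources accrue linearly at limit per timeUnitMs.
     *
     * @param limit                resources granted per time unit
     * @param timeUnitMs           length of the time unit in milliseconds
     * @param available            resources available right now
     * @param amount               resources requested
     * @param elapsedSinceRefillMs time already elapsed but not yet credited
     */
    static long waitMs(long limit, long timeUnitMs, long available,
                       long amount, long elapsedSinceRefillMs) {
        if (limit <= 0 || timeUnitMs <= 0) {
            throw new IllegalArgumentException("limit and timeUnitMs must be positive");
        }
        if (amount <= available) {
            return 0;
        }
        long deficit = amount - available;
        // Time needed to accrue the deficit, rounded up to a whole millisecond.
        long msToAccrue = (deficit * timeUnitMs + limit - 1) / limit;
        return Math.max(0L, msToAccrue - elapsedSinceRefillMs);
    }
}
```

For the 10 reads/sec example, waitMs(10, 1000, 0, 1, 1) gives 99. For a 100MB/sec limit and a single 64KB block, waitMs(104857600L, 1000, 0, 65536, 0) gives 1, so a client that honors the wait interval retries almost immediately, which is consistent with the throttle/retry loop described in the issue.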
