[
https://issues.apache.org/jira/browse/HBASE-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Dimiduk resolved HBASE-28672.
----------------------------------
Fix Version/s: 2.7.0
3.0.0-beta-2
2.6.1
Resolution: Fixed
Pushed to branch-2.6+. Thanks [~rmdmattingly] for the contribution and to
[~zhangduo] for build system quick-fixes.
[~rmdmattingly] should this also go back to 2.5? The patch did not apply
cleanly; it looked like some interfaces aren't present there. Maybe a
dependency needs to be backported first?
> Ensure large batches are not indefinitely blocked by quotas
> -----------------------------------------------------------
>
> Key: HBASE-28672
> URL: https://issues.apache.org/jira/browse/HBASE-28672
> Project: HBase
> Issue Type: Improvement
> Components: Quotas
> Affects Versions: 2.6.0
> Reporter: Ray Mattingly
> Assignee: Ray Mattingly
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> At my day job we are trying to implement default quotas for a variety of
> access patterns. We began by introducing a default read IO limit per-user,
> per-machine — this has been very successful in reducing hotspots, even on
> clusters with thousands of distinct users.
> While implementing a default writes/second throttle, I realized that doing so
> would put us in a precarious situation where large-enough batches may never
> succeed. If your batch size is greater than your TimeLimiter's max
> throughput, then you will always fail in the quota estimation stage.
> Meanwhile [IO estimates are more
> optimistic|https://github.com/apache/hbase/blob/bdb3f216e864e20eb2b09352707a751a5cf7460f/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/DefaultOperationQuota.java#L192-L193],
> deliberately, which can let large requests do targeted oversubscription of
> an IO quota:
>
> {code:java}
> // assume 1 block required for reads. this is probably a low estimate, which is okay
> readConsumed = numReads > 0 ? blockSizeBytes : 0;{code}
>
> This is okay because the Limiter's availability will go negative and force a
> longer backoff on subsequent requests. I believe this is preferable UX
> compared to a doomed throttling loop.
> In my opinion, we should do something similar in batch request estimation, by
> estimating a batch request's workload at {{Math.min(batchSize,
> limiterMaxThroughput)}} rather than simply {{batchSize}}.
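The proposed clamping can be sketched as follows (a minimal illustration; the class and method names here are hypothetical, not the actual HBase `DefaultOperationQuota` API):

```java
public class BatchEstimateSketch {
    // Hypothetical helper: estimate a batch's write cost, clamped to the
    // limiter's maximum possible throughput. Without the clamp, a batch
    // larger than the limiter's max throughput fails estimation forever;
    // with it, the batch can be admitted, the limiter's availability goes
    // negative, and subsequent requests are forced into a longer backoff.
    static long estimateWriteConsumed(long batchSize, long limiterMaxThroughput) {
        return Math.min(batchSize, limiterMaxThroughput);
    }

    public static void main(String[] args) {
        long maxThroughput = 1000L; // e.g. a 1000 writes/sec throttle
        // Small batch: estimated at its true size
        System.out.println(estimateWriteConsumed(500L, maxThroughput));  // 500
        // Oversized batch: clamped, so it is not indefinitely blocked
        System.out.println(estimateWriteConsumed(5000L, maxThroughput)); // 1000
    }
}
```

This mirrors the deliberately optimistic read-IO estimate quoted above: underestimating once is acceptable because the limiter settles the difference through backoff rather than rejecting the request forever.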
--
This message was sent by Atlassian Jira
(v8.20.10#820010)