[ https://issues.apache.org/jira/browse/HBASE-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk resolved HBASE-28672.
----------------------------------
    Fix Version/s: 2.7.0
                   3.0.0-beta-2
                   2.6.1
       Resolution: Fixed

Pushed to branch-2.6+. Thanks [~rmdmattingly] for the contribution and to [~zhangduo] for build system quick-fixes.

[~rmdmattingly] should this also go back to 2.5? The patch did not apply cleanly; it looked like some interfaces aren't present there. Maybe a dependency needs to be backported first?

> Ensure large batches are not indefinitely blocked by quotas
> -----------------------------------------------------------
>
>                 Key: HBASE-28672
>                 URL: https://issues.apache.org/jira/browse/HBASE-28672
>             Project: HBase
>          Issue Type: Improvement
>          Components: Quotas
>    Affects Versions: 2.6.0
>            Reporter: Ray Mattingly
>            Assignee: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
> At my day job we are trying to implement default quotas for a variety of access patterns. We began by introducing a default read IO limit per-user, per-machine — this has been very successful in reducing hotspots, even on clusters with thousands of distinct users.
>
> While implementing a default writes/second throttle, I realized that doing so would put us in a precarious situation where large-enough batches may never succeed. If your batch size is greater than your TimeLimiter's max throughput, then you will always fail in the quota estimation stage. Meanwhile [IO estimates are more optimistic|https://github.com/apache/hbase/blob/bdb3f216e864e20eb2b09352707a751a5cf7460f/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/DefaultOperationQuota.java#L192-L193], deliberately, which can let large requests do targeted oversubscription of an IO quota:
>
> {code:java}
> // assume 1 block required for reads. this is probably a low estimate, which is okay
> readConsumed = numReads > 0 ? blockSizeBytes : 0;
> {code}
>
> This is okay because the Limiter's availability will go negative and force a longer backoff on subsequent requests. I believe this is preferable UX compared to a doomed throttling loop.
>
> In my opinion, we should do something similar in batch request estimation, by estimating a batch request's workload at {{Math.min(batchSize, limiterMaxThroughput)}} rather than simply {{batchSize}}.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
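The proposed clamp can be sketched in a few lines. This is a minimal illustration of the idea, not HBase's actual quota code: the class and method names below are hypothetical, and the real `DefaultOperationQuota` involves more state than this shows.

```java
// Hypothetical sketch: estimate a batch's workload as
// min(batchSize, limiterMaxThroughput) so that a batch larger than the
// limiter's per-interval ceiling is still admitted once. The limiter's
// availability then goes negative, forcing a longer backoff on subsequent
// requests instead of rejecting the oversized batch forever.
public class BatchQuotaEstimate {

  /** Illustrative estimator for the units charged to a batch request. */
  static long estimateBatchWorkload(long batchSize, long limiterMaxThroughput) {
    return Math.min(batchSize, limiterMaxThroughput);
  }

  public static void main(String[] args) {
    // A 10,000-op batch against a 1,000 ops/interval limiter is charged
    // 1,000 units (admitted, with debt) rather than failing estimation.
    System.out.println(estimateBatchWorkload(10_000, 1_000)); // 1000
    // Batches under the ceiling are charged at face value.
    System.out.println(estimateBatchWorkload(50, 1_000)); // 50
  }
}
```

This mirrors the existing read-side behavior quoted above, where a deliberately low one-block estimate lets a large read through and the negative availability does the throttling afterward.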