[ https://issues.apache.org/jira/browse/HBASE-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815935#comment-16815935 ]

Biju Nair edited comment on HBASE-12829 at 4/12/19 3:39 AM:
------------------------------------------------------------

In the current version of SLB, [Read-writeRequestCostFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1465] extends [CostFromRegionLoadAsRateFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1436] which in turn uses the [average of the region requests stored for a period|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1443] to calculate cost which seems to address this issue. Can this be closed?

was (Author: gsbiju):
In the current version of SLB, [Read-writeRequestCostFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1465] extends [CostFromRegionLoadAsRateFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1436] which in turn uses the [average of the region requests stored for a period|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1443] which seems to address this issue. Can this be closed?

> Request count in RegionLoad may not accurate to compute the load cost for region
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-12829
>                 URL: https://issues.apache.org/jira/browse/HBASE-12829
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 0.99.2
>            Reporter: Jianwei Cui
>            Priority: Minor
>
> StochasticLoadBalancer#RequestCostFunction (ReadRequestCostFunction and
> WriteRequestCostFunction) computes the load cost for a region from a
> number of remembered region loads. Each region load records the total
> read/write request count since the region was opened, as of the report
> time. However, the request count is reset when the region moves, so the
> newly reported count no longer represents the total requests. For example,
> if a region has high write throughput, the WriteRequest count in its region
> load will be very large after it has been online for a long time. If the
> region then moves, the new WriteRequest count will be much smaller, so the
> region contributes much less to the cost of the region server hosting it.
> We may need to take the region open time into account to get a more
> accurate region load.
> As another approach, how about using the read/write request count for each
> time slot instead of the total request count?
> The total count makes older read/write request throughput contribute more
> to the cost computed by CostFromRegionLoadFunction#getRegionLoadCost:
> {code}
>   protected double getRegionLoadCost(Collection<RegionLoad> regionLoadList) {
>     double cost = 0;
>     for (RegionLoad rl : regionLoadList) {
>       double toAdd = getCostFromRl(rl);
>       if (cost == 0) {
>         cost = toAdd;
>       } else {
>         cost = (.5 * cost) + (.5 * toAdd);
>       }
>     }
>     return cost;
>   }
> {code}
> For example, assume the balancer remembers three loads for a region at
> times t1, t2, t3 (t1 < t2 < t3), and the write request counts are w1, w2,
> w3 for the time slots [0, t1), [t1, t2), [t2, t3) respectively. The
> WriteRequest values in the region loads at t1, t2, t3 are then w1,
> w1 + w2, and w1 + w2 + w3, and the WriteRequest cost is:
> {code}
>   0.5 * (w1 + w2 + w3) + 0.25 * (w1 + w2) + 0.25 * w1 = w1 + 0.75 * w2 + 0.5 * w3
> {code}
> So w1 contributes more to the cost than w2 and w3. Intuitively, however, I
> think the recent read/write throughput represents the current load of the
> region better than the older throughput. Therefore, how about using w1, w2
> and w3 directly in the computation? The cost then becomes:
> {code}
>   0.25 * w1 + 0.25 * w2 + 0.5 * w3
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
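
The arithmetic in the quoted description can be checked with a small standalone sketch. This is not HBase code: the class RegionLoadCostSketch and the helpers foldCost and toPerSlot are hypothetical names for illustration, plain doubles stand in for RegionLoad, and treating a drop in the cumulative counter as a reset is an assumption about how a region move would show up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; plain doubles stand in for RegionLoad values.
class RegionLoadCostSketch {

    // Mirrors the folding rule of the quoted getRegionLoadCost:
    // the first sample seeds the cost, each later one is averaged 50/50 with it.
    static double foldCost(List<Double> samples) {
        double cost = 0;
        for (double toAdd : samples) {
            if (cost == 0) {
                cost = toAdd;
            } else {
                cost = (.5 * cost) + (.5 * toAdd);
            }
        }
        return cost;
    }

    // Converts cumulative counters into per-slot counts.
    // Assumption: a counter lower than the previous one means the count was
    // reset (for example after a region move), so the new value is used as-is.
    static List<Double> toPerSlot(List<Double> cumulative) {
        List<Double> perSlot = new ArrayList<>();
        double previous = 0;
        for (double current : cumulative) {
            perSlot.add(current >= previous ? current - previous : current);
            previous = current;
        }
        return perSlot;
    }

    public static void main(String[] args) {
        double w1 = 100, w2 = 10, w3 = 10;
        List<Double> cumulative = List.of(w1, w1 + w2, w1 + w2 + w3);

        // Cumulative input: w1 + 0.75 * w2 + 0.5 * w3
        System.out.println("cumulative cost = " + foldCost(cumulative));
        // Per-slot input: 0.25 * w1 + 0.25 * w2 + 0.5 * w3
        System.out.println("per-slot cost   = " + foldCost(toPerSlot(cumulative)));
    }
}
```

With w1 = 100 and w2 = w3 = 10, the cumulative input is dominated by the old burst in w1, while the per-slot input gives the most recent slot the largest weight, which is the behaviour the description argues for.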