cuijianwei created HBASE-12829:
----------------------------------
Summary: Request count in RegionLoad may not accurate to compute
the region load cost
Key: HBASE-12829
URL: https://issues.apache.org/jira/browse/HBASE-12829
Project: HBase
Issue Type: Improvement
Components: Balancer
Affects Versions: 0.99.2
Reporter: cuijianwei
Priority: Minor
StochasticLoadBalancer#RequestCostFunction (ReadRequestCostFunction and
WriteRequestCostFunction) computes the load cost for a region from a number of
remembered region loads. Each region load records the total read/write request
count at report time, accumulated since the region was opened. However, the
request count is reset when the region moves, so the newly reported count no
longer represents the total number of requests. For example, if a region has
high write throughput, the WriteRequest count in its region load will be very
large after the region has been online for a long time. If the region then
moves, the new WriteRequest count will be much smaller, so the region
contributes much less to the cost of the region server hosting it. We may need
to take the region open time into account to get a more accurate region load.
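One way to picture the open-time idea is to turn the raw counter into a rate. The following is only a sketch: the class name, the method, and the assumption that the region's open time is available next to the request count are all hypothetical and are not part of the existing RegionLoad API.
{code}
// Hypothetical sketch: normalize a request counter by how long the region
// has been open on its current server, so a freshly moved region and a
// long-lived one with the same throughput report the same value.
class OpenTimeNormalizedLoad {

  /** Requests per second since the region was opened (assumed inputs, in ms). */
  public static double requestsPerSecond(long requestCount, long openTimeMs,
      long reportTimeMs) {
    long elapsedMs = reportTimeMs - openTimeMs;
    if (elapsedMs <= 0) {
      return 0.0; // no elapsed time yet, so no meaningful rate
    }
    return requestCount * 1000.0 / elapsedMs;
  }

  public static void main(String[] args) {
    // Online for 10 hours with 36,000,000 writes.
    double longLived = requestsPerSecond(36_000_000L, 0L, 36_000_000L);
    // Just moved: online for 1 minute with 60,000 writes.
    double justMoved = requestsPerSecond(60_000L, 0L, 60_000L);
    System.out.println(longLived + " vs " + justMoved);
  }
}
{code}
With raw counters the two regions above differ by a factor of 600, while the normalized values are equal.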
Alternatively, how about using the read/write request count for each time slot
instead of the total request count? With the total count, older read/write
throughput contributes more to the cost computed by
CostFromRegionLoadFunction#getRegionLoadCost:
{code}
protected double getRegionLoadCost(Collection<RegionLoad> regionLoadList) {
  double cost = 0;
  for (RegionLoad rl : regionLoadList) {
    double toAdd = getCostFromRl(rl);
    if (cost == 0) {
      cost = toAdd;
    } else {
      cost = (.5 * cost) + (.5 * toAdd);
    }
  }
  return cost;
}
{code}
For example, assume the balancer remembers three loads for a region, reported
at times t1, t2, t3 (t1 < t2 < t3), and that the write request counts are w1,
w2, w3 for the time slots [0, t1), [t1, t2), [t2, t3) respectively. The
WriteRequest values in the region loads at t1, t2, t3 will then be w1, w1 + w2,
w1 + w2 + w3, and the WriteRequest cost will be:
{code}
0.5 * (w1 + w2 + w3) + 0.25 * (w1 + w2) + 0.25 * w1 = w1 + 0.75 * w2 + 0.5 * w3
{code}
Here w1 contributes more to the cost than w2 and w3. Intuitively, however,
recent read/write throughput should represent the current load of the region
better than older throughput. Therefore, how about using w1, w2 and w3
directly in the computation? The cost would then become:
{code}
0.25 * w1 + 0.25 * w2 + 0.5 * w3
{code}
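The two weightings can be compared with a small standalone sketch. The class and method names are hypothetical. foldCost mirrors the averaging loop of getRegionLoadCost on plain numbers, and perSlot is an assumed helper that derives per-slot counts from the cumulative counters, treating a decrease as a counter reset after a region move. As in the example, the first slot is assumed to start at 0.
{code}
import java.util.Arrays;

class RequestCostSketch {

  /** Same folding as getRegionLoadCost: each newer sample halves older weights. */
  public static double foldCost(double[] samples) {
    double cost = 0;
    for (double s : samples) {
      cost = (cost == 0) ? s : (.5 * cost) + (.5 * s);
    }
    return cost;
  }

  /** Converts cumulative counters into per-slot counts (assumed helper). */
  public static double[] perSlot(long[] cumulative) {
    double[] slots = new double[cumulative.length];
    long prev = 0;
    for (int i = 0; i < cumulative.length; i++) {
      long cur = cumulative[i];
      // A drop means the counter was reset, so cur counts from the re-open.
      slots[i] = (cur >= prev) ? cur - prev : cur;
      prev = cur;
    }
    return slots;
  }

  public static void main(String[] args) {
    // w1 = 100, w2 = 200, w3 = 300, reported as cumulative totals.
    long[] cumulative = {100, 300, 600};
    double[] totals = Arrays.stream(cumulative).asDoubleStream().toArray();
    // 1.0 * 100 + 0.75 * 200 + 0.5 * 300 = 400.0
    System.out.println(foldCost(totals));
    // 0.25 * 100 + 0.25 * 200 + 0.5 * 300 = 225.0
    System.out.println(foldCost(perSlot(cumulative)));
  }
}
{code}
With per-slot counts, the most recent slot carries the largest weight, and a reset counter no longer makes a busy region look idle.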