[ https://issues.apache.org/jira/browse/HBASE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830789#comment-15830789 ]
Ted Yu commented on HBASE-17462:
--------------------------------

{code}
+    for (RegionLoad rl : regionLoadList) {
+      double current = getCostFromRl(rl);
+      if (previous != null) {
+        cost += current - previous;
+      }
+      previous = current;
{code}
(Through debug logging) what is the length of regionLoadList for the read / write requests?
The average length would give us some idea of how many samples are taken for the sliding window.

Can you share the (improved) performance you observed when you tested the change on your cluster?

Thanks

> Investigate using sliding window for read/write request costs in StochasticLoadBalancer
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-17462
>                 URL: https://issues.apache.org/jira/browse/HBASE-17462
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Tim Brown
>              Labels: patch
>         Attachments: HBASE-17642.patch
>
>
> In the thread, http://search-hadoop.com/m/HBase/YGbbyUZKXWALkX1, Timothy was
> asking whether the read/write request costs in StochasticLoadBalancer should
> be calculated as rates.
> This makes sense since the read / write load on a region server tends to
> fluctuate over time. Using a sliding window would reflect the more recent
> trend in read / write load.
> Some factors to consider:
> * The data structure used by StochasticLoadBalancer should be concise. The
> number of regions in a cluster can be expected to approach 1 million. We
> cannot afford to store a long history of read / write requests in the master.
> * Efficiency of cost calculation should be high. The balancer goes through
> many cost functions, and each cost function is expected to return quickly.
> Otherwise we would not come up with proper region movement plan(s) in time.
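
For readers following the question about window length, here is a minimal sketch of the idea behind the quoted loop. It is not the code from the attached patch: the class name SlidingWindowCost, the maxSamples parameter, and the use of plain long samples in place of RegionLoad are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical illustration only; not the HBASE-17462 patch.
 * Keeps a bounded window of request-count samples for one region
 * and sums the differences between consecutive samples.
 */
class SlidingWindowCost {
  // Assumed window size; the issue does not state how many samples are kept.
  private final int maxSamples;
  private final Deque<Long> samples = new ArrayDeque<>();

  SlidingWindowCost(int maxSamples) {
    if (maxSamples < 2) {
      throw new IllegalArgumentException("need at least 2 samples to form a delta");
    }
    this.maxSamples = maxSamples;
  }

  /** Records the latest sample, evicting the oldest one when the window is full. */
  void addSample(long requestCount) {
    if (samples.size() == maxSamples) {
      samples.removeFirst();
    }
    samples.addLast(requestCount);
  }

  /** Sums current - previous over the window, mirroring the loop in the excerpt. */
  double cost() {
    double cost = 0;
    Long previous = null;
    for (long current : samples) {
      if (previous != null) {
        cost += current - previous;
      }
      previous = current;
    }
    return cost;
  }

  /** Number of samples currently held, i.e. the "length" asked about above. */
  int size() {
    return samples.size();
  }
}
```

Because the consecutive differences telescope, the sum equals the newest sample minus the oldest sample in the window. The window length therefore decides how far back the rate looks, which is why the average length of regionLoadList matters for both the memory footprint and the responsiveness of the cost.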