[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460131#comment-13460131 ]
Elliott Clark commented on HBASE-4191: -------------------------------------- It seems like the stochastic load balancer gives HBase the locality awareness when balancing. > hbase load balancer needs locality awareness > -------------------------------------------- > > Key: HBASE-4191 > URL: https://issues.apache.org/jira/browse/HBASE-4191 > Project: HBase > Issue Type: New Feature > Components: Balancer > Reporter: Ted Yu > Assignee: Liyin Tang > > Previously, HBASE-4114 implements the metrics for HFile HDFS block locality, > which provides the HFile level locality information. > But in order to work with load balancer and region assignment, we need the > region level locality information. > Let's define the region locality information first, which is almost the same > as HFile locality index. > HRegion locality index (HRegion A, RegionServer B) = > (Total number of HDFS blocks that can be retrieved locally by the > RegionServer B for the HRegion A) / ( Total number of the HDFS blocks for the > Region A) > So the HRegion locality index tells us that how much locality we can get if > the HMaster assign the HRegion A to the RegionServer B. > So there will be 2 steps involved to assign regions based on the locality. > 1) During the cluster start up time, the master will scan the hdfs to > calculate the "HRegion locality index" for each pair of HRegion and Region > Server. It is pretty expensive to scan the dfs. So we only needs to do this > once during the start up time. > 2) During the cluster run time, each region server will update the "HRegion > locality index" as metrics periodically as HBASE-4114 did. The Region Server > can expose them to the Master through ZK, meta table, or just RPC messages. > Based on the "HRegion locality index", the assignment manager in the master > would have a global knowledge about the region locality distribution and can > run the MIN COST MAXIMUM FLOW solver to reach the global optimization. > Let's construct the graph first: > [Graph] > Imaging there is a bipartite graph and the left side is the set of regions > and the right side is the set of region servers. > There is a source node which links itself to each node in the region set. > There is a sink node which is linked from each node in the region server set. > [Capacity] > The capacity between the source node and region nodes is 1. > And the capacity between the region nodes and region server nodes is also 1. > (The purpose is each region can ONLY be assigned to one region server at one > time) > The capacity between the region server nodes and sink node are the avg number > of regions which should be assigned each region server. > (The purpose is balance the load for each region server) > [Cost] > The cost between each region and region server is the opposite of locality > index, which means the higher locality is, if region A is assigned to region > server B, the lower cost it is. > The cost function could be more sophisticated when we put more metrics into > account. > So after running the min-cost max flow solver, the master could assign the > regions based on the global locality optimization. > Also the master should share this global view to secondary master in case the > master fail over happens. > In addition, the HBASE-4491 (Locality Checker) is the tool, which is based on > the same metrics, to proactively to scan dfs to calculate the global locality > information in the cluster. It will help us to verify data locality > information during the run time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira