Clara Xiong created HBASE-26178:
-----------------------------------

             Summary: Improve data structure for BalanceClusterState to improve computation speed for large cluster
                 Key: HBASE-26178
                 URL: https://issues.apache.org/jira/browse/HBASE-26178
             Project: HBase
          Issue Type: Bug
            Reporter: Clara Xiong

With ~800 nodes and ~500 regions per node on our large production cluster, the balancer cannot complete within hours, even when we add just 2% more servers after maintenance. Unit tests with larger numbers of regions are also taking longer and longer with recent changes to the balancer, as is evident from the time-limit increases included in recent PRs.

It is time to replace some of the data structures with ones that have better time complexity, including:

    int[][] regionsPerServer; // serverIndex -> region list
    int[][] regionsPerHost; // hostIndex -> list of regions
    int[][] regionsPerRack; // rackIndex -> region list

    // serverIndex -> sorted list of regions by primary region index
    ArrayList<HashSet<Integer>> primariesOfRegionsPerServer;

    // hostIndex -> sorted list of regions by primary region index
    int[][] primariesOfRegionsPerHost;

    // rackIndex -> sorted list of regions by primary region index
    int[][] primariesOfRegionsPerRack;

These need O(n) time to update for every move tested in an iteration (n = number of regions per server/host/rack).
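
To illustrate the kind of change being asked for, here is a minimal sketch of one possible replacement for an int[][] region list. It is an assumption for discussion, not the change proposed or adopted in this issue. The class name RegionsPerServerIndex and all of its methods are hypothetical. It keeps each server's regions in an unordered array and records each region's slot, so a move is a swap-remove plus an append (amortized O(1)) instead of an O(n) array copy.

```java
import java.util.Arrays;

/** Hypothetical sketch: serverIndex -> region list with amortized O(1) moves. */
final class RegionsPerServerIndex {
  private final int[][] regions; // serverIndex -> region indices, unordered
  private final int[] counts;    // serverIndex -> number of valid entries
  private final int[] serverOf;  // regionIndex -> serverIndex, or -1 if unassigned
  private final int[] slotOf;    // regionIndex -> position in regions[serverOf[region]]

  RegionsPerServerIndex(int numServers, int numRegions) {
    regions = new int[numServers][4];
    counts = new int[numServers];
    serverOf = new int[numRegions];
    slotOf = new int[numRegions];
    Arrays.fill(serverOf, -1);
  }

  /** Appends the region to the server's list, doubling capacity when full. */
  void add(int region, int server) {
    if (counts[server] == regions[server].length) {
      regions[server] = Arrays.copyOf(regions[server], counts[server] * 2);
    }
    slotOf[region] = counts[server];
    serverOf[region] = server;
    regions[server][counts[server]++] = region;
  }

  /** Removes the region by moving the last entry into its slot (order is not kept). */
  void remove(int region) {
    int server = serverOf[region];
    if (server < 0) {
      return; // region is not assigned anywhere
    }
    int slot = slotOf[region];
    int last = regions[server][--counts[server]];
    regions[server][slot] = last;
    slotOf[last] = slot;
    serverOf[region] = -1;
  }

  /** Moves a region to another server without scanning or copying either list. */
  void move(int region, int toServer) {
    remove(region);
    add(region, toServer);
  }

  int count(int server) {
    return counts[server];
  }

  int serverOf(int region) {
    return serverOf[region];
  }

  /** Returns a snapshot copy of the server's regions; this read is O(n). */
  int[] regionsOn(int server) {
    return Arrays.copyOf(regions[server], counts[server]);
  }
}
```

The trade-off is that the per-server list is no longer ordered, so the lists described above as sorted by primary region index would need a different structure (for example, a per-primary counter) if the same approach were applied to them.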