Daryn Sharp created HDFS-7967:
---------------------------------

             Summary: Reduce the performance impact of the balancer
                 Key: HDFS-7967
                 URL: https://issues.apache.org/jira/browse/HDFS-7967
             Project: Hadoop HDFS
          Issue Type: Sub-task
    Affects Versions: 2.0.0-alpha
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp


The balancer needs to query for blocks to move from overly full DNs.  The block 
lookup is extremely inefficient.  An iterator of the node's blocks is created 
from the iterators of its storages' blocks.  A random number is then chosen, 
and that many blocks are skipped via the iterator.  Each skip requires costly 
scanning of triplets.
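
To make the cost concrete, here is a rough model of the skip-based lookup 
described above.  It is only a sketch, not the actual HDFS code: the class 
SkipScanSketch, its methods, and the use of plain Long block IDs are all 
hypothetical, and each iterator step here is a cheap list access, whereas in 
the namenode each step is said to require a triplets scan.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

/** Hypothetical model of skip-based block selection; not HDFS code. */
class SkipScanSketch {
    /** Counts every block the node-level iterator had to step over. */
    static long stepsTaken = 0;

    /** Chains the per-storage iterators into one node-level iterator. */
    static Iterator<Long> nodeIterator(List<List<Long>> storages) {
        final List<Iterator<Long>> its = new ArrayList<>();
        for (List<Long> s : storages) {
            its.add(s.iterator());
        }
        return new Iterator<Long>() {
            private int idx = 0;

            @Override
            public boolean hasNext() {
                // Advance past exhausted storages.
                while (idx < its.size() && !its.get(idx).hasNext()) {
                    idx++;
                }
                return idx < its.size();
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                stepsTaken++;
                return its.get(idx).next();
            }
        };
    }

    /** Skips a random number of blocks, then collects up to 'wanted' blocks. */
    static List<Long> pickBlocks(List<List<Long>> storages, int wanted, Random rnd) {
        int total = 0;
        for (List<Long> s : storages) {
            total += s.size();
        }
        Iterator<Long> it = nodeIterator(storages);
        int skip = (total == 0) ? 0 : rnd.nextInt(total);
        // Every skipped block is visited and thrown away: pure overhead.
        for (int i = 0; i < skip && it.hasNext(); i++) {
            it.next();
        }
        List<Long> picked = new ArrayList<>();
        while (picked.size() < wanted && it.hasNext()) {
            picked.add(it.next());
        }
        return picked;
    }
}
```

In this model stepsTaken grows by the random skip count plus the number of 
blocks returned, so on average a lookup walks about half of the node's blocks 
before returning anything.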

The current design also considers only node imbalances while ignoring 
imbalances within the node's storages.  A more efficient and intelligent 
design could select blocks round-robin from the storages based on remaining 
capacity, eliminating the costly skipping of blocks.
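
One possible reading of the round-robin idea is sketched below.  This is an 
assumption-laden illustration, not a proposed patch: RoundRobinSketch, Storage, 
and the choice to visit storages in order of least remaining capacity are all 
hypothetical, since the issue does not specify how remaining capacity should 
drive the selection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical round-robin block selection across storages; not HDFS code. */
class RoundRobinSketch {
    /** Stand-in for one storage of a DN: remaining bytes plus its block IDs. */
    static class Storage {
        final long remaining;
        final List<Long> blocks;

        Storage(long remaining, List<Long> blocks) {
            this.remaining = remaining;
            this.blocks = blocks;
        }
    }

    /** Takes one block per storage in turn, fullest storage first. */
    static List<Long> pickBlocks(List<Storage> storages, int wanted) {
        // Assumption: the storage with the least remaining capacity goes first.
        List<Storage> ordered = new ArrayList<>(storages);
        ordered.sort(Comparator.comparingLong((Storage s) -> s.remaining));

        Deque<Iterator<Long>> queue = new ArrayDeque<>();
        for (Storage s : ordered) {
            queue.add(s.blocks.iterator());
        }

        List<Long> picked = new ArrayList<>();
        while (picked.size() < wanted && !queue.isEmpty()) {
            Iterator<Long> it = queue.poll();
            if (it.hasNext()) {
                picked.add(it.next());
                queue.add(it); // back of the line for the next round
            }
            // Exhausted iterators are simply dropped from the rotation.
        }
        return picked;
    }
}
```

With a storage holding blocks 1, 2, 3 and 100 units remaining, and a second 
holding blocks 4, 5 and 10 units remaining, a request for four blocks returns 
4, 1, 5, 2.  No block is visited without being returned, and both storages 
contribute to the result.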




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
