[ 
https://issues.apache.org/jira/browse/HDFS-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502685#comment-13502685
 ] 

Junping Du commented on HDFS-3495:
----------------------------------

Hi Nicholas,
   Thanks for your review and your great comments! Please see my reply below:
1.It seems that we are not going to subclass Balancer. If it is the case, we 
should not change the fields/methods to protected.
[JP] Sure. I remove all protected method in Balancer. The exception could be on 
unit test as it needs a subclass of MiniDFSCluster 
(MiniDFSClusterWithNodeGroup) which have different configuration in 
initialization.
2. There are a few places initializing a NetworkTopology instance from conf. 
How about adding a new getInstance(conf) method to NetworkTopology so that 
Balancer, DatanodeManager and some tests could call it? 
[JP] Sure. Updated.
3. The one line method areDataNodesOnSameNodeGroup(..) seems unnecessary. How 
about using cluster.isOnSameNodeGroup(..) directly?
[JP] Agree. it could be better. Updated.
4.If we have an addTo(..) method in PendingBlockMove, chooseProxySource() can 
be simplified as below: 
[JP] It is awesome! It makes code much simplified and clear. Thanks!
5. The addTo method could also be used in Source.chooseNextBlockToMove().
[JP] It looks slight different case there. If we use addTo() to replace 
original code, then we will lose "pendingBlock.source = this;" and add " 
proxySource = target" which is different with original logic?
6.Let's combine chooseNodes() and chooseNodesForCustomFaultDomains()?
[JP] Sure. Two methods are combined.
7. chooseTargetOnSameNodeGroup and chooseSourceOnSameNodeGroup look the same. 
Similarly, chooseTargetsOnSameNodeGroup and chooseSourcesOnSameNodeGroup looks 
the same. I spent sometime to combine them. See if you think it is good. 
[JP] It is super! I put it in new changes and the method of 
selectCandidateOnSameNodeGroup can be used by other code in Balancer (I rename 
to matchSourceWithTargetToMove as it is more general).
I will update patch soon.
                
> Update Balancer to support new NetworkTopology with NodeGroup
> -------------------------------------------------------------
>
>                 Key: HDFS-3495
>                 URL: https://issues.apache.org/jira/browse/HDFS-3495
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 1.1.0, 2.0.2-alpha
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: HADOOP-8473-Balancer-NodeGroup-aware.patch, 
> HDFS-3495-v2.patch, HDFS-3495-v3.patch
>
>
> Since the Balancer is a Hadoop Tool, it was updated to be directly aware of 
> four-layer hierarchy instead of creating an alternative Balancer 
> implementation. To accommodate extensibility, a new protected method, 
> doChooseNodesForCustomFaultDomain is now called from the existing chooseNodes 
> method so that a subclass of the Balancer could customize the balancer 
> algotirhm for other failure and locality topologies. An alternative option is 
> to encapsulate the algorithm used for the four-layer hierarchy into a 
> collaborating strategy class.
> The key changes introduced to support a four-layer hierarchy were to override 
> the algorithm of choosing <source, target> pairs for balancing. Unit tests 
> were created to test the new algorithm.
> The algorithm now makes sure to choose the target and source node on the same 
> node group for balancing as the first priority. Then the overall balancing 
> policy is: first doing balancing between nodes within the same nodegroup then 
> the same rack and off rack at last. Also, we need to check no duplicated 
> replicas live in the same node group after balancing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to