[ 
https://issues.apache.org/jira/browse/HDFS-4898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736268#comment-13736268
 ] 

Hadoop QA commented on HDFS-4898:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12597086/h4898_20130809.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4799//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4799//console

This message is automatically generated.
                
> BlockPlacementPolicyWithNodeGroup.chooseRemoteRack() fails to properly 
> fallback to local rack
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4898
>                 URL: https://issues.apache.org/jira/browse/HDFS-4898
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.2.0, 2.0.4-alpha
>            Reporter: Eric Sirianni
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Minor
>         Attachments: h4898_20130809.patch
>
>
> As currently implemented, {{BlockPlacementPolicyWithNodeGroup}} does not 
> properly fall back to the local rack when no nodes are available in remote 
> racks, resulting in an improper {{NotEnoughReplicasException}}.
> {code:title=BlockPlacementPolicyWithNodeGroup.java}
>   @Override
>   protected void chooseRemoteRack(int numOfReplicas,
>       DatanodeDescriptor localMachine, HashMap<Node, Node> excludedNodes,
>       long blocksize, int maxReplicasPerRack,
>       List<DatanodeDescriptor> results,
>       boolean avoidStaleNodes) throws NotEnoughReplicasException {
>     int oldNumOfReplicas = results.size();
>     // randomly choose one node from remote racks
>     try {
>       chooseRandom(
>           numOfReplicas,
>           "~" + NetworkTopology.getFirstHalf(localMachine.getNetworkLocation()),
>           excludedNodes, blocksize, maxReplicasPerRack, results,
>           avoidStaleNodes);
>     } catch (NotEnoughReplicasException e) {
>       chooseRandom(numOfReplicas - (results.size() - oldNumOfReplicas),
>           localMachine.getNetworkLocation(), excludedNodes, blocksize,
>           maxReplicasPerRack, results, avoidStaleNodes);
>     }
>   }
> {code}
> As currently coded, the {{chooseRandom()}} call in the {{catch}} block can 
> never succeed, because the set of nodes under the node path passed in (e.g. 
> {{/rack1/nodegroup1}}) is entirely contained in the set of excluded nodes 
> (both are the set of nodes in the same nodegroup as the node chosen for the 
> first replica).
> The bug is that the fallback {{chooseRandom()}} call in the {{catch}} block 
> should pass the _complement_ of the node path used in the initial 
> {{chooseRandom()}} call in the {{try}} block (e.g. {{/rack1}}), namely:
> {code}
> NetworkTopology.getFirstHalf(localMachine.getNetworkLocation())
> {code}
> This will yield the proper fallback behavior of choosing a random node from 
> _within the same rack_, while still excluding those nodes _in the same 
> nodegroup_.
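
The scope argument described above can be illustrated with a small standalone sketch. It is not Hadoop code: the `Node` class, `candidates()`, `firstHalf()` and the four-node single-rack cluster are hypothetical stand-ins, assumed here only to mimic how a scope path (with a leading `~` meaning "outside this scope") interacts with the excluded set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NodeGroupFallbackSketch {
    /** Hypothetical stand-in for a datanode: a name plus a "/rack/nodegroup" location. */
    static final class Node {
        final String name;
        final String location;

        Node(String name, String location) {
            this.name = name;
            this.location = location;
        }
    }

    /** Simplified stand-in for NetworkTopology.getFirstHalf(): "/rack1/nodegroup1" -> "/rack1". */
    static String firstHalf(String location) {
        return location.substring(0, location.lastIndexOf('/'));
    }

    /**
     * Names of the nodes inside the scope that are not excluded.
     * A leading "~" selects the nodes outside the scope instead (assumed convention).
     */
    static List<String> candidates(List<Node> cluster, String scope, Set<String> excluded) {
        boolean negate = scope.startsWith("~");
        String path = negate ? scope.substring(1) : scope;
        List<String> out = new ArrayList<>();
        for (Node n : cluster) {
            boolean inScope = n.location.equals(path) || n.location.startsWith(path + "/");
            if (inScope != negate && !excluded.contains(n.name)) {
                out.add(n.name);
            }
        }
        return out;
    }

    /** One rack with two nodegroups of two nodes each (made-up topology). */
    static List<Node> sampleCluster() {
        return Arrays.asList(
            new Node("dn1", "/rack1/nodegroup1"),
            new Node("dn2", "/rack1/nodegroup1"),
            new Node("dn3", "/rack1/nodegroup2"),
            new Node("dn4", "/rack1/nodegroup2"));
    }

    /** First replica is on dn1, so its whole nodegroup is excluded. */
    static Set<String> sampleExcluded() {
        return new HashSet<>(Arrays.asList("dn1", "dn2"));
    }

    public static void main(String[] args) {
        List<Node> cluster = sampleCluster();
        Set<String> excluded = sampleExcluded();
        String local = "/rack1/nodegroup1";

        // Initial attempt: remote racks only. There are none in this cluster.
        System.out.println("remote racks:   " + candidates(cluster, "~" + firstHalf(local), excluded));
        // Fallback as currently coded: scope is the local nodegroup, which is fully excluded.
        System.out.println("buggy fallback: " + candidates(cluster, local, excluded));
        // Proposed fallback: scope is the local rack.
        System.out.println("fixed fallback: " + candidates(cluster, firstHalf(local), excluded));
    }
}
```

In this made-up cluster the first two candidate lists are empty and the third contains dn3 and dn4, which matches the description: the nodegroup scope can never yield a node, while the rack scope yields nodes from the other nodegroup.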

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
