[ 
https://issues.apache.org/jira/browse/HDFS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542446#comment-13542446
 ] 

Jing Zhao commented on HDFS-4351:
---------------------------------

Yeah, we should pass an updated numOfReplicas to the chooseTarget retry when a 
NotEnoughReplicasException is thrown.
In the current patch:

{noformat}
// Required since chooseRandom() is passed numOfReplicas by value,
// so it's not updated when returning a partial result
numOfReplicas = oldNumOfReplicas - results.size();
{noformat}
Since oldNumOfReplicas is set to the initial value of numOfReplicas, won't 
this calculation be wrong if the initial results list is not empty? Do we 
need to use "totalReplicasExpected - results.size()" instead?
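
To make the concern concrete, here is a minimal standalone sketch, not the actual BlockPlacementPolicyDefault code. The class ReplicaCountSketch, the simplified chooseRandom() signature, the "available" parameter, and the NotEnoughReplicas exception are all made up for illustration, and it assumes totalReplicasExpected is computed as numOfReplicas + results.size() before any node is chosen.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicaCountSketch {
    /** Hypothetical stand-in for NotEnoughReplicasException. */
    public static class NotEnoughReplicas extends Exception {}

    /**
     * Simplified stand-in for chooseRandom(): picks up to numOfReplicas nodes,
     * but only `available` nodes exist. numOfReplicas is a local copy, so the
     * decrements below are never seen by the caller.
     */
    public static void chooseRandom(int numOfReplicas, int available,
                                    List<String> results) throws NotEnoughReplicas {
        while (numOfReplicas > 0 && available > 0) {
            results.add("node-" + results.size());
            numOfReplicas--;
            available--;
        }
        if (numOfReplicas > 0) {
            throw new NotEnoughReplicas(); // partial result stays in results
        }
    }

    /**
     * Returns { oldNumOfReplicas - results.size(),
     *           totalReplicasExpected - results.size() }
     * after a chooseRandom() call that starts with alreadyChosen nodes
     * already present in results.
     */
    public static int[] remainingAfterPartialFailure(int alreadyChosen,
                                                     int numOfReplicas,
                                                     int available) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < alreadyChosen; i++) {
            results.add("node-" + i);
        }
        int oldNumOfReplicas = numOfReplicas;
        // Assumption: expected total = nodes still to pick + nodes already picked.
        int totalReplicasExpected = numOfReplicas + results.size();
        try {
            chooseRandom(numOfReplicas, available, results);
        } catch (NotEnoughReplicas e) {
            // Fall through: results holds whatever was chosen before the failure.
        }
        return new int[] {
            oldNumOfReplicas - results.size(),
            totalReplicasExpected - results.size()
        };
    }

    public static void main(String[] args) {
        // One node already chosen, two more wanted, only one available.
        int[] r = remainingAfterPartialFailure(1, 2, 1);
        System.out.println("oldNumOfReplicas - results.size()      = " + r[0]);
        System.out.println("totalReplicasExpected - results.size() = " + r[1]);
    }
}
```

With one node already in results, two more requested, and only one available, the first expression gives 0 while one replica is still missing; the second gives 1. When results starts empty the two expressions agree, which may be why the existing test does not catch it.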

Also, for the new test case, I think it would be better to reset 
AvoidStaleDataNodesForWrite to false at the end, so that it does not affect 
any test cases that are later added after 
testChooseTargetWithMoreThanAvailableNodesWithStaleness.
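
One way to do the reset is a try/finally around the test body, so the flag is restored even if an assertion fails. This is only a sketch: ResetFlagSketch and its static field are hypothetical stand-ins for whatever shared setting the real test toggles, not the actual HDFS API.

```java
public class ResetFlagSketch {
    // Hypothetical stand-in for the shared stale-node setting used by the test.
    public static boolean avoidStaleDataNodesForWrite = false;

    /** Runs testBody with the flag enabled and always restores it to false. */
    public static void runWithStaleAvoidance(Runnable testBody) {
        avoidStaleDataNodesForWrite = true;
        try {
            testBody.run();
        } finally {
            // Runs on both normal return and exception, so later tests
            // start from the default setting.
            avoidStaleDataNodesForWrite = false;
        }
    }

    public static void main(String[] args) {
        runWithStaleAvoidance(() ->
            System.out.println("inside test body: " + avoidStaleDataNodesForWrite));
        System.out.println("after test body: " + avoidStaleDataNodesForWrite);
    }
}
```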
                
> Fix BlockPlacementPolicyDefault#chooseTarget when avoiding stale nodes
> ----------------------------------------------------------------------
>
>                 Key: HDFS-4351
>                 URL: https://issues.apache.org/jira/browse/HDFS-4351
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-4351-2.patch, hdfs-4351.patch
>
>
> There's a bug in {{BlockPlacementPolicyDefault#chooseTarget}} with stale node 
> avoidance enabled (HDFS-3912). If a NotEnoughReplicasException is thrown in 
> the call to {{chooseRandom()}}, {{numOfReplicas}} is not updated together 
> with the partial result in {{results}} since it is passed by value. The 
> retry call to {{chooseTarget}} then uses this incorrect value.
> This can be seen if you enable stale node detection for 
> {{TestReplicationPolicy#testChooseTargetWithMoreThanAvaiableNodes()}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
