[ 
https://issues.apache.org/jira/browse/HDFS-7833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335228#comment-14335228
 ] 

Chris Nauroth commented on HDFS-7833:
-------------------------------------

This repeats a comment I made on HDFS-7830.  Thank you to 
[~eddyxu] for volunteering to take assignment of the issue.

Another potential problem that I've noticed in the DataNode reconfiguration 
code is that it never recalculates {{FsDatasetImpl#validVolsRequired}}. This is 
a final variable calculated as (# volumes configured) - (# volume failures 
tolerated):
{code}
    this.validVolsRequired = volsConfigured - volFailuresTolerated;
{code}
If this variable is not updated when the DataNode is reconfigured, the failure 
threshold goes stale and can produce unexpected behavior. For example:
# DataNode starts running with 6 volumes (all healthy) and 
{{dfs.datanode.failed.volumes.tolerated}} set to 2.
# {{FsDatasetImpl#validVolsRequired}} is set to 6 - 2 = 4.
# DataNode is reconfigured to run with 8 volumes (all still healthy).
# Now 3 volumes fail. Since that exceeds the 2 failures tolerated, the admin 
would expect the DataNode to abort, but there are 8 - 3 = 5 good volumes left, 
and {{FsDatasetImpl#validVolsRequired}} is still 4 (not 8 - 2 = 6), so 
{{FsDatasetImpl#hasEnoughResource}} returns true.
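
The scenario above can be reproduced with a small standalone model. This is only a sketch and not the actual {{FsDatasetImpl}} code: the class names {{VolumeToleranceSketch}}, {{StaleCheck}} and {{RecalculatingCheck}}, and the {{reconfigure}} method, are hypothetical, and the check is assumed to compare the healthy volume count against the threshold. It contrasts a threshold fixed at startup with one recomputed from the current volume count, which is one possible direction rather than a proposed patch.

```java
/** Hypothetical model of the volume-failure check; not the real FsDatasetImpl. */
public class VolumeToleranceSketch {

    /** Models the reported behavior: the threshold is computed once and never updated. */
    static final class StaleCheck {
        private final int validVolsRequired;

        StaleCheck(int volsConfigured, int volFailuresTolerated) {
            this.validVolsRequired = volsConfigured - volFailuresTolerated;
        }

        boolean hasEnoughResource(int healthyVolumes) {
            return healthyVolumes >= validVolsRequired;
        }
    }

    /** Assumed alternative: derive the threshold from the current volume count. */
    static final class RecalculatingCheck {
        private final int volFailuresTolerated;
        private int volsConfigured;

        RecalculatingCheck(int volsConfigured, int volFailuresTolerated) {
            this.volsConfigured = volsConfigured;
            this.volFailuresTolerated = volFailuresTolerated;
        }

        /** Hypothetical hook called when reconfiguration changes the volume count. */
        void reconfigure(int newVolsConfigured) {
            this.volsConfigured = newVolsConfigured;
        }

        boolean hasEnoughResource(int healthyVolumes) {
            return healthyVolumes >= volsConfigured - volFailuresTolerated;
        }
    }

    public static void main(String[] args) {
        // Steps 1-2: start with 6 volumes and 2 tolerated failures.
        StaleCheck stale = new StaleCheck(6, 2);
        RecalculatingCheck fresh = new RecalculatingCheck(6, 2);

        // Step 3: reconfigure to 8 volumes; only the second model notices.
        fresh.reconfigure(8);

        // Step 4: 3 volumes fail, leaving 5 healthy.
        int healthy = 8 - 3;
        System.out.println("stale threshold:        " + stale.hasEnoughResource(healthy));
        System.out.println("recalculated threshold: " + fresh.hasEnoughResource(healthy));
    }
}
```

With 5 healthy volumes, the stale model compares against 4 and keeps running, while the recalculating model compares against 8 - 2 = 6 and reports insufficient resources.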



> DataNode reconfiguration does not recalculate valid volumes required, based 
> on configured failed volumes tolerated.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7833
>                 URL: https://issues.apache.org/jira/browse/HDFS-7833
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Chris Nauroth
>            Assignee: Lei (Eddy) Xu
>
> DataNode reconfiguration never recalculates 
> {{FsDatasetImpl#validVolsRequired}}.  This may cause incorrect behavior of 
> the {{dfs.datanode.failed.volumes.tolerated}} property if reconfiguration 
> causes the DataNode to run with a different total number of volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
