Jelle Smet created HADOOP-9091:
----------------------------------

             Summary: Allow daemon startup when at least 1 (or configurable) 
disk is in an OK state.
                 Key: HADOOP-9091
                 URL: https://issues.apache.org/jira/browse/HADOOP-9091
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs
    Affects Versions: 0.20.2
            Reporter: Jelle Smet


The example below uses datanode disk definitions, but the request should apply
to any configuration in which a list of disks is provided.

I have defined multiple local disks for a datanode:
<property>
<name>dfs.data.dir</name>
<value>/data/01/dfs/dn,/data/02/dfs/dn,/data/03/dfs/dn,/data/04/dfs/dn,/data/05/dfs/dn,/data/06/dfs/dn</value>
<final>true</final>
</property>

When one of those disks breaks and is unmounted, the mount point (such as 
/data/03 in this example) becomes a regular directory that lacks the 
permissions and possibly the directory structure Hadoop expects.
When this happens, the datanode fails to restart, even though enough disks 
are in an OK state to proceed.  The only workaround is to alter the 
configuration and omit that specific disk.

In my opinion, it would be more practical to let Hadoop daemons start when at 
least 1 disk/partition in the provided list is in a usable state.  This 
avoids having to roll out custom configurations for systems that are 
temporarily missing a disk (and therefore its directory layout).  The 
threshold could also be configurable, so that at least X of the configured 
partitions must be in an OK state.
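
To make the proposal concrete, here is a minimal sketch in plain Java of the
kind of check I have in mind.  It is not existing Hadoop code: the class
DataDirCheck, the methods usableDirs and canStart, and the minUsable
parameter are hypothetical names used only for illustration, and "usable" is
assumed to mean an existing directory that the daemon user can read, write
and traverse.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: illustrates the proposed startup rule.
class DataDirCheck {

    // Returns the configured directories that look usable. "Usable" is an
    // assumption here: the path exists, is a directory, and is readable,
    // writable and traversable by the current user.
    static List<File> usableDirs(String commaSeparatedDirs) {
        List<File> usable = new ArrayList<>();
        for (String path : commaSeparatedDirs.split(",")) {
            String trimmed = path.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            File dir = new File(trimmed);
            if (dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute()) {
                usable.add(dir);
            }
        }
        return usable;
    }

    // Proposed rule: start when at least minUsable directories are usable,
    // and never with fewer than one.
    static boolean canStart(String commaSeparatedDirs, int minUsable) {
        return usableDirs(commaSeparatedDirs).size() >= Math.max(1, minUsable);
    }
}
```

With the six directories from the example and /data/03 unusable, 
canStart(value, 1) would return true as long as any other directory passes 
the check.  A real implementation would probably also need to verify the 
expected directory structure, which this sketch does not attempt.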







--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
