DataNodes do not start up when a previous version has not been cleaned up
-------------------------------------------------------------------------

                 Key: HADOOP-5342
                 URL: https://issues.apache.org/jira/browse/HADOOP-5342
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.18.2
            Reporter: Christian Kunz
            Priority: Blocker


After restarting a cluster (including a reboot), the DFS became corrupted 
because many DataNodes did not start up. They ran into the following exception:

2009-02-26 22:33:53,774 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory xxx  is in an inconsistent state: version file in current directory is missing.
        at org.apache.hadoop.dfs.Storage$StorageDirectory.analyzeStorage(Storage.java:326)
        at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:105)
        at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:306)
        at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:223)
        at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:3030)
        at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2985)
        at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2993)
        at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3115)


This happens when a DataNode uses multiple disks and at least one of them was 
previously marked read-only, so that its storage version became outdated. 
After the reboot that disk was mounted read-write again, and the DataNode 
refused to start because of the outdated version.

This is a big headache. If a DataNode has multiple disks and at least one of 
them has the correct storage version, then outdated versions on the other 
disks should not bring down the DataNode.
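
The requested behavior can be sketched roughly as follows. This is only an 
illustration, not the DataNode's actual code: the class StorageDirCheck and 
its methods are hypothetical, and the default check only tests whether 
current/VERSION exists (mirroring the error message above) instead of 
comparing storage versions.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Hadoop: illustrates "skip bad storage
// directories, start if at least one is good".
class StorageDirCheck {

    // Simplified consistency check (assumption): a directory counts as usable
    // if it contains a regular file named current/VERSION.
    public static boolean hasVersionFile(File dataDir) {
        return new File(new File(dataDir, "current"), "VERSION").isFile();
    }

    // Returns the directories that pass the check; the others are only logged.
    public static List<File> usableDirs(List<File> dataDirs,
                                        Predicate<File> isConsistent) {
        List<File> usable = new ArrayList<>();
        for (File dir : dataDirs) {
            if (isConsistent.test(dir)) {
                usable.add(dir);
            } else {
                System.err.println("Skipping inconsistent storage directory: " + dir);
            }
        }
        return usable;
    }

    // The node may start as long as at least one directory is usable.
    public static boolean canStart(List<File> dataDirs,
                                   Predicate<File> isConsistent) {
        return !usableDirs(dataDirs, isConsistent).isEmpty();
    }

    public static void main(String[] args) {
        // Each command-line argument is treated as one data directory.
        List<File> dirs = new ArrayList<>();
        for (String arg : args) {
            dirs.add(new File(arg));
        }
        List<File> usable = usableDirs(dirs, StorageDirCheck::hasVersionFile);
        if (usable.isEmpty()) {
            System.out.println("No usable storage directory, refusing to start");
        } else {
            System.out.println("Starting with: " + usable);
        }
    }
}
```

Passing the check as a Predicate keeps the sketch independent of how a 
directory is actually judged consistent; a real fix would plug in the 
existing storage analysis instead of hasVersionFile.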


