Datanode 'alive' but with its disk failed, Namenode thinks it's alive
---------------------------------------------------------------------

                 Key: HDFS-1234
                 URL: https://issues.apache.org/jira/browse/HDFS-1234
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20.1
            Reporter: Thanh Do


- Summary: Datanode 'alive' but with its disk failed; Namenode still thinks it's alive
 
- Setups:
+ Replication = 1
+ # available datanodes = 2
+ # disks / datanode = 1
+ # failures = 1
+ Failure type = bad disk
+ When/where failure happens = first phase of the pipeline
 
- Details:
In this experiment we have two datanodes, each with one disk.
If one datanode's disk fails (but the node itself is still alive), the
datanode does not detect or report the failure. From the perspective of the
namenode, that datanode is still healthy, so the namenode hands the same
datanode back to the client. The client retries 3 times, asking the namenode
for a new set of datanodes, but always receives the same one, and every
attempt to write to it throws an exception.
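The retry loop described above can be sketched as a minimal, self-contained simulation. The class and method names below (RetrySketch, chooseDatanode, writeBlock, attemptWrite) are hypothetical stand-ins for illustration, not the actual DFSClient or Namenode APIs:

```java
public class RetrySketch {
    // Hypothetical stand-in for the namenode's block placement: with
    // replication = 1 and the node with the bad disk still reported as
    // healthy, it always returns the same datanode.
    static String chooseDatanode() {
        return "datanode-1"; // namenode still believes this node is healthy
    }

    // Hypothetical stand-in for writing a block to a datanode: the failed
    // disk makes every attempt throw.
    static void writeBlock(String datanode) throws Exception {
        throw new Exception("disk failure on " + datanode);
    }

    // Returns the attempt number that succeeded, or -1 if all retries fail.
    static int attemptWrite(int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            // Each retry asks the "namenode" for a fresh target,
            // but keeps getting the same failed datanode back.
            String target = chooseDatanode();
            try {
                writeBlock(target);
                return attempt;
            } catch (Exception e) {
                System.out.println("Attempt " + attempt + " failed: " + e.getMessage());
            }
        }
        return -1; // all retries exhausted, write fails
    }

    public static void main(String[] args) {
        int result = attemptWrite(3);
        System.out.println(result == -1
            ? "write failed after 3 attempts"
            : "write succeeded on attempt " + result);
    }
}
```

Because the namenode never learns about the disk failure, all three retries target the same datanode and the write ultimately fails.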

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
