Jinghui Wang created HDFS-5280:
----------------------------------

             Summary: Corrupted meta files on data nodes prevent the DFSClient from 
connecting to data nodes and reporting corruption status to the name node.
                 Key: HDFS-5280
                 URL: https://issues.apache.org/jira/browse/HDFS-5280
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode, hdfs-client
    Affects Versions: 2.0.4-alpha, 2.1.0-beta, 1.1.1
         Environment: Red Hat Enterprise Linux 6.4
Hadoop-2.1.0
            Reporter: Jinghui Wang


A corrupted meta file prevents the DFSClient from connecting to the datanode 
to access the block, so the client never performs a read on the block. It is 
that read which throws a ChecksumException when a block is corrupted and 
reports the corruption to the namenode so the block can be marked corrupt. 
Since the client never gets that far, the file status remains healthy, and so 
do all of its blocks.
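
For illustration, the difference shows up in a minimal read probe against the 
standard FileSystem API (a sketch only; the class name ReadProbe is made up 
here, and it assumes a default Configuration pointing at the cluster):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadProbe {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    byte[] buf = new byte[4096];
    try (FSDataInputStream in = fs.open(new Path("/tmp/bogus.csv"))) {
      // Draining the stream forces checksum verification: a corrupt
      // *block* surfaces here as a ChecksumException, which the client
      // also reports to the namenode.
      while (in.read(buf) > 0) { /* keep reading */ }
      System.out.println("read OK");
    } catch (ChecksumException ce) {
      // Normal corruption path: the block gets marked corrupt.
      System.out.println("checksum failure: " + ce.getMessage());
    } catch (IOException ioe) {
      // The bug described here: with a corrupted *meta* file, the
      // connection to that datanode fails before any read, so no
      // ChecksumException is thrown and nothing is reported (the
      // client may still fall back to another replica).
      System.out.println("connection/read failure: " + ioe.getMessage());
    }
  }
}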

To replicate the error, put a file onto HDFS, e.g. /tmp/bogus.csv.
Running hadoop fsck /tmp/bogus.csv -files -blocks -locations produces the 
following output:
FSCK started for path /tmp/bogus.csv at 11:33:29
/tmp/bogus.csv 109 bytes, 1 block(s):  OK
0. blk_-4255166695856420554_5292 len=109 repl=3

Find the block/meta files for 4255166695856420554 by running 
ssh datanode1.address find /hadoop/ -name "*4255166695856420554*" 
which produces the following output:
/hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
/hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta

Now corrupt the meta file by running 
ssh datanode1.address "sed -i -e '1i 1234567891' 
/hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta" 
which prepends a bogus line to the meta file, corrupting the header at the 
start of the file that the datanode reads before serving the block.

Now running hadoop fs -cat /tmp/bogus.csv
shows the stack trace of the DFSClient failing to connect to the datanode 
with the corrupted meta file, while re-running the fsck command above still 
reports the file and all of its blocks as healthy.
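
The namenode's unchanged view can also be checked programmatically. A minimal 
sketch (class name CorruptCheck made up here; assumes BlockLocation.isCorrupt() 
is available in the Hadoop version in use and a default Configuration pointing 
at the cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CorruptCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus stat = fs.getFileStatus(new Path("/tmp/bogus.csv"));
    for (BlockLocation loc : fs.getFileBlockLocations(stat, 0, stat.getLen())) {
      // Expected here, and this is the bug: corrupt=false for every
      // block, because the corruption was never reported to the namenode.
      System.out.println(loc + " corrupt=" + loc.isCorrupt());
    }
  }
}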


