[ https://issues.apache.org/jira/browse/HDFS-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J updated HDFS-2263:
--------------------------
    Attachment: HDFS-2263.patch

(Issue affects trunk, and the attached patch is against that.)

Aaron/Arpit,

An error in an OP_READ_BLOCK operation can also arise from xceiver load, apart from truncation of block files and missing or bad-permission block files. The attached patch reports every error encountered, not just the final tried LocatedBlock. I do know this is wrong, as it would spark a replication storm for a reason as simple as filled-up xceiver loads causing the read error -- but let me know if I am wrong, and I'll tweak the patch and the tests a bit to accommodate final-retry corrupt marking.

> Make DFSClient report bad blocks more quickly
> ---------------------------------------------
>
>                 Key: HDFS-2263
>                 URL: https://issues.apache.org/jira/browse/HDFS-2263
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.20.2
>            Reporter: Aaron T. Myers
>            Assignee: Harsh J
>         Attachments: HDFS-2263.patch
>
>
> In certain circumstances the DFSClient may detect a block as being bad without reporting it promptly to the NN.
> If when reading a file a client finds an invalid checksum of a block, it immediately reports that bad block to the NN. If when serving up a block a DN finds a truncated block, it reports this to the client, but the client merely adds that DN to the list of dead nodes and moves on to trying another DN, without reporting this to the NN.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
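The behavior change the comment describes can be sketched as follows. This is an illustrative model only, with stub interfaces in place of the real DFSClient/ClientProtocol classes (which are not shown here): on each failed read attempt the client reports the replica to the NameNode immediately, instead of merely adding the DN to its local dead-node list and moving on.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not the actual HDFS API: the retry loop reports
// every failed replica to the NameNode, rather than reporting nothing
// (or only the final LocatedBlock) after exhausting all datanodes.
public class ReadRetrySketch {
    // Stub for the NameNode-side reporting call.
    interface NameNode { void reportBadReplica(String block, String datanode); }

    // Stub for a datanode replica serving a block.
    interface DataNode {
        String name();
        byte[] readBlock(String block) throws Exception;
    }

    static byte[] readWithReporting(String block, List<DataNode> replicas,
                                    NameNode nn) throws Exception {
        List<String> deadNodes = new ArrayList<>();
        for (DataNode dn : replicas) {
            try {
                return dn.readBlock(block);
            } catch (Exception e) {
                // Old behavior: only remember the DN locally and retry
                // elsewhere. New behavior: also notify the NN right away.
                deadNodes.add(dn.name());
                nn.reportBadReplica(block, dn.name());
            }
        }
        throw new Exception("Could not read " + block + "; tried " + deadNodes);
    }
}
```

As the comment notes, reporting unconditionally is too aggressive: a transient failure such as a full xceiver load would mark a healthy replica corrupt and could trigger needless re-replication, which is why final-retry-only corrupt marking is floated as the alternative.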