[ https://issues.apache.org/jira/browse/HDFS-10178 ]
Kihwal Lee updated HDFS-10178:
------------------------------
    Resolution: Fixed
  Hadoop Flags: Reviewed
 Fix Version/s: 2.7.3
        Status: Resolved  (was: Patch Available)

Committed to trunk through branch-2.7. The 2.7 cherry-pick was clean, but the test was modified to use {{DFSConfigKeys}} instead of {{HdfsClientConfigKeys}}. Thanks for the reviews and comments, Akira, Arpit, Masatake and Vinay.

> Permanent write failures can happen if pipeline recoveries occur for the
> first packet
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-10178
>                 URL: https://issues.apache.org/jira/browse/HDFS-10178
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 2.7.3
>
>         Attachments: HDFS-10178.patch, HDFS-10178.v2.patch,
> HDFS-10178.v3.patch, HDFS-10178.v4.patch, HDFS-10178.v5.patch
>
>
> We have observed that writes fail permanently if the first packet doesn't
> go through properly and a pipeline recovery happens. If the write op
> creates a pipeline, but the actual data packet does not reach one or more
> datanodes in time, the pipeline recovery will be done against the 0-byte
> partial block. If additional datanodes are added, the block is
> transferred to the new nodes. After the transfer, each node will have a
> meta file containing only the header and a 0-length block file. The
> pipeline recovery seems to work correctly up to this point, but the write
> fails when the actual data packet is resent.
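For anyone trying to reproduce this, here is a minimal sketch of the write pattern that trips the bug. It assumes the first data packet is dropped or stalled (e.g., by datanode-side fault injection, left as a placeholder below) so that pipeline recovery runs against the 0-byte partial block; the cluster setup and class name are illustrative, not the committed test. The config key is referenced through {{DFSConfigKeys}} as on branch-2.7; trunk keeps the equivalent constant in {{HdfsClientConfigKeys}}.

{code:java}
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class ZeroByteBlockRecoverySketch {
  public static void main(String[] args) throws Exception {
    HdfsConfiguration conf = new HdfsConfiguration();
    // Always pick a replacement datanode on pipeline failure so the
    // 0-byte partial block is transferred to a new node after recovery.
    conf.set(
        DFSConfigKeys.DFS_CLIENT_WRITE_REPLACE_DATANODE_ON_FAILURE_POLICY_KEY,
        "ALWAYS");
    // One spare datanode beyond the default replication of 3.
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(4).build();
    try {
      cluster.waitActive();
      DistributedFileSystem fs = cluster.getFileSystem();
      FSDataOutputStream out = fs.create(new Path("/zero-byte-recovery"));
      // Fault injection goes here: stall or drop the first packet so a
      // pipeline recovery happens before any data reaches a datanode.
      out.write(0x20);  // the first (and only) data packet
      out.hflush();     // pushes the packet down the recovered pipeline
      out.close();      // fails permanently on affected versions
    } finally {
      cluster.shutdown();
    }
  }
}
{code}

Without the fix the final {{close()}} should fail; on 2.7.3 and later it should succeed.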