[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471097#comment-15471097 ]
Yongjun Zhang commented on HDFS-10714:
--------------------------------------

Hi [~brahmareddy] and [~vinayrpet],

Thanks for working on this. While reading this issue (per the discussion in HDFS-6937), I noticed the following:
{quote}
DN1->DN2->DN3 => DN3 Gives ERROR_CHECKSUM ack. And so DN2 marked as bad
{quote}
In the HDFS-6937 case, if DN3 gives an ERROR_CHECKSUM error, DN3 will be replaced. But here DN2 got replaced. Would you please add a code snippet to explain how that happened?

Thanks.

> Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-10714
>                 URL: https://issues.apache.org/jira/browse/HDFS-10714
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: HDFS-10714-01-draft.patch
>
>
> We came across an issue where a write failed even though 7 DNs were available,
> due to a network fault at one datanode that was LAST_IN_PIPELINE. It is
> similar to HDFS-6937.
> Scenario: (DN3 has a N/W fault and min repl=2).
> Write pipeline:
> DN1->DN2->DN3 => DN3 gives ERROR_CHECKSUM ack, and so DN2 is marked as bad
> DN1->DN4->DN3 => DN3 gives ERROR_CHECKSUM ack, and so DN4 is marked as bad
> ….
> And so on (DN3 is LAST_IN_PIPELINE every time), continuing until there are no
> more datanodes to construct the pipeline.
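To make the reported scenario concrete, here is a small toy model of it in Java. This is only a sketch and not the HDFS client code: the class `PipelineBlameSketch`, its `Result` type, and the `simulate` method are hypothetical names. It assumes exactly what the description reports, namely that the node directly upstream of the one sending ERROR_CHECKSUM is the one marked bad, and that the faulty node stays LAST_IN_PIPELINE on every retry. It also simplifies by treating the write as failed as soon as no spare datanode is left, and it does not model min repl.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the reported scenario. Hypothetical names; not HDFS client code. */
final class PipelineBlameSketch {

    /** Outcome of the simulated write. */
    static final class Result {
        final List<String> markedBad = new ArrayList<>();
        int attempts;
    }

    /**
     * Replays pipeline recovery with a 3-node pipeline DN1 -> X -> faulty node.
     * ASSUMPTION (taken from the issue description, not from HDFS source):
     * the node directly upstream of the ERROR_CHECKSUM reporter gets blamed.
     */
    static Result simulate(int totalDatanodes, int faultyDn) {
        if (totalDatanodes < 3 || faultyDn < 2 || faultyDn > totalDatanodes) {
            throw new IllegalArgumentException("need >= 3 datanodes and a faulty DN other than DN1");
        }
        String faulty = "DN" + faultyDn;

        // Every datanode except DN1 (pipeline head) and the faulty one is a candidate.
        Deque<String> spares = new ArrayDeque<>();
        for (int i = 2; i <= totalDatanodes; i++) {
            if (i != faultyDn) {
                spares.add("DN" + i);
            }
        }

        List<String> pipeline = new ArrayList<>(Arrays.asList("DN1", spares.poll(), faulty));
        Result result = new Result();
        while (true) {
            result.attempts++;
            int reporter = pipeline.size() - 1; // faulty node is last and acks ERROR_CHECKSUM
            int blamed = reporter - 1;          // assumed policy: blame the upstream neighbour
            result.markedBad.add(pipeline.get(blamed));

            String replacement = spares.poll();
            if (replacement == null) {
                break; // no datanode left to rebuild the pipeline: the write fails
            }
            pipeline.set(blamed, replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = simulate(7, 3);
        System.out.println("attempts=" + r.attempts + " markedBad=" + r.markedBad);
    }
}
```

With 7 datanodes and DN3 faulty, the model marks DN2, DN4, DN5, DN6 and DN7 as bad over five attempts, while DN3 itself is never excluded. That matches the pattern in the description, but it does not explain why the real client picks DN2 rather than DN3; that still needs the actual code path.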