[ https://issues.apache.org/jira/browse/HDFS-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603154#comment-14603154 ]
Daryn Sharp commented on HDFS-8675:
-----------------------------------

The solution isn't as simple as replacing the IOE with UnregisteredNodeException. The DN will commit suicide when it receives that exception. Handling the exception will require buffering the IBR (or any other calls this may affect), re-registering, and resending the rejected message.

The issue might be related to network problems or to defects that stall BPOfferService, such as the synchronous clearing of the trash after RU finalize.

> IBRs from dead DNs go into infinite loop
> ----------------------------------------
>
>                 Key: HDFS-8675
>                 URL: https://issues.apache.org/jira/browse/HDFS-8675
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>
> If the DN sends an IBR after the NN declares it dead, the NN returns an IOE
> of unregistered or dead. The DN catches the IOE, ignores it, and infinitely
> loops spamming the NN with retries.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)