[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398427#comment-13398427 ]
Vinay commented on HDFS-3541:
-----------------------------

I found the actual problem. It resides in PacketResponder#close():

{code}
public synchronized void close() {
  while (running && ackQueue.size() != 0 && datanode.shouldRun) {
    try {
      wait();
    } catch (InterruptedException e) {
      running = false;
    }
  }
  if (LOG.isDebugEnabled()) {
    LOG.debug(myString + ": closing");
  }
  running = false;
  notifyAll();
}
{code}

Here the InterruptedException is handled, but the thread's interrupted flag is not restored afterwards. BlockReceiver is waiting for PacketResponder to join, but PacketResponder is BLOCKED.

> Deadlock between recovery, xceiver and packet responder
> -------------------------------------------------------
>
>                 Key: HDFS-3541
>                 URL: https://issues.apache.org/jira/browse/HDFS-3541
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.3, 2.0.1-alpha
>            Reporter: suja s
>            Assignee: Vinay
>         Attachments: DN_dump.rar
>
>
> Block Recovery initiated while write in progress at Datanode side. Found a
> lock between recovery, xceiver and packet responder.
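
To illustrate the interrupt handling described in the comment above, here is a minimal standalone sketch of a close() method that re-asserts the interrupt status after catching InterruptedException. It is not the HDFS code or the committed patch: the class ResponderSketch, its fields, and the helper interruptSurvivesClose() are hypothetical names invented for this example, and the datanode.shouldRun check is left out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for a responder thread's state; not the real PacketResponder.
public class ResponderSketch {
    private final Queue<Integer> ackQueue = new ArrayDeque<>();
    private boolean running = true;

    public synchronized void enqueue(int seqno) {
        ackQueue.add(seqno);
        notifyAll();
    }

    public synchronized void close() {
        while (running && !ackQueue.isEmpty()) {
            try {
                wait();
            } catch (InterruptedException e) {
                running = false;
                // wait() cleared the interrupt status when it threw.
                // Set it again so later code on this thread can still see it.
                Thread.currentThread().interrupt();
            }
        }
        running = false;
        notifyAll();
    }

    // Calls close() on a thread that is already interrupted and reports
    // whether the interrupt status is still set once close() returns.
    public static boolean interruptSurvivesClose() {
        ResponderSketch responder = new ResponderSketch();
        responder.enqueue(1); // non-empty queue forces close() into wait()
        boolean[] seen = new boolean[1];
        Thread worker = new Thread(() -> {
            Thread.currentThread().interrupt();
            responder.close();
            seen[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("interrupt status after close(): " + interruptSurvivesClose());
    }
}
```

Without the Thread.currentThread().interrupt() call in the catch block, the same check would report false, because wait() clears the status when it throws. Whether this alone resolves the deadlock in the attached dump would still need to be confirmed against the actual DataNode code.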