[ https://issues.apache.org/jira/browse/HDFS-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Junping Du updated HDFS-8491:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> DN shutdown race conditions with open xceivers
> ----------------------------------------------
>
>                 Key: HDFS-8491
>                 URL: https://issues.apache.org/jira/browse/HDFS-8491
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>
> DN shutdowns at least for restarts have many race conditions.  Shutdown is
> very noisy with exceptions.  The DN notifies writers of the restart, waits 1s
> and then interrupts the xceiver threads but does not join.  The ipc server is
> stopped and then the bpos services are stopped.
> Xceivers then encounter NPEs in closeBlock because the block no longer exists
> in the volume map when transient storage is checked.  Just before that, the
> DN notifies the NN the block was received.  This does not appear to always be
> true, but rather that the thread was interrupted.  They race with bpos
> shutdown, and luckily appear to lose, to send the block received.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
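
The ordering problem described above (interrupt the xceiver threads, do not join, then tear down shared state) can be illustrated with a small standalone sketch. This is not DataNode code: the class XceiverShutdownSketch, the Worker thread, the volumeMap field, and the "RBW" state string are all hypothetical stand-ins, and the sketch only assumes that cleanup after an interrupt looks up the block in a shared map. It shows one possible ordering, interrupt, then join with a timeout, then clear the map, under which the workers' cleanup can no longer race with teardown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class XceiverShutdownSketch {

    // Stand-in for the volume map: block id -> replica state.
    static final Map<Long, String> volumeMap = new ConcurrentHashMap<>();

    // Stand-in for an xceiver thread that is streaming one block.
    static class Worker extends Thread {
        final long blockId;
        // Set if cleanup ran after the block had already been removed.
        volatile boolean sawMissingBlock = false;

        Worker(long blockId) {
            this.blockId = blockId;
        }

        @Override
        public void run() {
            try {
                Thread.sleep(Long.MAX_VALUE); // pretend to receive packets
            } catch (InterruptedException e) {
                // Interrupted by shutdown: fall through to cleanup.
            }
            // Cleanup comparable to closing the block: look it up in the map.
            // Dereferencing a null result here is where an NPE would arise.
            if (volumeMap.get(blockId) == null) {
                sawMissingBlock = true;
            }
        }
    }

    // Interrupt every worker, wait for each to exit, and only then clear the
    // shared map. joinMillis must be positive; join(0) would wait forever.
    // Returns true if every worker exited within its timeout.
    static boolean shutdown(List<Worker> workers, long joinMillis) {
        for (Worker w : workers) {
            w.interrupt();
        }
        boolean allStopped = true;
        for (Worker w : workers) {
            try {
                w.join(joinMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
            if (w.isAlive()) {
                allStopped = false;
            }
        }
        volumeMap.clear(); // teardown happens after the joins
        return allStopped;
    }

    public static void main(String[] args) {
        List<Worker> workers = new ArrayList<>();
        for (long id = 1; id <= 3; id++) {
            volumeMap.put(id, "RBW");
            Worker w = new Worker(id);
            workers.add(w);
            w.start();
        }
        System.out.println("all stopped: " + shutdown(workers, 5000));
        for (Worker w : workers) {
            System.out.println("block " + w.blockId
                + " missing during cleanup: " + w.sawMissingBlock);
        }
    }
}
```

Moving volumeMap.clear() ahead of the join loop reproduces the reported pattern in this toy model: a worker may wake from the interrupt after the map is already empty and find its block missing. Whether a join with timeout is an appropriate fix for the real DataNode is left open here.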