[ 
https://issues.apache.org/jira/browse/HDFS-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8491:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> DN shutdown race conditions with open xceivers
> ----------------------------------------------
>
>                 Key: HDFS-8491
>                 URL: https://issues.apache.org/jira/browse/HDFS-8491
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>
> DN shutdowns, at least for restarts, have many race conditions.  Shutdown is 
> very noisy with exceptions.  The DN notifies writers of the restart, waits 1s, 
> and then interrupts the xceiver threads but does not join them.  The ipc 
> server is stopped and then the bpos services are stopped.
> Xceivers then encounter NPEs in closeBlock because the block no longer exists 
> in the volume map when transient storage is checked.  Just before that, the 
> DN notifies the NN that the block was received.  That does not always appear 
> to be true; rather, the thread was simply interrupted.  The xceivers race 
> with bpos shutdown to send the block received and, luckily, appear to lose.
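
For illustration only, the "interrupts but does not join" problem described above can be sketched in plain Java. This is not DataNode code: ShutdownSketch, startWorker, volumeMap, and missingBlockOnClose are hypothetical names, and the map is only a stand-in for the volume map. The sketch assumes the remedy is to join the interrupted workers (with a timeout) before tearing down the state their cleanup path reads; the issue itself does not prescribe a fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a DataNode; none of these names are Hadoop APIs.
class ShutdownSketch {
    // Stand-in for the volume map that a worker consults when closing a block.
    final Map<Long, String> volumeMap = new ConcurrentHashMap<>();
    final List<Thread> workers = new ArrayList<>();
    // Counts cleanups that found their block already gone (the NPE case in the issue).
    final AtomicInteger missingBlockOnClose = new AtomicInteger();

    void startWorker(long blockId) {
        volumeMap.put(blockId, "replica-" + blockId);
        Thread t = new Thread(() -> {
            try {
                // Pretend to receive packets until interrupted.
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(50);
                }
            } catch (InterruptedException e) {
                // Interrupted by shutdown: fall through to the cleanup path.
            }
            // Cleanup path, loosely analogous to closing the block.
            if (volumeMap.get(blockId) == null) {
                missingBlockOnClose.incrementAndGet();
            }
        }, "worker-" + blockId);
        t.start();
        workers.add(t);
    }

    // Interrupt every worker, then wait for each one before clearing shared state.
    void shutdown(long joinTimeoutMs) {
        for (Thread t : workers) {
            t.interrupt();
        }
        try {
            for (Thread t : workers) {
                t.join(joinTimeoutMs); // the step the issue says is missing
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        volumeMap.clear(); // tear down only after workers finished (or timed out)
    }

    public static void main(String[] args) {
        ShutdownSketch dn = new ShutdownSketch();
        for (long id = 1; id <= 4; id++) {
            dn.startWorker(id);
        }
        dn.shutdown(5000);
        System.out.println("cleanups that found no block: " + dn.missingBlockOnClose.get());
    }
}
```

If the join loop is dropped, volumeMap.clear() can run while a worker is still in its cleanup path, which is the shape of the race reported here. The timeout keeps one stuck worker from blocking shutdown indefinitely, at the cost of reopening the race for that worker.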



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

