[ 
https://issues.apache.org/jira/browse/HDFS-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8776:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> Decom manager should not be active on standby
> ---------------------------------------------
>
>                 Key: HDFS-8776
>                 URL: https://issues.apache.org/jira/browse/HDFS-8776
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>
> The decommission manager should not be actively processing on the standby.
> The decomm manager goes through the costly computation of determining that
> every block on the node requires replication, yet doesn't queue them for
> replication because it's in standby. While doing so, the decomm manager
> holds the namesystem write lock, causing DNs to time out on heartbeats or
> IBRs. The NN purges the call queue of timed-out clients, then processes
> some heartbeats/IBRs before the decomm manager locks up the namesystem
> again. Nodes attempting to register will be sending full BRs, which are
> more costly to send and discard than a heartbeat.
> If a failover is required, the standby will likely have to struggle very hard 
> to not GC while "catching up" on its queued IBRs while DNs continue to fill 
> the call queue and time out.
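
The behaviour described above suggests guarding the periodic scan with an HA-state check taken before the write lock. The following is only a minimal sketch under stated assumptions, not the actual HDFS code or the committed fix: the class DecomMonitorSketch, the isActive supplier, and the plain ReentrantReadWriteLock standing in for the namesystem lock are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the decommission monitor; not the real HDFS class. */
class DecomMonitorSketch implements Runnable {
    // Assumed HA-state check supplied by the caller (true when the NN is active).
    private final BooleanSupplier isActive;
    // Stand-in for the namesystem lock that heartbeat and IBR handlers also need.
    private final ReentrantReadWriteLock namesystemLock;
    private int scansPerformed = 0;

    DecomMonitorSketch(BooleanSupplier isActive, ReentrantReadWriteLock namesystemLock) {
        this.isActive = isActive;
        this.namesystemLock = namesystemLock;
    }

    @Override
    public void run() {
        // Check HA state before taking the write lock, so a standby never
        // blocks other handlers for a scan whose results it would discard.
        if (!isActive.getAsBoolean()) {
            return;
        }
        namesystemLock.writeLock().lock();
        try {
            // Re-check under the lock: the state may have changed while waiting.
            if (!isActive.getAsBoolean()) {
                return;
            }
            scanDecommissioningNodes();
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    private void scanDecommissioningNodes() {
        // Placeholder for the costly per-block replication check.
        scansPerformed++;
    }

    int getScansPerformed() {
        return scansPerformed;
    }
}
```

With an isActive supplier that returns false, run() returns without touching the lock and without scanning; once the supplier returns true, each run() performs one scan under the write lock and releases it afterwards.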



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
