[ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702354#comment-15702354
 ] 

Kihwal Lee commented on HDFS-11180:
-----------------------------------

bq. NameNode holds a lock of FSEditLog and requires a lock of MetricsSystemImpl 
when registering IPCLoggerChannel metrics.
It looks like this deadlock can happen only when the QJM is used.

Metrics updates do not need a precise txid. We could introduce unsynchronized 
methods for the metrics path and perhaps make txid volatile?
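
A minimal sketch of that idea, assuming a simplified stand-in class. 
EditLogSketch and its method names are hypothetical and are not the actual 
FSEditLog code or either attached patch. Writers still take the monitor, 
while the metrics reader does a plain volatile read and so never waits on 
the edit log lock:

```java
// Hypothetical stand-in for an edit log; not the real FSEditLog.
class EditLogSketch {
    // volatile lets unsynchronized readers see a recently written value
    // without acquiring this object's monitor.
    private volatile long txid = 0;

    // Writers remain synchronized, so the read-modify-write stays atomic.
    synchronized long beginTransaction() {
        txid = txid + 1;
        return txid;
    }

    // Precise read for callers that must be consistent with other log state.
    synchronized long getLastWrittenTxId() {
        return txid;
    }

    // Lock-free read for metrics. The value may be slightly stale, but the
    // caller never blocks on the edit log monitor, so a metrics thread that
    // already holds another lock cannot join a lock-ordering cycle here.
    long getLastWrittenTxIdForMetrics() {
        return txid;
    }
}
```

The trade-off is that the metrics value can lag behind a transaction that 
is in flight, which should be acceptable if metrics really do not need a 
precise txid.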

> Intermittent deadlock in NameNode when failover happens.
> --------------------------------------------------------
>
>                 Key: HDFS-11180
>                 URL: https://issues.apache.org/jira/browse/HDFS-11180
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Abhishek Modi
>              Labels: high-availability
>         Attachments: HDFS-11180.00.patch, HDFS-11180.01.patch, jstack.log
>
>
> The deadlock occurs when metrics are updated while a failover is in 
> progress. A jstack captured at that point is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
