[ https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702354#comment-15702354 ]
Kihwal Lee commented on HDFS-11180:
-----------------------------------

bq. NameNode holds a lock of FSEditLog and requires a lock of MetricsSystemImpl when registering IPCLoggerChannel metrics.

It looks like this deadlock can happen only when the QJM is used. Metrics update does not need a precise txid. We could introduce unsynchronized methods for metrics and perhaps use volatile for txid?

> Intermittent deadlock in NameNode when failover happens.
> --------------------------------------------------------
>
>                 Key: HDFS-11180
>                 URL: https://issues.apache.org/jira/browse/HDFS-11180
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Abhishek Modi
>              Labels: high-availability
>         Attachments: HDFS-11180.00.patch, HDFS-11180.01.patch, jstack.log
>
>
> It is happening due to metrics getting updated at the same time when failover
> is happening. Please find attached jstack at that point of time.
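
A minimal sketch of the idea in the comment (unsynchronized accessors for metrics, plus a volatile txid). This is not the actual FSEditLog code or the attached patch; the class EditLogSketch and the methods beginTransaction, getLastWrittenTxIdForMetrics and readWhileLocked are hypothetical names used only for illustration. The point it shows: writers still serialize on the object's monitor, while a metrics reader never needs that monitor, so it cannot take part in a lock cycle with the thread that holds it.

```java
// Hypothetical stand-in for an edit log; NOT the real FSEditLog.
class EditLogSketch {
    // volatile: a reader sees the most recent write without taking the monitor.
    private volatile long txid = 0;

    // Writers still serialize on this object's monitor, so the
    // read-modify-write below is safe even though txid is volatile.
    synchronized long beginTransaction() {
        txid = txid + 1;
        return txid;
    }

    // Unsynchronized accessor intended for metrics only. The value may be
    // slightly stale, which is acceptable when a precise txid is not needed.
    long getLastWrittenTxIdForMetrics() {
        return txid;
    }

    // Demonstration helper: hold the monitor (as a failover path might) while
    // a second thread reads the metric. If the accessor were synchronized,
    // the join() below would never return.
    static long readWhileLocked(EditLogSketch log) {
        final long[] seen = new long[1];
        synchronized (log) {
            Thread metricsThread = new Thread(
                () -> seen[0] = log.getLastWrittenTxIdForMetrics());
            metricsThread.start();
            try {
                metricsThread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen[0];
    }

    public static void main(String[] args) {
        EditLogSketch log = new EditLogSketch();
        log.beginTransaction();
        System.out.println("txid seen by metrics thread: " + readWhileLocked(log));
    }
}
```

The trade-off is that the metrics value is only eventually accurate; any caller that needs an exact txid should keep using a synchronized path.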