[ https://issues.apache.org/jira/browse/HDFS-12323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Krogen updated HDFS-12323:
-------------------------------
    Attachment: HDFS-12323.002.patch

Alright, seems fair. Attaching v002 patch; only difference is the addition of a no-arg constructor.

> NameNode terminates after full GC thinking QJM unresponsive if full GC is much longer than timeout
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12323
>                 URL: https://issues.apache.org/jira/browse/HDFS-12323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, qjm
>    Affects Versions: 2.7.4
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>         Attachments: HDFS-12323.000.patch, HDFS-12323.001.patch, HDFS-12323.002.patch
>
>
> HDFS-10733 attempted to fix the issue where the Namenode process would
> terminate itself if it had a GC pause which lasted longer than the QJM
> timeout, since it would think that the QJM had taken too long to respond.
> However, it only bumps up the timeout expiration by one timeout length, so if
> the GC pause was e.g. 2x the length of the timeout, a TimeoutException will
> be thrown and the NN will still terminate itself.
> Thanks to [~yangjiandan] for noting this issue as a comment on HDFS-10733; we
> have also seen this issue on a real cluster even after HDFS-10733 is applied.
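
To make the arithmetic in the description concrete, here is a minimal Java sketch of the two ways of extending the deadline. It is not the code from HDFS-10733 or from any of the attached patches; the class name, the method names, and the 20-second timeout are all hypothetical and chosen only for illustration. It compares bumping the deadline by one timeout length with bumping it by the observed pause length, for a pause of 2x the timeout.

```java
// Illustrative sketch only: none of these names come from the Hadoop source.
final class GcPauseTimeoutSketch {

    // Behaviour described for HDFS-10733: extend the deadline by one timeout length.
    static long bumpByOneTimeout(long deadlineMs, long timeoutMs) {
        return deadlineMs + timeoutMs;
    }

    // One possible alternative (an assumption, not the actual patch):
    // extend the deadline by however long the process was paused.
    static long bumpByPause(long deadlineMs, long pauseMs) {
        return deadlineMs + pauseMs;
    }

    // A waiter would throw TimeoutException once this returns true.
    static boolean expired(long nowMs, long deadlineMs) {
        return nowMs >= deadlineMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 20_000;             // assumed timeout for the example
        long deadlineMs = 0 + timeoutMs;     // call starts at t = 0
        long pauseStartMs = 1_000;           // full GC begins 1 s into the wait
        long pauseMs = 2 * timeoutMs;        // pause is 2x the timeout
        long wakeMs = pauseStartMs + pauseMs; // t = 41 000 when the JVM resumes

        // Deadline moves to 40 000, which is already in the past at t = 41 000.
        System.out.println("one-timeout bump expired: "
            + expired(wakeMs, bumpByOneTimeout(deadlineMs, timeoutMs)));

        // Deadline moves to 60 000, leaving the same 19 000 ms that remained
        // before the pause began.
        System.out.println("pause-length bump expired: "
            + expired(wakeMs, bumpByPause(deadlineMs, pauseMs)));
    }
}
```

With these numbers the one-timeout bump still expires while the pause-length bump does not, which matches the failure mode reported above. How the pause length is actually measured is outside the scope of this sketch.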