[ 
https://issues.apache.org/jira/browse/HDFS-12323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168545#comment-16168545
 ] 

Andrew Wang commented on HDFS-12323:
------------------------------------

Given that this is going into maintenance releases, any reason not to put this 
into branch-3.0 for beta1 also?

> NameNode terminates after full GC thinking QJM unresponsive if full GC is 
> much longer than timeout
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12323
>                 URL: https://issues.apache.org/jira/browse/HDFS-12323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, qjm
>    Affects Versions: 2.7.4
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>             Fix For: 2.9.0, 2.8.3, 2.7.5, 3.1.0
>
>         Attachments: HDFS-12323.000.patch, HDFS-12323.001.patch, 
> HDFS-12323.002.patch, HDFS-12323.003.patch, HDFS-12323.004.patch
>
>
> HDFS-10733 attempted to fix the issue where the NameNode process would 
> terminate itself if it had a GC pause that lasted longer than the QJM 
> timeout, since it would think that the QJM had taken too long to respond. 
> However, that fix only extends the timeout expiration by one timeout length, 
> so if the GC pause is e.g. 2x the length of the timeout, a TimeoutException 
> is still thrown and the NN still terminates itself.
> Thanks to [~yangjiandan] for noting this issue as a comment on HDFS-10733; we 
> have also seen this issue on a real cluster even after HDFS-10733 was applied.
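
The arithmetic behind the description above can be sketched in a few lines of Java. This is only an illustration, not the Hadoop QJM code and not the attached patch: the class DeadlineExtension, its methods, and the 20-second timeout are all hypothetical. It compares extending the deadline by one timeout length (the behavior described for HDFS-10733) with one possible alternative, extending it by the length of the observed pause.

```java
// Illustrative sketch only: not the Hadoop QJM code and not the HDFS-12323 patch.
// All names and numbers here are assumptions made for the example.
public class DeadlineExtension {

    // Behavior described for HDFS-10733: push the deadline out by one timeout length.
    static long extendByOneTimeout(long deadlineMs, long timeoutMs) {
        return deadlineMs + timeoutMs;
    }

    // Hypothetical alternative: push the deadline out by the observed pause length.
    static long extendByPause(long deadlineMs, long pauseMs) {
        return deadlineMs + pauseMs;
    }

    // The caller gives up once the current time has passed the deadline.
    static boolean timedOut(long nowMs, long deadlineMs) {
        return nowMs > deadlineMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 20_000L;            // assumed timeout, chosen arbitrarily
        long deadlineMs = timeoutMs;         // the call started at t = 0
        long pauseStartMs = 1_000L;          // GC pause begins 1 s into the call
        long pauseMs = 2 * timeoutMs;        // pause is 2x the timeout
        long nowMs = pauseStartMs + pauseMs; // wall-clock time when the pause ends

        System.out.println("one-timeout extension timed out: "
                + timedOut(nowMs, extendByOneTimeout(deadlineMs, timeoutMs)));
        System.out.println("pause-length extension timed out: "
                + timedOut(nowMs, extendByPause(deadlineMs, pauseMs)));
    }
}
```

With these numbers the pause ends at 41,000 ms. The one-timeout extension moves the deadline only to 40,000 ms, so the call is still treated as timed out, while the pause-length extension moves it to 60,000 ms and the call survives.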



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
