[ https://issues.apache.org/jira/browse/HDFS-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389178#comment-14389178 ]
Jing Zhao commented on HDFS-7991:
---------------------------------

bq. This is easily fixed by just increasing the timeout or adding other logic such as asking if the NN is still alive, etc.

But it is hard to know whether the NN is still doing the checkpoint or is stuck somewhere else. It is also hard to get a deterministic bound for the timeout value.

bq. The problem is that HADOOP_OPTS has the NN's configuration inside it. So, for example, if a user sets the heap size to 64g

Good catch. I will try to fix this in a later patch.

bq. The code absolutely must shell out another bin/hdfs process to get the proper HADOOP_OPTS setting. I suspect it will actually have to use a subshell plus parameter captures so that the environment is clean due to various export statements throughout the code and in a lot of users' *-env.sh files.

One question here: can we simply capture the value of {{HADOOP_OPTS}} before appending {{HADOOP_NAMENODE_OPTS}} to it, and use the captured value for this checkpoint? This looks equivalent to running a dfsadmin command on the NN's machine.

> Allow users to skip checkpoint when stopping NameNode
> -----------------------------------------------------
>
>                 Key: HDFS-7991
>                 URL: https://issues.apache.org/jira/browse/HDFS-7991
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-7991.000.patch, HDFS-7991.001.patch, HDFS-7991.002.patch, HDFS-7991.003.patch
>
>
> This is a follow-up jira of HDFS-6353. HDFS-6353 adds the functionality to
> check if saving namespace is necessary before stopping namenode. As [~kihwal]
> pointed out in this
> [comment|https://issues.apache.org/jira/browse/HDFS-6353?focusedCommentId=14380898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14380898],
> in a secured cluster this new functionality requires the user to be kinit'ed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
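
For illustration only, the idea raised in the comment (capture HADOOP_OPTS before HADOOP_NAMENODE_OPTS is appended, then reuse the captured value for the checkpoint command) might look roughly like the sketch below. This is not the code from any of the attached patches. The names CLIENT_OPTS and run_with_client_opts are hypothetical, the option values are stand-ins, and the sketch does not address the concern about export statements in users' *-env.sh files.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual Hadoop shell code.

# Stand-in values. In a real installation these would come from hadoop-env.sh.
HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"
HADOOP_NAMENODE_OPTS="-Xmx64g"

# Capture the options BEFORE the NameNode-specific ones are appended.
# CLIENT_OPTS is an assumed variable name.
CLIENT_OPTS="${HADOOP_OPTS}"

# The daemon's options, as the NameNode launch path would build them.
HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_NAMENODE_OPTS}"

# Hypothetical helper: run a command in a subshell with the captured options,
# leaving HADOOP_OPTS in the parent shell untouched.
run_with_client_opts() {
  (
    HADOOP_OPTS="${CLIENT_OPTS}"
    export HADOOP_OPTS
    "$@"
  )
}

# Demonstration: show what a child process would see. A real script would run
# the checkpoint-related dfsadmin command here instead of this placeholder.
CHILD_OPTS="$(run_with_client_opts sh -c 'printf %s "$HADOOP_OPTS"')"
echo "daemon opts: ${HADOOP_OPTS}"
echo "client opts: ${CHILD_OPTS}"
```

With this arrangement the child command does not inherit the NameNode's heap setting, which is the problem described in the quoted text. Whether a plain capture is enough, or a separate bin/hdfs process is needed as the quoted reviewer suggests, remains the open question of the comment.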