[ 
https://issues.apache.org/jira/browse/HDFS-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556586#comment-14556586
 ] 

Suresh Srinivas commented on HDFS-7991:
---------------------------------------

bq. Yup, and in those cases, that's what they pay vendors to fix. For those 
that don't, they roll back to the last good copy and move on. 
The proposal here ensures that no vendor needs to be involved to remove a 
faulty editlog record. (BTW, I have not seen regex issues, only out-of-order 
editlog entries that could not be applied, or editlog records that became too 
big (n^2 growth), so that applying them became laboriously slow.)

bq. All of the discussion up until recently has been about fixing the broken 
bits in the shell code. If we want to switch the discussion to make the 
namenode checkpoint optional when it's sent a kill, that's great. It means we 
can clean out the shell code and make this entirely a Java-level fix, as it 
should be.
We can fix issues in the code. Currently the NN is sent kill -9 after a 
timeout. That needs to be changed to work with an NN shutdown hook. Also, 
implementing an NN shutdown hook and ensuring all the daemon services are 
stopped in the right order, without causing failures to the namespace, requires 
careful design. It also requires putting the namenode into safemode. I think 
doing it outside, as done in the current approach, using save namespace, is 
much simpler and cleaner. But if you want to do it as part of shutdown, you are 
welcome to make that change. If that change takes some time, I prefer the 
current mechanism until it is ready. The current mechanism can be removed when 
a *better* working solution is available.
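
As a rough illustration of the stop sequence discussed above (a graceful 
signal first, so that shutdown hooks get a chance to run, then kill -9 only 
after a timeout), here is a minimal Python sketch. It is not the Hadoop daemon 
script. The function name stop_gracefully and the timeout values are 
hypothetical, chosen only for this example, and it assumes a POSIX system.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, timeout_s: float) -> str:
    """Send SIGTERM, wait up to timeout_s seconds, then fall back to SIGKILL.

    Hypothetical helper for illustration only. Returns "terminated" if the
    process exited after SIGTERM, or "killed" if SIGKILL was needed.
    """
    proc.terminate()  # SIGTERM: the process can still run its shutdown handlers
    try:
        proc.wait(timeout=timeout_s)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL (kill -9): no handlers run in the target process
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Stand-in for a long-running daemon: a child process that just sleeps.
    daemon = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(daemon, timeout_s=5.0))
```

The point of the ordering is that anything done in a shutdown hook only 
happens on the SIGTERM path. Once the fallback to kill -9 is taken, no cleanup 
code runs in the target process at all.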

> Allow users to skip checkpoint when stopping NameNode
> -----------------------------------------------------
>
>                 Key: HDFS-7991
>                 URL: https://issues.apache.org/jira/browse/HDFS-7991
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-7991-shellpart.patch, HDFS-7991.000.patch, 
> HDFS-7991.001.patch, HDFS-7991.002.patch, HDFS-7991.003.patch, 
> HDFS-7991.004.patch
>
>
> This is a follow-up jira of HDFS-6353. HDFS-6353 adds the functionality to 
> check if saving namespace is necessary before stopping namenode. As [~kihwal] 
> pointed out in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-6353?focusedCommentId=14380898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14380898],
>  in a secured cluster this new functionality requires the user to be kinit'ed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
