[ https://issues.apache.org/jira/browse/HDFS-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855748#comment-16855748 ]
He Xiaoqiao commented on HDFS-14513:
------------------------------------

Sorry, I am not familiar with how JUnit interacts with ShutdownHook. For the unit test in [^HDFS-14513.004.patch], the full log below is what is expected before the patch. So does the Assert.fail in the ShutdownHook run after the unit test has finished, and therefore cannot fail the unit test?
{code:java}
2019-06-04 22:16:56,160 [Thread-10] WARN  util.ShutdownHookManager (ShutdownHookManager.java:executeShutdown(131)) - ShutdownHook 'TestSaveNamespace$$Lambda$31/754186396' failed, java.util.concurrent.ExecutionException: java.lang.AssertionError: FSImageSaver checkpoint not clean by ShutdownHook.
java.util.concurrent.ExecutionException: java.lang.AssertionError: FSImageSaver checkpoint not clean by ShutdownHook.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
Caused by: java.lang.AssertionError: FSImageSaver checkpoint not clean by ShutdownHook.
    at org.junit.Assert.fail(Assert.java:88)
    at org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace.lambda$testCleanCheckpointWhenShutdown$0(TestSaveNamespace.java:879)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Disconnected from the target VM, address: '127.0.0.1:59691', transport: 'socket'
{code}

> FSImage which is saving should be clean while NameNode shutdown
> ---------------------------------------------------------------
>
>                 Key: HDFS-14513
>                 URL: https://issues.apache.org/jira/browse/HDFS-14513
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HDFS-14513.001.patch, HDFS-14513.002.patch,
> HDFS-14513.003.patch, HDFS-14513.004.patch
>
>
> Checkpointer/FSImageSaver are regular tasks that dump NameNode meta to disk, at
> most per hour by default. If one receives a command (e.g. transition to
> active in HA mode), it cancels the checkpoint and deletes the tmp files using
> {{FSImage#deleteCancelledCheckpoint}}. However, if the NameNode shuts down during a
> checkpoint, the tmp files are never cleaned up.
> Consider 500m inodes+blocks: a single checkpoint could take 5~10min to finish.
> If we shut down the NameNode during checkpointing, the fsimage checkpoint
> file will never be cleaned, and over a long time many useless
> checkpoint files could accumulate. So I propose that we add a hook to clean
> them up on shutdown.
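
On the question in the comment about Assert.fail inside a ShutdownHook: the stack trace shows the hook running through a FutureTask, with the AssertionError arriving at {{ShutdownHookManager.executeShutdown}} wrapped in an ExecutionException and logged at WARN. The following is only a minimal sketch of that wrapping behaviour using the plain JDK. The class {{HookAssertionDemo}} and the method {{runLikeAHook}} are hypothetical names for illustration, not part of Hadoop or of the patch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HookAssertionDemo {

    /**
     * Hypothetical helper: runs a task on another thread through a Future,
     * similar in shape to what the stack trace shows, and returns whatever
     * the task threw (or null if it completed normally).
     */
    public static Throwable runLikeAHook(Runnable hook) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(hook);
            try {
                future.get();
                return null;
            } catch (ExecutionException e) {
                // The AssertionError is only visible as the cause here. The caller
                // decides whether to log it or rethrow it; it does not propagate
                // to any other thread by itself.
                return e.getCause();
            }
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable failure = runLikeAHook(() -> {
            throw new AssertionError("FSImageSaver checkpoint not clean by ShutdownHook.");
        });
        System.out.println("Captured instead of propagated: " + failure);
    }
}
```

If the real hook behaves like this sketch, an assertion raised inside it would be caught and logged by whoever calls {{Future.get}}, which would be consistent with the log above appearing without the unit test being marked as failed. A test that needs to observe the result would likely have to record it somewhere the test thread can check before the test method returns.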
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org