[ 
https://issues.apache.org/jira/browse/HDFS-11709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987466#comment-15987466
 ] 

Konstantin Shvachko commented on HDFS-11709:
--------------------------------------------

A unit test would've been appropriate for this jira.

> StandbyCheckpointer should handle an non-existing legacyOivImageDir gracefully
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-11709
>                 URL: https://issues.apache.org/jira/browse/HDFS-11709
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>    Affects Versions: 2.6.1
>            Reporter: Zhe Zhang
>            Assignee: Erik Krogen
>            Priority: Critical
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1
>
>         Attachments: HDFS-11709.000.patch
>
>
> In {{StandbyCheckpointer}}, if the legacy OIV directory is not properly 
> created, or was deleted for some reason (e.g. mis-operation), all checkpoint 
> ops will fail. Not only will the ANN stop receiving new fsimages, but the JNs 
> will also fill up with edit log files and cause the NN to crash.
> {code}
>       // Save the legacy OIV image, if the output dir is defined.
>       String outputDir = checkpointConf.getLegacyOivImageDir();
>       if (outputDir != null && !outputDir.isEmpty()) {
>         img.saveLegacyOIVImage(namesystem, outputDir, canceler);
>       }
> {code}
> It doesn't make sense to let such an unimportant step (saving the OIV image) 
> abort all checkpoints and cause an NN crash (and possibly data loss).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

