Todd Lipcon created HDFS-3653:
---------------------------------

Summary: 1.x: Add a retention period for purged edit logs
Key: HDFS-3653
URL: https://issues.apache.org/jira/browse/HDFS-3653
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 1.1.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon

Occasionally we have a bug which causes something to go wrong with edits files. Even more occasionally, the bug is such that the namenode mistakenly deletes an {{edits}} file without merging it into {{fsimage}} properly, e.g., if the bug mistakenly writes an OP_INVALID at the top of the log. In trunk/2.0 we retain many edit log segments going back in time to be more robust to this kind of error.

I'd like to implement something similar (but much simpler) in 1.x, which would be used only by HDFS developers in root-causing or recovering from these rare scenarios: the NN should never directly delete an edit log file. Instead, it should rename the file into some kind of "trash" directory inside the name dir, and associate it with a timestamp. Then, periodically, a separate thread should scan the trash dirs and delete any logs older than a configurable retention time.
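
To make the proposal above concrete, here is a minimal sketch of the rename-then-purge idea. It is not the HDFS-3653 patch and uses no NameNode code: the class name {{EditLogTrash}}, the {{trash}} directory name, the timestamp-suffix naming scheme, and the method names are all hypothetical, chosen only for illustration. It also assumes Java 8 and {{java.nio.file}}, which the 1.x line itself may not be able to rely on.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a per-name-dir trash for edit logs. */
public class EditLogTrash {
    // Assumed directory name; the issue only says "some kind of trash directory".
    static final String TRASH_DIR = "trash";

    private final Path trashDir;
    private final long retentionMillis;

    public EditLogTrash(Path nameDir, long retentionMillis) throws IOException {
        this.trashDir = Files.createDirectories(nameDir.resolve(TRASH_DIR));
        this.retentionMillis = retentionMillis;
    }

    /** Called where the NN would otherwise delete: rename into trash, tagged with a timestamp. */
    public Path retire(Path editsFile, long nowMillis) throws IOException {
        // Assumed naming scheme: <original name>.<retire time in millis>
        Path target = trashDir.resolve(editsFile.getFileName() + "." + nowMillis);
        return Files.move(editsFile, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Deletes trashed logs older than the retention period; returns how many were removed. */
    public int purgeExpired(long nowMillis) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(trashDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                int dot = name.lastIndexOf('.');
                if (dot < 0) {
                    continue; // no timestamp suffix, not ours
                }
                long retiredAt;
                try {
                    retiredAt = Long.parseLong(name.substring(dot + 1));
                } catch (NumberFormatException e) {
                    continue; // suffix is not a timestamp, leave the file alone
                }
                if (nowMillis - retiredAt > retentionMillis) {
                    Files.delete(entry);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Starts the "separate thread" that periodically scans the trash dir. */
    public ScheduledExecutorService startPurger(long periodMillis) {
        ScheduledExecutorService purger = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "edit-log-trash-purger");
            t.setDaemon(true); // must not keep the process alive
            return t;
        });
        purger.scheduleWithFixedDelay(() -> {
            try {
                purgeExpired(System.currentTimeMillis());
            } catch (IOException | RuntimeException e) {
                // A real implementation would log this; the next run retries.
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return purger;
    }
}
```

Encoding the timestamp in the file name, rather than relying on the file's modification time, is one possible way to "associate it with a timestamp": a rename generally preserves the modification time, so mtime would reflect the last write to the log rather than the moment it was retired.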