Zhe Zhang created HDFS-8303:
-------------------------------

             Summary: QJM should purge old logs in the current directory 
through FJM
                 Key: HDFS-8303
                 URL: https://issues.apache.org/jira/browse/HDFS-8303
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Zhe Zhang
            Assignee: Zhe Zhang


As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic for purging the current directory is very similar to the 
FJM purging logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
      ImmutableList.of(
        Pattern.compile("edits_\\d+-(\\d+)"),
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
          long txid = Long.parseLong(matcher.group(1));
          if (txid < minTxIdToKeep) {
            LOG.info("Purging no-longer needed file " + txid);
            if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
    NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
    NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
      NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
    List<EditLogFile> editLogs = matchEditLogs(files, true);
    for (EditLogFile log : editLogs) {
      if (log.getFirstTxId() < minTxIdToKeep &&
          log.getLastTxId() < minTxIdToKeep) {
        purger.purgeLog(log);
      }
    }
{code}

I can see 2 differences:
# FJM has a slightly stricter match for empty/corrupt in-progress files: the 
suffix must end with a non-whitespace character
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic. 
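
To make the two differences concrete, here is a minimal, self-contained sketch that applies both sets of patterns to a few file names. It is not the actual Hadoop code: the class and method names (PurgeRuleSketch, qjmWouldPurge, fjmWouldPurge) are hypothetical, the literal prefixes "edits" and "edits_inprogress" are assumed values for the NameNodeFile names, and the sketch assumes that both sides use full-string matching (Matcher.matches()) and that FJM tries its patterns in the order shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch comparing the two purge rules quoted above.
class PurgeRuleSketch {
  // QJM-style patterns, copied from the excerpt.
  static final Pattern QJM_FINALIZED = Pattern.compile("edits_\\d+-(\\d+)");
  static final Pattern QJM_INPROGRESS =
      Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?");

  // FJM-style patterns; the literal prefixes are ASSUMED to be what
  // NameNodeFile.EDITS.getName() / EDITS_INPROGRESS.getName() return.
  static final Pattern FJM_FINALIZED = Pattern.compile("edits_(\\d+)-(\\d+)");
  static final Pattern FJM_INPROGRESS = Pattern.compile("edits_inprogress_(\\d+)");
  static final Pattern FJM_STALE =
      Pattern.compile("edits_inprogress_(\\d+).*(\\S+)");

  // QJM rule: compare the single captured txid (the END txid for finalized files).
  static boolean qjmWouldPurge(String name, long minTxIdToKeep) {
    for (Pattern p : new Pattern[] {QJM_FINALIZED, QJM_INPROGRESS}) {
      Matcher m = p.matcher(name);
      if (m.matches()) { // ASSUMPTION: full-string match
        return Long.parseLong(m.group(1)) < minTxIdToKeep;
      }
    }
    return false;
  }

  // FJM rule: finalized files need BOTH start and end txid below the threshold.
  static boolean fjmWouldPurge(String name, long minTxIdToKeep) {
    Matcher fin = FJM_FINALIZED.matcher(name);
    if (fin.matches()) {
      long first = Long.parseLong(fin.group(1));
      long last = Long.parseLong(fin.group(2));
      return first < minTxIdToKeep && last < minTxIdToKeep;
    }
    // SIMPLIFICATION: in-progress and stale files are judged by start txid only.
    for (Pattern p : new Pattern[] {FJM_INPROGRESS, FJM_STALE}) {
      Matcher m = p.matcher(name);
      if (m.matches()) {
        return Long.parseLong(m.group(1)) < minTxIdToKeep;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    long minTxIdToKeep = 100;
    String[] names = {
      "edits_1-50",               // well-formed, old
      "edits_200-50",             // malformed: start > end
      "edits_inprogress_5.empty", // stale in-progress file
      "edits_inprogress_5.bak "   // suffix ends in whitespace
    };
    for (String name : names) {
      System.out.println("[" + name + "] QJM purges: "
          + qjmWouldPurge(name, minTxIdToKeep)
          + ", FJM purges: " + fjmWouldPurge(name, minTxIdToKeep));
    }
  }
}
```

Under these assumptions the two rules agree on well-formed names such as edits_1-50 and edits_inprogress_5.empty. They diverge on edits_200-50, where only the end txid is below the threshold, and on a name whose suffix ends in whitespace: the QJM-style rule would purge both, while the FJM-style rule would leave them alone.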



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
