Wei Yan created HDFS-12800:
------------------------------

             Summary: Potential disk/block missing when DataNode upgrade with 
data layout changed
                 Key: HDFS-12800
                 URL: https://issues.apache.org/jira/browse/HDFS-12800
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Wei Yan
            Assignee: Wei Yan


During an upgrade that changes the data layout, we found that some disks were 
not converted to the new layout version, which left some blocks missing. The 
root cause is a race condition in the doUpgrade process.

In the current loadBlockPoolSliceStorage implementation in DataStorage.java, 
the for-loop over the datadirs restores trash, generates the upgrade tasks for 
each datadir, and submits those tasks to the executor at the end of each 
iteration.
{code}
    for (StorageLocation dataDir : dataDirs) {
      dataDir.makeBlockPoolDir(bpid, null);
      try {
        final List<Callable<StorageDirectory>> callables = Lists.newArrayList();
        final List<StorageDirectory> dirs = bpStorage.recoverTransitionRead(
            nsInfo, dataDir, startOpt, callables, datanode.getConf());
        if (callables.isEmpty()) {
          ......
        } else {
          for(Callable<StorageDirectory> c : callables) {
            tasks.add(new UpgradeTask(dataDir, executor.submit(c)));
          }
        }
      } catch (IOException e) {
        ......
      }
    }
{code}

The doUpgrade task itself updates the layoutVersion variable:
{code}
this.layoutVersion = HdfsServerConstants.DATANODE_LAYOUT_VERSION;
{code}
Because a task for an earlier datadir may already have run, this breaks the 
upgrade task generation for the other datadirs (BlockPoolSliceStorage.java). 
Once layoutVersion has been overwritten, the second if condition can evaluate 
to false, so those disks are not added to the upgrade task list. As a result, 
only some of the disks are upgraded to the new layout format and the rest are 
not. Restarting the DataNodes reduces the number of missing disks/blocks.
{code}
    if (this.layoutVersion > HdfsServerConstants.DATANODE_LAYOUT_VERSION) {
      int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
      LOG.info("Restored " + restored + " block files from trash " +
        "before the layout upgrade. These blocks will be moved to " +
        "the previous directory during the upgrade");
    }
    if (this.layoutVersion > HdfsServerConstants.DATANODE_LAYOUT_VERSION
        || this.cTime < nsInfo.getCTime()) {
      doUpgrade(sd, nsInfo, callables, conf); // upgrade
      return true;
    }
{code}
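
The interleaving described above can be reproduced in isolation. The following is a minimal, self-contained sketch and not the HDFS code: the class LayoutRaceSketch, its methods, the directory names, and the version numbers are all hypothetical, and the call to Future.get() is there only to force the unlucky ordering deterministically. The second method shows one possible mitigation (checking against a value captured before any task is submitted); it is an assumption for illustration, not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LayoutRaceSketch {
    // Illustrative values only, chosen so that the on-disk version is greater
    // than the software version, matching the condition quoted above.
    static final int SOFTWARE_LAYOUT_VERSION = -57;
    static final int ON_DISK_LAYOUT_VERSION = -56;

    // One field shared by the checks for every data dir (hypothetical stand-in
    // for the layoutVersion field discussed in the report).
    private volatile int layoutVersion = ON_DISK_LAYOUT_VERSION;

    /** Racy variant: every check reads the field that an earlier task may have overwritten. */
    public List<String> planWithSharedField(List<String> dataDirs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> scheduled = new ArrayList<>();
        try {
            for (String dir : dataDirs) {
                if (layoutVersion > SOFTWARE_LAYOUT_VERSION) {
                    scheduled.add(dir);
                    // The "upgrade task" overwrites the shared field.
                    Future<?> task = executor.submit(() -> {
                        layoutVersion = SOFTWARE_LAYOUT_VERSION;
                    });
                    // Force the bad interleaving: the task finishes before
                    // the next data dir is checked.
                    task.get();
                }
            }
        } finally {
            executor.shutdown();
        }
        return scheduled;
    }

    /** Possible mitigation: decide against a value captured before any task runs. */
    public List<String> planWithSnapshot(List<String> dataDirs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> scheduled = new ArrayList<>();
        final int versionAtStart = layoutVersion; // snapshot taken once, up front
        try {
            for (String dir : dataDirs) {
                if (versionAtStart > SOFTWARE_LAYOUT_VERSION) {
                    scheduled.add(dir);
                    executor.submit(() -> {
                        layoutVersion = SOFTWARE_LAYOUT_VERSION;
                    }).get();
                }
            }
        } finally {
            executor.shutdown();
        }
        return scheduled;
    }

    public static void main(String[] args) throws Exception {
        List<String> dirs = Arrays.asList("/data/1", "/data/2", "/data/3");
        // Only the first dir is scheduled once the field has been overwritten.
        System.out.println(new LayoutRaceSketch().planWithSharedField(dirs)); // [/data/1]
        // All dirs are scheduled when the check uses the snapshot.
        System.out.println(new LayoutRaceSketch().planWithSnapshot(dirs));    // [/data/1, /data/2, /data/3]
    }
}
```

In a real run the outcome depends on thread timing, which would explain why only some disks are skipped and why the set of skipped disks can differ between restarts.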


