[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286128#comment-14286128 ]
Arpit Agarwal commented on HDFS-7645: ------------------------------------- The first restore was by design when the rolling upgrade feature was added (HDFS-6005). It simplified the rollback procedure by not requiring the {{-rollback}} flag to the DataNode, so regular startup/rollback could be treated similarly by restoring from trash. HDFS-6800 added back the requirement to pass the {{-rollback}} flag during RU rollback, to support layout changes. The second restore was a side effect of the same fix. We can probably eliminate both restores now. DN layout changes will be rare for minor/point releases. I am wary of eliminating trash without some numbers showing hard link performance with millions of blocks is on par with trash. Even a few seconds per DN adds up to many hours/days when upgrading thousands of DNs sequentially. Once we fix this issue raised by Nathan the overhead of trash as compared to regular startup is nil. > Rolling upgrade is restoring blocks from trash multiple times > ------------------------------------------------------------- > > Key: HDFS-7645 > URL: https://issues.apache.org/jira/browse/HDFS-7645 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode > Affects Versions: 2.6.0 > Reporter: Nathan Roberts > > When performing an HDFS rolling upgrade, the trash directory is getting > restored twice when under normal circumstances it shouldn't need to be > restored at all. iiuc, the only time these blocks should be restored is if we > need to rollback a rolling upgrade. > On a busy cluster, this can cause significant and unnecessary block churn > both on the datanodes, and more importantly in the namenode. > The two times this happens are: > 1) restart of DN onto new software > {code} > private void doTransition(DataNode datanode, StorageDirectory sd, > NamespaceInfo nsInfo, StartupOption startOpt) throws IOException { > if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) { > Preconditions.checkState(!getTrashRootDir(sd).exists(), > sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not > " + > " both be present."); > doRollback(sd, nsInfo); // rollback if applicable > } else { > // Restore all the files in the trash. The restored files are retained > // during rolling upgrade rollback. They are deleted during rolling > // upgrade downgrade. > int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd)); > LOG.info("Restored " + restored + " block files from trash."); > } > {code} > 2) When heartbeat response no longer indicates a rollingupgrade is in progress > {code} > /** > * Signal the current rolling upgrade status as indicated by the NN. > * @param inProgress true if a rolling upgrade is in progress > */ > void signalRollingUpgrade(boolean inProgress) throws IOException { > String bpid = getBlockPoolId(); > if (inProgress) { > dn.getFSDataset().enableTrash(bpid); > dn.getFSDataset().setRollingUpgradeMarker(bpid); > } else { > dn.getFSDataset().restoreTrash(bpid); > dn.getFSDataset().clearRollingUpgradeMarker(bpid); > } > } > {code} > HDFS-6800 and HDFS-6981 were modifying this behavior making it not completely > clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)