[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004464#comment-15004464 ]
Vinod Kumar Vavilapalli commented on HDFS-7645: ----------------------------------------------- Hey folks, Trying to the understand the compatibility story here in the context of 2.7.2, without much of a domain expertise. This JIRA is marked as incompatible. Now, is the combination of this JIRA (HDFS-7645) and the follow up JIRA (HDFS-8656) plus the newly issue discovered issue+fix (HDFS-9426) considered a compatible change? If not, does it make sense to drop all the three from branch-2 and only put them on trunk? > Rolling upgrade is restoring blocks from trash multiple times > ------------------------------------------------------------- > > Key: HDFS-7645 > URL: https://issues.apache.org/jira/browse/HDFS-7645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 2.6.0 > Reporter: Nathan Roberts > Assignee: Keisuke Ogiwara > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, > HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch, > HDFS-7645.06.patch, HDFS-7645.07.patch > > > When performing an HDFS rolling upgrade, the trash directory is getting > restored twice when under normal circumstances it shouldn't need to be > restored at all. iiuc, the only time these blocks should be restored is if we > need to rollback a rolling upgrade. > On a busy cluster, this can cause significant and unnecessary block churn > both on the datanodes, and more importantly in the namenode. > The two times this happens are: > 1) restart of DN onto new software > {code} > private void doTransition(DataNode datanode, StorageDirectory sd, > NamespaceInfo nsInfo, StartupOption startOpt) throws IOException { > if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) { > Preconditions.checkState(!getTrashRootDir(sd).exists(), > sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not > " + > " both be present."); > doRollback(sd, nsInfo); // rollback if applicable > } else { > // Restore all the files in the trash. The restored files are retained > // during rolling upgrade rollback. They are deleted during rolling > // upgrade downgrade. > int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd)); > LOG.info("Restored " + restored + " block files from trash."); > } > {code} > 2) When heartbeat response no longer indicates a rollingupgrade is in progress > {code} > /** > * Signal the current rolling upgrade status as indicated by the NN. > * @param inProgress true if a rolling upgrade is in progress > */ > void signalRollingUpgrade(boolean inProgress) throws IOException { > String bpid = getBlockPoolId(); > if (inProgress) { > dn.getFSDataset().enableTrash(bpid); > dn.getFSDataset().setRollingUpgradeMarker(bpid); > } else { > dn.getFSDataset().restoreTrash(bpid); > dn.getFSDataset().clearRollingUpgradeMarker(bpid); > } > } > {code} > HDFS-6800 and HDFS-6981 were modifying this behavior making it not completely > clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)