[ https://issues.apache.org/jira/browse/HDFS-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227375#comment-15227375 ]
Yongjun Zhang commented on HDFS-10263:
--------------------------------------

The following code
{code}
  /**
   * Recursively compute the difference between snapshots under a given
   * directory/file.
   * @param snapshotRoot The directory where snapshots were taken.
   * @param node The directory/file under which the diff is computed.
   * @param parentPath Relative path (corresponding to the snapshot root) of
   *                   the node's parent.
   * @param diffReport data structure used to store the diff.
   */
  private void computeDiffRecursively(final INodeDirectory snapshotRoot,
      INode node, List<byte[]> parentPath, SnapshotDiffInfo diffReport) {
    final Snapshot earlierSnapshot = diffReport.isFromEarlier() ?
        diffReport.getFrom() : diffReport.getTo();
    final Snapshot laterSnapshot = diffReport.isFromEarlier() ?
        diffReport.getTo() : diffReport.getFrom();
    byte[][] relativePath = parentPath.toArray(new byte[parentPath.size()][]);
    if (node.isDirectory()) {
      final ChildrenDiff diff = new ChildrenDiff();
      INodeDirectory dir = node.asDirectory();
      DirectoryWithSnapshotFeature sf = dir.getDirectoryWithSnapshotFeature();
      if (sf != null) {
        boolean change = sf.computeDiffBetweenSnapshots(earlierSnapshot,
            laterSnapshot, diff, dir);
        if (change) {
          diffReport.addDirDiff(dir, relativePath, diff);
        }
      }
      ReadOnlyList<INode> children = dir.getChildrenList(earlierSnapshot
          .getId());
      for (INode child : children) {
        final byte[] name = child.getLocalNameBytes();
        boolean toProcess = diff.searchIndex(ListType.DELETED, name) < 0;
        if (!toProcess && child instanceof INodeReference.WithName) {
          byte[][] renameTargetPath = findRenameTargetPath(
              snapshotRoot, (WithName) child,
              laterSnapshot == null ?
                  Snapshot.CURRENT_STATE_ID : laterSnapshot.getId());
          if (renameTargetPath != null) {
            toProcess = true;
            diffReport.setRenameTarget(child.getId(), renameTargetPath);
          }
        }
        if (toProcess) {
          parentPath.add(name);
          computeDiffRecursively(snapshotRoot, child, parentPath, diffReport);
          parentPath.remove(parentPath.size() - 1);
        }
      }
    } else if (node.isFile() && node.asFile().isWithSnapshot()) {
      INodeFile file = node.asFile();
      boolean change = file.getFileWithSnapshotFeature()
          .changedBetweenSnapshots(file, earlierSnapshot, laterSnapshot);
      if (change) {
        diffReport.addFileDiff(file, relativePath);
      }
    }
  }
{code}
computes earlierSnapshot and laterSnapshot, then calls
{code}
boolean change = sf.computeDiffBetweenSnapshots(earlierSnapshot,
    laterSnapshot, diff, dir);
{code}
for both forward and backward diff calculation. The bug may be in the related code.

> Reversed snapshot diff report contains incorrect entries
> --------------------------------------------------------
>
>                 Key: HDFS-10263
>                 URL: https://issues.apache.org/jira/browse/HDFS-10263
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>
> Steps to reproduce:
> 1. Take a snapshot s1 at:
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/foo/f1
> {code}
> 2. Make the following change:
> {code}
>   private int changeData7(Path dir) throws Exception {
>     final Path foo = new Path(dir, "foo");
>     final Path foo2 = new Path(dir, "foo2");
>     final Path foo_f1 = new Path(foo, "f1");
>     final Path foo2_f2 = new Path(foo2, "f2");
>     final Path foo2_f1 = new Path(foo2, "f1");
>     final Path foo_d1 = new Path(foo, "d1");
>     final Path foo_d1_f3 = new Path(foo_d1, "f3");
>     int numDeletedAndModified = 0;
>     dfs.rename(foo, foo2);
>     dfs.delete(foo2_f1, true);
>
>     DFSTestUtil.createFile(dfs, foo_f1, BLOCK_SIZE, DATA_NUM, 0L);
>     DFSTestUtil.appendFile(dfs, foo_f1, (int) BLOCK_SIZE);
>     dfs.rename(foo_f1, foo2_f2);
>     numDeletedAndModified += 1; // "M ./foo"
>     DFSTestUtil.createFile(dfs, foo_d1_f3, BLOCK_SIZE, DATA_NUM, 0L);
>     return numDeletedAndModified;
>   }
> {code}
> that results in
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo/d1
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/foo/d1/f3
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo2
> -rw-r--r-- 1 yzhang supergroup 2048 2016-04-05 14:48 /target/foo2/f2
> {code}
> 3. Take snapshot s2 here.
> 4. Do the following to revert the change done in step 2:
> {code}
>   private int revertChangeData7(Path dir) throws Exception {
>     final Path foo = new Path(dir, "foo");
>     final Path foo2 = new Path(dir, "foo2");
>     final Path foo_f1 = new Path(foo, "f1");
>     final Path foo2_f2 = new Path(foo2, "f2");
>     final Path foo2_f1 = new Path(foo2, "f1");
>     final Path foo_d1 = new Path(foo, "d1");
>     final Path foo_d1_f3 = new Path(foo_d1, "f3");
>     int numDeletedAndModified = 0;
>
>     dfs.delete(foo_d1, true);
>     dfs.rename(foo2_f2, foo_f1);
>
>     dfs.delete(foo, true);
>
>     DFSTestUtil.createFile(dfs, foo2_f1, BLOCK_SIZE, DATA_NUM, 0L);
>     DFSTestUtil.appendFile(dfs, foo2_f1, (int) BLOCK_SIZE);
>     dfs.rename(foo2, foo);
>
>     return numDeletedAndModified;
>   }
> {code}
> that gives the following result:
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> -rw-r--r-- 1 yzhang supergroup 2048 2016-04-05 14:48 /target/foo/f1
> {code}
> 5. Take snapshot s3 here.
> Below are the diffs between the snapshots:
> {code}
> s1-s2: Difference between snapshot s1 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo
> + ./foo/f2
> - ./foo/f1
> s2-s1: Difference between snapshot s2 and snapshot s1 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo
> - ./foo/f2
> + ./foo/f1
> s2-s3: Difference between snapshot s2 and snapshot s3 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo2
> + ./foo2/f1
> - ./foo2/f2
> s3-s2: Difference between snapshot s3 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo2
> - ./foo2/f1
> + ./foo2/f2
> {code}
> The s2-s1 diff is supposed to be the same as s2-s3, because the change
> from s2 to s3 is an exact reversion of the change from s1 to s2. We can see
> that s1 and s3 have the same file structure.
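The expectation that s2-s1 should match s2-s3 can be checked mechanically. Below is a minimal sketch, assuming the report entries are compared as an unordered set of lines. DiffReportCompare and its methods are hypothetical helpers for illustration, not part of Hadoop; the entries are copied from the s2-s1 and s2-s3 reports above.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hadoop: compares two snapshot diff
// reports as unordered sets of entry lines.
class DiffReportCompare {

  /** Collects the trimmed, non-blank entry lines of a report body. */
  static Set<String> entries(String... lines) {
    Set<String> out = new TreeSet<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (!trimmed.isEmpty()) {
        out.add(trimmed);
      }
    }
    return out;
  }

  /** Returns the entries present in expected but absent from actual. */
  static Set<String> missing(Set<String> expected, Set<String> actual) {
    Set<String> diff = new TreeSet<>(expected);
    diff.removeAll(actual);
    return diff;
  }

  public static void main(String[] args) {
    // Entries of the s2-s1 report as listed in the issue description.
    Set<String> s2s1 = entries("M .", "- ./foo", "R ./foo2 -> ./foo",
        "M ./foo", "- ./foo/f2", "+ ./foo/f1");
    // Entries of the s2-s3 report as listed in the issue description.
    Set<String> s2s3 = entries("M .", "- ./foo", "R ./foo2 -> ./foo",
        "M ./foo2", "+ ./foo2/f1", "- ./foo2/f2");

    System.out.println("reports equal: " + s2s1.equals(s2s3));
    System.out.println("in s2-s3 but not in s2-s1: " + missing(s2s3, s2s1));
    System.out.println("in s2-s1 but not in s2-s3: " + missing(s2s1, s2s3));
  }
}
```

Comparing as sets ignores the order of the entries within a report, so only entries that actually differ are flagged. A unit test could apply the same comparison to the entries of the two reports returned by the file system instead of to hard-coded strings.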
> However, the result shown above is not. I expect that the following part
> {code}
> M ./foo
> - ./foo/f2
> + ./foo/f1
> {code}
> in the s2-s1 diff should be
> {code}
> M ./foo2
> + ./foo2/f1
> - ./foo2/f2
> {code}
> (same as in s2-s3)
> instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)