umamaheswararao commented on code in PR #5885:
URL: https://github.com/apache/hadoop/pull/5885#discussion_r1274301605


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/SimpleCopyListing.java:
##########
@@ -316,6 +303,42 @@ protected void doBuildListingWithSnapshotDiff(
     }
   }
 
+  /**
+   * Handle create Diffs and add to the copyList.
+   * If the path is a directory, iterate it recursively and add the paths
+   * to the result copyList.
+   *
+   * @param fileListWriter the list for holding processed results
+   * @param context The DistCp context with associated input options
+   * @param sourceRoot The rootDir of the source snapshot
+   * @param sourceFS the source Filesystem
+   * @param fileStatuses store the result fileStatuses to add to the copyList
+   * @param diff the SnapshotDiff report
+   * @throws IOException
+   */
+  protected void addCreateDiffsToFileListing(SequenceFile.Writer fileListWriter,
+      DistCpContext context, Path sourceRoot, FileSystem sourceFS,
+      List<FileStatusInfo> fileStatuses, DiffInfo diff) throws IOException {
+    addToFileListing(fileListWriter, sourceRoot, diff.getTarget(), context);
+
+    FileStatus sourceStatus = sourceFS.getFileStatus(diff.getTarget());

Review Comment:
   Have you thought about just having an advanced flag to control this? I am not sure we will have many implementations of these copy listings.
   I am not against the current design; it's just a thought to keep things simple.
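
   For concreteness, a rough sketch of what the flag variant could look like. The key name and helper class below are purely hypothetical and not part of this patch; they only stand in for the idea of a single boolean switch instead of loading a listing class by name via CONF_LABEL_DIFF_COPY_LISTING_CLASS.

```java
import org.apache.hadoop.conf.Configuration;

/**
 * Sketch only: a hypothetical "advanced flag" read from the job configuration,
 * as an alternative to a pluggable copy-listing class.
 */
public final class DiffListingSwitch {
  /** Hypothetical key; not an existing DistCp constant. */
  public static final String USE_SIMPLE_DIFF_LISTING =
      "distcp.diff.use.simple.copy.listing";

  private DiffListingSwitch() {
  }

  /**
   * Returns true if the snapshot-diff sync should fall back to the plain
   * SimpleCopyListing traversal rather than a store-specific listing.
   */
  public static boolean useSimpleDiffListing(Configuration conf) {
    return conf.getBoolean(USE_SIMPLE_DIFF_LISTING, false);
  }
}
```

   The trade-off, of course, is that a boolean can only toggle between implementations already shipped with distcp, whereas the class-name config lets an external store plug in its own listing.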



##########
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestCopyListing.java:
##########
@@ -167,6 +169,77 @@ public void testDuplicates() {
     }
   }
 
+  @Test(expected = DuplicateFileException.class, timeout = 10000)
+  public void testDiffBasedSimpleCopyListing() throws IOException {
+    FileSystem fs = null;
+    Configuration configuration = getConf();
+    DistCpSync distCpSync = Mockito.mock(DistCpSync.class);
+    Path listingFile = new Path("/tmp/list");
+    // Throws DuplicateFileException when copyListing is SimpleCopyListing
+    // as it recursively traverses src3 directory and also adds 3.txt,4.txt twice
+    configuration.set(DistCpConstants.CONF_LABEL_DIFF_COPY_LISTING_CLASS,
+        SimpleCopyListing.class.getName());

Review Comment:
   How should this config be configured when we want to copy from HDFS to Ozone or vice versa?
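
   For illustration, a rough sketch of how a caller might pin the listing class for a cross-filesystem, diff-based copy. The Ozone URI, snapshot names, paths, and the choice of SimpleCopyListing are placeholders, not a recommendation; the open question is what a user is actually expected to set here.

```java
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpConstants;
import org.apache.hadoop.tools.DistCpOptions;
import org.apache.hadoop.tools.SimpleCopyListing;

/** Illustrative only: URIs, snapshot names and the listing class are placeholders. */
public class HdfsToOzoneDiffCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Pin the diff-based copy listing implementation explicitly.
    conf.set(DistCpConstants.CONF_LABEL_DIFF_COPY_LISTING_CLASS,
        SimpleCopyListing.class.getName());

    // Equivalent of: distcp -update -diff snap1 snap2 hdfs://... ofs://...
    DistCpOptions options = new DistCpOptions.Builder(
        Collections.singletonList(new Path("hdfs://nn:8020/src")),
        new Path("ofs://ozone1/vol1/bucket1/dst"))
        .withSyncFolder(true)
        .withUseDiff("snap1", "snap2")
        .build();
    new DistCp(conf, options).execute();
  }
}
```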



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

