jkff commented on a change in pull request #4145: Many simplifications to WriteFiles
URL: https://github.com/apache/beam/pull/4145#discussion_r153291513
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
 ##########
 @@ -686,40 +756,38 @@ public int compare(
      */
     @VisibleForTesting
     @Experimental(Kind.FILESYSTEM)
-    final void copyToOutputFiles(
+    final void moveToOutputFiles(
         List<KV<FileResult<DestinationT>, ResourceId>> resultsToFinalFilenames) throws IOException {
       int numFiles = resultsToFinalFilenames.size();
-      if (numFiles > 0) {
-        LOG.debug("Copying {} files.", numFiles);
-        List<ResourceId> srcFiles = new ArrayList<>(resultsToFinalFilenames.size());
-        List<ResourceId> dstFiles = new ArrayList<>(resultsToFinalFilenames.size());
-        for (KV<FileResult<DestinationT>, ResourceId> entry : resultsToFinalFilenames) {
-          srcFiles.add(entry.getKey().getTempFilename());
-          dstFiles.add(entry.getValue());
-          LOG.info(
-              "Will copy temporary file {} to final location {}",
-              entry.getKey().getTempFilename(),
-              entry.getValue());
-        }
-        // During a failure case, files may have been deleted in an earlier step. Thus
-        // we ignore missing files here.
-        FileSystems.copy(srcFiles, dstFiles, StandardMoveOptions.IGNORE_MISSING_FILES);
-      } else {
-        LOG.info("No output files to write.");
+      LOG.debug("Copying {} files.", numFiles);
+      List<ResourceId> srcFiles = new ArrayList<>(resultsToFinalFilenames.size());
+      List<ResourceId> dstFiles = new ArrayList<>(resultsToFinalFilenames.size());
 
 Review comment:
   Seems like overkill: the two lists are created right next to each other in
the code, and FileSystems.copy() already does that verification. I removed the
size hints to make it a little simpler (preallocation probably doesn't matter
here).
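
   For illustration only, here is a minimal plain-Java sketch of the shape
described above. It does not use Beam types: `ParallelListsSketch`, `copyAll`,
and the use of `String` paths are hypothetical stand-ins, and `copyAll` only
mimics the size check that the comment says FileSystems.copy() performs. The
two lists are built side by side without size hints, and the callee validates
that they match, so the caller does not repeat the check.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ParallelListsSketch {

  // Hypothetical stand-in for a bulk copy call that validates its own inputs.
  // It only checks that both lists have the same length and returns the count.
  static int copyAll(List<String> srcFiles, List<String> dstFiles) {
    if (srcFiles.size() != dstFiles.size()) {
      throw new IllegalArgumentException(
          "Number of source files " + srcFiles.size()
              + " must equal number of destination files " + dstFiles.size());
    }
    return srcFiles.size();
  }

  // Builds the two parallel lists next to each other, with no size hints
  // and no extra size check in the caller.
  static int moveToOutputFiles(List<Map.Entry<String, String>> resultsToFinalFilenames) {
    List<String> srcFiles = new ArrayList<>();
    List<String> dstFiles = new ArrayList<>();
    for (Map.Entry<String, String> entry : resultsToFinalFilenames) {
      srcFiles.add(entry.getKey());
      dstFiles.add(entry.getValue());
    }
    return copyAll(srcFiles, dstFiles);
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> results = new ArrayList<>();
    results.add(new SimpleEntry<>("tmp/part-0", "out/part-0"));
    results.add(new SimpleEntry<>("tmp/part-1", "out/part-1"));
    System.out.println("Files handled: " + moveToOutputFiles(results));
  }
}
```

   Because both lists are filled in the same loop iteration, their sizes can
only diverge through a bug in the callee's inputs elsewhere, which is the case
the callee's own check already covers.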

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
