ashutoshc commented on a change in pull request #552: Hive 21279
URL: https://github.com/apache/hive/pull/552#discussion_r260597741
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
 ##########
 @@ -1476,42 +1500,31 @@ private static String replaceTaskIdFromFilename(String filename, String oldTaskI
   }
 
   public static void mvFileToFinalPath(Path specPath, Configuration hconf,
 -      boolean success, Logger log, DynamicPartitionCtx dpCtx, FileSinkDesc conf,
 -      Reporter reporter) throws IOException,
 +                                       boolean success, Logger log, DynamicPartitionCtx dpCtx, FileSinkDesc conf,
 +                                       Reporter reporter) throws IOException,
       HiveException {
 
-    //
 -    // Runaway task attempts (which are unable to be killed by MR/YARN) can cause HIVE-17113,
 -    // where they can write duplicate output files to tmpPath after de-duplicating the files,
 -    // but before tmpPath is moved to specPath.
 -    // Fixing this issue will be done differently for blobstore (e.g. S3)
 -    // vs non-blobstore (local filesystem, HDFS) filesystems due to differences in
 -    // implementation - a directory move in a blobstore effectively results in file-by-file
 -    // moves for every file in a directory, while in HDFS/localFS a directory move is just a
 -    // single filesystem operation.
-    // - For non-blobstore FS, do the following:
-    //   1) Rename tmpPath to a new directory name to prevent additional files
-    //      from being added by runaway processes.
-    //   2) Remove duplicates from the temp directory
-    //   3) Rename/move the temp directory to specPath
-    //
-    // - For blobstore FS, do the following:
-    //   1) Remove duplicates from tmpPath
 -    //   2) Use moveSpecifiedFiles() to perform a file-by-file move of the de-duped files
 -    //      to specPath. On blobstore FS, assuming n files in the directory, this results
 -    //      in n file moves, compared to 2*n file moves with the previous solution
 -    //      (each directory move would result in a file-by-file move of the files in the directory)
-    //
 +    // There are following two paths this could could take based on the value of shouldAvoidRename
 
 Review comment:
   The rest of the earlier comment still applies for true. We can retain that.
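
   For reference, the non-blobstore sequence described in the removed comment (rename the temp directory, remove duplicates, move to the final path) could be sketched roughly as below. This is only an illustration, not the code in Utilities.java: the class name, the "<taskId>_<attemptId>" file naming, the ".moved" suffix, and the rule of keeping the largest file per task are all assumptions made for this sketch, and it uses plain java.nio rather than the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and rules here are assumptions, not Hive's API.
final class FinalPathSketch {

  // Assumed naming scheme for this sketch: "<taskId>_<attemptId>".
  static String taskIdOf(Path file) {
    String name = file.getFileName().toString();
    int idx = name.lastIndexOf('_');
    return idx < 0 ? name : name.substring(0, idx);
  }

  static void commit(Path tmpPath, Path specPath) throws IOException {
    // 1) Rename tmpPath so runaway attempts can no longer add files to it.
    Path frozen = tmpPath.resolveSibling(tmpPath.getFileName() + ".moved");
    Files.move(tmpPath, frozen);

    // Snapshot the directory listing before deleting anything.
    List<Path> files = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(frozen)) {
      for (Path f : stream) {
        files.add(f);
      }
    }

    // 2) Remove duplicates: keep one file per task id
    //    (the largest one, which is an assumption of this sketch).
    Map<String, Path> kept = new HashMap<>();
    for (Path f : files) {
      String taskId = taskIdOf(f);
      Path prev = kept.get(taskId);
      if (prev == null) {
        kept.put(taskId, f);
      } else if (Files.size(f) > Files.size(prev)) {
        Files.delete(prev);
        kept.put(taskId, f);
      } else {
        Files.delete(f);
      }
    }

    // 3) Move the de-duplicated directory to its final location.
    Files.move(frozen, specPath);
  }
}
```

   On a filesystem where a directory move is a single rename, steps 1 and 3 are cheap, which is why the removed comment treats blobstores separately.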

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
