snvijaya commented on code in PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#discussion_r1567404776


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/stages/CleanupJobStage.java:
##########
@@ -142,64 +154,93 @@ protected Result executeStage(
     }
 
     Outcome outcome = null;
-    IOException exception;
+    IOException exception = null;
+    boolean baseDirDeleted = false;
 
 
     // to delete.
     LOG.info("{}: Deleting job directory {}", getName(), baseDir);
 
     if (args.deleteTaskAttemptDirsInParallel) {
-      // Attempt to do a parallel delete of task attempt dirs;
-      // don't overreact if a delete fails, but stop trying
-      // to delete the others, and fall back to deleting the
-      // job dir.
-      Path taskSubDir
-          = getStageConfig().getJobAttemptTaskSubDir();
-      try (DurationInfo info = new DurationInfo(LOG,
-          "parallel deletion of task attempts in %s",
-          taskSubDir)) {
-        RemoteIterator<FileStatus> dirs =
-            RemoteIterators.filteringRemoteIterator(
-                listStatusIterator(taskSubDir),
-                FileStatus::isDirectory);
-        TaskPool.foreach(dirs)
-            .executeWith(getIOProcessors())
-            .stopOnFailure()
-            .suppressExceptions(false)
-            .run(this::rmTaskAttemptDir);
-        getIOStatistics().aggregate((retrieveIOStatistics(dirs)));
-
-        if (getLastDeleteException() != null) {
-          // one of the task attempts failed.
-          throw getLastDeleteException();
+
+      // parallel delete of task attempt dirs.
+
+      if (args.parallelDeleteAttemptBaseDeleteFirst) {
+        // attempt to delete the base dir first.
+        // This can reduce ABFS delete load but may time out
+        // (which the fallback to parallel delete will handle).
+        // on GCS it is slow.
+        try (DurationInfo info = new DurationInfo(LOG, true,
+            "Initial delete of %s", baseDir)) {
+          exception = deleteOneDir(baseDir);
+          if (exception == null) {

Review Comment:
   As you note in this new logic, when the directory tree is very large and the 
connection uses OAuth authentication, Azure can fail the baseDir delete because 
of the exhaustive ACL permission checks. However, since that failure surfaces 
as a request timeout, the delete enters the retry loop; in this scenario all of 
the retries are likely to fail as well, and reporting the failure can take a 
long time given the backoff and retry attempts allowed by AZURE_MAX_IO_RETRIES 
(default value 30).
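   
   For a rough sense of scale, here is a back-of-the-envelope worst case under 
assumed defaults (a retry limit of 30 from AZURE_MAX_IO_RETRIES and a per-retry 
backoff cap of roughly 30 seconds; both values are configurable and are 
assumptions here, not taken from this PR):
   
```java
// Illustrative arithmetic only; the real values come from the ABFS configuration.
public class DeleteRetryWorstCase {
  public static void main(String[] args) {
    int maxRetries = 30;            // assumed AZURE_MAX_IO_RETRIES default
    int maxBackoffMillis = 30_000;  // assumed per-retry backoff cap
    long worstCaseMillis = (long) maxRetries * maxBackoffMillis;
    // Roughly 15 minutes of stalled cleanup before the job either fails or,
    // with this PR's fallback, moves on to the parallel task-attempt deletes.
    System.out.printf("worst case ~%d minutes%n", worstCaseMillis / 60_000);
  }
}
```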
   
   The default max retry count is 30 today precisely so that 5-10 minute 
transient network/service failures do not lead to failures of long-running 
workloads.
   
   If this logic of attempting the baseDir delete before falling back to 
parallel deletes is optimal only for the Azure cloud, we could look for ways to 
fail fast on the recursive delete.
   
   Would this work: add a new config MAX_RETRIES_RECURSIVE_DELETE, which by 
default would be the same as AZURE_MAX_IO_RETRIES in the ABFS driver. 
AzureManifestCommitterFactory could then set this config to 0 before the 
FileSystem.get() call happens.
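   
   A minimal sketch of what that could look like (the key name, and the 
assumption that setting it on the job configuration before the destination 
FileSystem is created is sufficient, are both hypothetical; this is not the 
current AzureManifestCommitterFactory implementation):
   
```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.committer.manifest.ManifestCommitter;
import org.apache.hadoop.mapreduce.lib.output.committer.manifest.ManifestCommitterFactory;

/** Sketch only: disable retries for the recursive delete before FS creation. */
public class FailFastManifestCommitterFactory extends ManifestCommitterFactory {

  /** Proposed key; the name is hypothetical. */
  private static final String MAX_RETRIES_RECURSIVE_DELETE =
      "fs.azure.io.retry.recursive.delete.max.retries";

  @Override
  public ManifestCommitter createOutputCommitter(Path outputPath,
      TaskAttemptContext context) throws IOException {
    Configuration conf = context.getConfiguration();
    // Fail fast on the recursive baseDir delete. This has to happen before the
    // destination FileSystem is created/cached, or the already-initialized
    // ABFS configuration will never see the override.
    conf.setInt(MAX_RETRIES_RECURSIVE_DELETE, 0);
    return super.createOutputCommitter(outputPath, context);
  }
}
```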
   
   If this sounds OK, we can look into the changes needed in AbfsClient, 
AbfsRestOperation and ExponentialRetry to make the MAX_RETRIES_RECURSIVE_DELETE 
config effective.
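   
   To make the intent concrete, here is a self-contained illustration with 
hypothetical names (not the actual AbfsRestOperation/ExponentialRetry code): 
with the per-operation ceiling set to 0, the first timeout surfaces immediately 
and CleanupJobStage can fall back to the parallel task-attempt deletes.
   
```java
public class RetryCeilingSketch {

  /** Same shape as an exponential-retry check: retry while under the ceiling. */
  static boolean shouldRetry(int retryCount, int maxRetries) {
    return retryCount < maxRetries;
  }

  public static void main(String[] args) {
    // Normal operations: keep the default ceiling to ride out transient failures.
    System.out.println(shouldRetry(0, 30));  // true  -> retry with backoff
    // Recursive baseDir delete with the proposed config set to 0:
    System.out.println(shouldRetry(0, 0));   // false -> fail fast
  }
}
```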



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
