This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e081f06ea401 [SPARK-47121][CORE] Avoid RejectedExecutionExceptions 
during StandaloneSchedulerBackend shutdown
e081f06ea401 is described below

commit e081f06ea401a2b6b8c214a36126583d35eaf55f
Author: Josh Rosen <joshro...@databricks.com>
AuthorDate: Wed Feb 21 13:35:32 2024 -0800

    [SPARK-47121][CORE] Avoid RejectedExecutionExceptions during 
StandaloneSchedulerBackend shutdown
    
    ### What changes were proposed in this pull request?
    
    This PR adds logic to avoid uncaught `RejectedExecutionException`s while 
`StandaloneSchedulerBackend` is shutting down.
    
    When the backend is shut down, its `stop()` method calls 
`executorDelayRemoveThread.shutdownNow()`. After this point, though, it's 
possible that its `StandaloneDriverEndpoint` might still process 
`onDisconnected` events and those might trigger calls to schedule new tasks on 
the `executorDelayRemoveThread`. This causes uncaught 
`java.util.concurrent.RejectedExecutionException`s to be thrown in RPC threads.
    
    This patch adds a `try-catch` to catch those exceptions and log a short 
warning if the exceptions occur while the scheduler is stopping. This approach 
is consistent with other similar code in Spark, including:
    
    - 
https://github.com/apache/spark/blob/9b53b803998001e4b706666e37e5f86f900a7430/core/src/main/scala/org/apache/spark/scheduler/TaskResultGetter.scala#L160-L163
    - 
https://github.com/apache/spark/blob/9b53b803998001e4b706666e37e5f86f900a7430/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L754-L756
    
    ### Why are the changes needed?
    
    Remove log and exception noise.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    No new tests: it is difficult to reliably reproduce the scenario that leads 
to the log noise.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #45203 from JoshRosen/reduce-scheduler-backend-shutdown-noise.
    
    Authored-by: Josh Rosen <joshro...@databricks.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../spark/scheduler/cluster/StandaloneSchedulerBackend.scala | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git 
a/core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
 
b/core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
index bd1cb164b4be..2150b996f058 100644
--- 
a/core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
+++ 
b/core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
@@ -18,7 +18,7 @@
 package org.apache.spark.scheduler.cluster
 
 import java.util.Locale
-import java.util.concurrent.{Semaphore, TimeUnit}
+import java.util.concurrent.{RejectedExecutionException, Semaphore, TimeUnit}
 import java.util.concurrent.atomic.AtomicBoolean
 
 import scala.concurrent.Future
@@ -343,8 +343,14 @@ private[spark] class StandaloneSchedulerBackend(
             }
           }
         }
-        executorDelayRemoveThread.schedule(removeExecutorTask,
-          _executorRemoveDelay, TimeUnit.MILLISECONDS)
+        try {
+          executorDelayRemoveThread.schedule(removeExecutorTask,
+            _executorRemoveDelay, TimeUnit.MILLISECONDS)
+        } catch {
+          case _: RejectedExecutionException if stopping.get() =>
+            logWarning(
+              "Skipping onDisconnected RemoveExecutor call because the 
scheduler is stopping")
+        }
       }
     }
   }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to