Deng An created SPARK-41483:
-------------------------------

             Summary: MetricsSystem report takes too much time, which may lead to spark application failed on yarn.
                 Key: SPARK-41483
                 URL: https://issues.apache.org/jira/browse/SPARK-41483
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.4.8
            Reporter: Deng An

My issue is similar to SPARK-31625 (https://github.com/apache/spark/pull/28435).

When the shutdown hook does not run to completion (for example, because it times out), the application is never unregistered, so the YARN RM resubmits the application even though it succeeded.

```
22/12/08 09:28:06 INFO ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
22/12/08 09:28:06 INFO SparkContext :Invoking stop() from shut down hook
22/12/08 09:28:06 INFO SparkContext :SparkUI : Stopped Spark web UI at xxx
22/12/08 09:28:16 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
22/12/08 09:28:26 WARN ShutdownHookManager: ShutdownHook 'ClientFinalizer' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
22/12/08 09:28:36 ERROR ShutdownHookManager: ShutdownHookManger shutdown forcefully.
```

From the log, it seems that the SparkContext shutdown hook hangs after the web UI is stopped. The Hadoop ShutdownHookManager then threw a timeout exception and shut down forcefully.

As a result, YARN marked the Spark application as FAILED, because the unregister call in the ApplicationMaster was never executed.
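
To illustrate the failure mode described above, here is a minimal, self-contained Java sketch. It is not the Hadoop or Spark implementation. It only mimics the pattern visible in the stack trace: a hook runs on a separate thread and the caller waits on it with a timed `get`. The class name, the method `runHookWithTimeout`, and the step names `stop-ui`, `report-metrics`, and `unregister` are all hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ShutdownHookTimeoutSketch {

    /**
     * Runs three hypothetical shutdown steps in order on one thread, waits at
     * most timeoutMs for them, and returns the names of the steps that completed.
     */
    static List<String> runHookWithTimeout(long slowStepMs, long timeoutMs) {
        List<String> completed = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newSingleThreadExecutor();

        Future<?> hook = executor.submit(() -> {
            try {
                completed.add("stop-ui");
                // Hypothetical slow step, standing in for a long final metrics report.
                Thread.sleep(slowStepMs);
                completed.add("report-metrics");
                // Only reached if the slow step finished before the hook was cancelled.
                completed.add("unregister");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            // Timed wait, comparable to the FutureTask.get call in the stack trace.
            hook.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller gives up on the hook; the remaining steps never run.
            hook.cancel(true);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }

        try {
            executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        // Slow step exceeds the timeout: "unregister" is never reached.
        System.out.println(runHookWithTimeout(2000, 200));
        // Fast step: all three steps complete.
        System.out.println(runHookWithTimeout(0, 1000));
    }
}
```

In this sketch, the first call returns only `stop-ui`, which corresponds to the situation in the log: the UI is stopped, the hook times out, and the unregister step is skipped. Under that assumption, either bounding the duration of the slow step or moving the unregister step ahead of it would let the application be unregistered.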