Deng An created SPARK-41483:
-------------------------------

             Summary: MetricsSystem report takes too much time, which may lead 
to spark application failed on yarn.
                 Key: SPARK-41483
                 URL: https://issues.apache.org/jira/browse/SPARK-41483
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.4.8
            Reporter: Deng An


My issue is similar to SPARK-31625 
(https://github.com/apache/spark/pull/28435).

In the scenario where the shutdown hook does not run (e.g., because of a 
timeout), the application is not unregistered, resulting in the YARN RM 
resubmitting the application even if it succeeded.

```scala
22/12/08 09:28:06 INFO ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
22/12/08 09:28:06 INFO SparkContext :Invoking stop() from shut down hook 
22/12/08 09:28:06 INFO SparkContext :SparklJI : Stopped Spark web UI at xxx
22/12/08 09:28:16 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
22/12/08 09:28:26 WARN ShutdownHookManager: ShutdownHook 'ClientFinalizer' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67)
22/12/08 09:28:36 ERROR ShutdownHookManager: ShutdownHookManger shutdown forcefully.
```

From the log, it seems that the SparkContext shutdown hook hangs after the UI 
is stopped. Eventually, the Hadoop ShutdownHookManager reports a timeout 
exception and shuts down forcefully.

This eventually caused YARN to mark the Spark application as FAILED, because 
the unregister step in the ApplicationMaster was never executed.
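
To illustrate the failure mode described above, here is a minimal sketch of 
one possible mitigation: bounding a potentially slow shutdown step with a 
timeout so that later cleanup (such as unregistering) is still reached. This 
is not Spark's or Hadoop's actual code. The class BoundedShutdownStep, the 
method runWithTimeout, and the simulated slow report are all hypothetical 
names used only for illustration, and the timeout values are arbitrary.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of Spark or Hadoop.
class BoundedShutdownStep {

    /**
     * Runs the given step on a daemon thread and waits at most timeoutMillis.
     * Returns true if the step completed in time, false if it timed out,
     * threw an exception, or the waiting thread was interrupted.
     */
    static boolean runWithTimeout(Runnable step, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-shutdown-step");
            t.setDaemon(true); // a stuck step must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(step);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow step and move on
            return false;
        } catch (ExecutionException e) {
            return false; // the step itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Assumption: stands in for a slow final metrics report.
        Runnable slowReport = () -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException ignored) {
                // cancelled by the timeout
            }
        };
        boolean reported = runWithTimeout(slowReport, 200);
        // Whatever cleanup follows is reached whether or not the report finished.
        System.out.println("report finished in time: " + reported);
    }
}
```

The point of the sketch is only that the slow step consumes a bounded share of 
the time the shutdown hook is allowed, rather than all of it. Whether the real 
fix should use a timeout, reorder the steps, or skip the final report is a 
design decision for the actual patch.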

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
