[
https://issues.apache.org/jira/browse/SPARK-48547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kent Yao resolved SPARK-48547.
------------------------------
Fix Version/s: 4.1.0
Resolution: Fixed
Issue resolved by pull request 52091
[https://github.com/apache/spark/pull/52091]
> Add opt-in flag to have SparkSubmit automatically call System.exit after user
> code main method exits
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-48547
> URL: https://issues.apache.org/jira/browse/SPARK-48547
> Project: Spark
> Issue Type: Improvement
> Components: Deploy
> Affects Versions: 4.0.0
> Reporter: Josh Rosen
> Assignee: Josh Rosen
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> This PR proposes to add a new flag, `spark.submit.callSystemExitOnMainExit`
> (default false), which when true will instruct SparkSubmit to call
> System.exit() in the JVM once the user code's main method has exited (for
> Java / Scala jobs) or once the user's Python or R script has exited.
> This is intended to address a longstanding issue where SparkSubmit
> invocations might hang after user code has completed:
> [According to Java’s java.lang.Runtime
> docs|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Runtime.html#shutdown]:
> {quote}The Java Virtual Machine initiates the _shutdown sequence_ in response
> to one of several events:
> # when the number of
> [live|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#isAlive()]
> non-daemon threads drops to zero for the first time (see note below on the
> JNI Invocation API);
> # when the {{Runtime.exit}} or {{System.exit}} method is called for the
> first time; or
> # when some external event occurs, such as an interrupt or a signal is
> received from the operating system.{quote}
> For Python and R programs, SparkSubmit’s PythonRunner and RRunner will call
> {{System.exit()}} if the user program exits with a non-zero exit code (see
> [python|https://github.com/apache/spark/blob/d5c33c6bfb5757b243fc8e1734daeaa4fe3b9b32/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala#L101-L104]
> and
> [R|https://github.com/apache/spark/blob/d5c33c6bfb5757b243fc8e1734daeaa4fe3b9b32/core/src/main/scala/org/apache/spark/deploy/RRunner.scala#L109-L111]
> runner code).
> But for Java and Scala programs, plus any _successful_ R or Python programs,
> Spark will _not_ automatically call System.exit.
> In those situations, the JVM will only shut down when, via event (1), all
> non-[daemon|https://stackoverflow.com/questions/2213340/what-is-a-daemon-thread-in-java]
> threads have exited (unless the job is cancelled and sent an external
> interrupt / kill signal, corresponding to event (3)).
> Thus, *non-daemon* threads might cause logically-completed spark-submit jobs
> to hang rather than completing.
> The non-daemon threads are not always under Spark's own control and may not
> necessarily be cleaned up by SparkContext.stop().
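The hang mechanism described above can be seen in a minimal standalone Java sketch (illustrative only, not Spark code; the class and thread names are made up):

```java
// Demonstrates JVM shutdown event (1): after main() returns, the JVM
// stays alive until every non-daemon thread has exited.
public class NonDaemonDemo {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            try {
                // Stand-in for a lingering thread-pool or connection thread
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("worker finished");
        });
        worker.setDaemon(false); // non-daemon is the default; shown for emphasis
        worker.start();
        System.out.println("main exiting");
        // main returns here, but the JVM waits for `worker` before shutting
        // down. If `worker` looped forever, a spark-submit job would appear
        // to hang after user code completed; calling System.exit(0) here
        // would instead start the shutdown sequence immediately (event (2)).
    }
}
```

With the finite sleep above, the process typically prints "main exiting" and then, about 100 ms later, "worker finished" before exiting; replacing the sleep with an infinite loop reproduces the hang this issue targets.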
> Thus, it is useful to have opt-in functionality that makes SparkSubmit
> automatically call `System.exit()` upon main method exit (which usually, but
> not always, corresponds to job completion): this option allows users and
> data platform operators to enforce System.exit() calls without having to
> modify individual jobs' code.
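A hedged sketch of how the flag would be enabled at submission time (the flag name and its false default come from this issue; the class and jar names are placeholders):

```shell
# Opt in to an automatic System.exit() once the user main method returns,
# so lingering non-daemon threads cannot keep the driver JVM alive.
spark-submit \
  --class com.example.MyApp \
  --conf spark.submit.callSystemExitOnMainExit=true \
  my-app.jar
```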
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]