[ 
https://issues.apache.org/jira/browse/SPARK-48547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao resolved SPARK-48547.
------------------------------
    Fix Version/s: 4.1.0
       Resolution: Fixed

Issue resolved by pull request 52091
[https://github.com/apache/spark/pull/52091]

> Add opt-in flag to have SparkSubmit automatically call System.exit after user 
> code main method exits
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-48547
>                 URL: https://issues.apache.org/jira/browse/SPARK-48547
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 4.0.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> This PR proposes to add a new flag, {{spark.submit.callSystemExitOnMainExit}} 
> (default false), which, when true, instructs SparkSubmit to call 
> {{System.exit()}} in the JVM once the user code's main method has exited (for 
> Java / Scala jobs) or once the user's Python or R script has exited.
> This is intended to address a longstanding issue where SparkSubmit 
> invocations might hang after user code has completed:
> [According to Java’s java.lang.Runtime 
> docs|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Runtime.html#shutdown]:
> {quote}The Java Virtual Machine initiates the _shutdown sequence_ in response 
> to one of several events:
>  # when the number of 
> [live|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#isAlive()]
>  non-daemon threads drops to zero for the first time (see note below on the 
> JNI Invocation API);
>  # when the {{Runtime.exit}} or {{System.exit}} method is called for the 
> first time; or
>  # when some external event occurs, such as an interrupt or a signal is 
> received from the operating system.{quote}
> For Python and R programs, SparkSubmit’s PythonRunner and RRunner will call 
> {{System.exit()}} if the user program exits with a non-zero exit code (see 
> [python|https://github.com/apache/spark/blob/d5c33c6bfb5757b243fc8e1734daeaa4fe3b9b32/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala#L101-L104]
>  and 
> [R|https://github.com/apache/spark/blob/d5c33c6bfb5757b243fc8e1734daeaa4fe3b9b32/core/src/main/scala/org/apache/spark/deploy/RRunner.scala#L109-L111]
>  runner code).
> But for Java and Scala programs, and for any R or Python programs that exit 
> _successfully_, Spark will _not_ automatically call System.exit.
> In those situations, the JVM will only shut down when, per event (1), all 
> non-[daemon|https://stackoverflow.com/questions/2213340/what-is-a-daemon-thread-in-java]
>  threads have exited (unless the job is cancelled and sent an external 
> interrupt / kill signal, corresponding to event (3)).
> Thus, *non-daemon* threads might cause logically-completed spark-submit jobs 
> to hang rather than complete.
> Those non-daemon threads are not always under Spark's own control and may not 
> necessarily be cleaned up by SparkContext.stop().
> It is therefore useful to have opt-in functionality for SparkSubmit to 
> automatically call {{System.exit()}} upon main method exit (which usually, but 
> not always, corresponds to job completion): this option allows users and data 
> platform operators to enforce System.exit() calls without having to modify 
> individual jobs' code.
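The hang the issue describes can be reproduced in a few lines of plain Java. A minimal sketch (the class name, helper method, thread name, and sleep duration below are illustrative, not from the issue or the PR): a non-daemon thread started by user code keeps the JVM alive after main() returns (shutdown event 1), and an explicit System.exit() call, which is what the new flag makes SparkSubmit issue on the user's behalf, forces shutdown immediately (event 2).

```java
public class MainExitDemo {

    // Hypothetical stand-in for a thread pool or listener thread that user
    // code (or a library it uses) started and that SparkContext.stop() does
    // not clean up.
    static Thread startLingeringThread(long sleepMillis) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ignored) {
                // this thread only simulates lingering background work
            }
        }, "lingering-non-daemon");
        t.setDaemon(false); // non-daemon: shutdown event (1) waits for it
        t.start();
        return t;
    }

    public static void main(String[] args) {
        startLingeringThread(60_000L);
        System.out.println("main() is returning now");
        // Without the next line, the JVM would idle for ~60s waiting on the
        // lingering thread. With spark.submit.callSystemExitOnMainExit=true,
        // SparkSubmit would issue the equivalent call after user main() exits:
        System.exit(0);
    }
}
```

Running the class as written exits promptly; commenting out the System.exit(0) call makes the process linger for a minute after main() returns, which is the spark-submit hang in miniature.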



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
