[ https://issues.apache.org/jira/browse/SPARK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483625#comment-14483625 ]

Michal Klos commented on SPARK-4783:
------------------------------------

We are running into this exact issue. We have a driver application that has 
responsibilities beyond submitting Spark work, and we don't want it to die if 
there is a problem with the cluster. The cluster can be recovered, or a new 
one can be spun up with the same DNS, and we can start fresh with a new 
context. In the meantime, we want the driver app to keep running and carry on 
its other business.
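
For illustration, here is a minimal sketch of the driver loop we would like 
to be able to write, assuming the scheduler surfaced cluster failure as an 
exception rather than calling System.exit. runJobs and recoverCluster are 
hypothetical stand-ins for our own logic, not anything in Spark:

import org.apache.spark.{SparkConf, SparkContext, SparkException}

object Gateway {
  // Stand-in for our real Spark work; may throw if the cluster dies.
  def runJobs(sc: SparkContext): Unit = {
    sc.parallelize(1 to 100).count()
  }

  // Stand-in for our recovery logic: decide whether to try again.
  def recoverCluster(e: Throwable): Boolean = true

  def main(args: Array[String]): Unit = {
    var keepRunning = true
    while (keepRunning) {
      val sc = new SparkContext(new SparkConf().setAppName("gateway"))
      try {
        runJobs(sc)
        keepRunning = false // finished cleanly
      } catch {
        // Only reachable if the scheduler throws instead of exiting
        case e: SparkException => keepRunning = recoverCluster(e)
      } finally {
        sc.stop() // release the old context before spinning up a new one
      }
    }
  }
}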

Specifically, the exit that bites us is this one:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L409-L411

We are considering patching it out, but we are not sure whether that would 
cause other problems, or whether there is a good reason this exit hasn't been 
removed yet.
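
The patch we have in mind is small: keep the log line but throw instead of 
exiting, so the embedding application can catch the failure. Roughly (a 
sketch against TaskSchedulerImpl.error, not a tested change; whether anything 
in Spark depends on the process dying here is exactly what we are unsure 
about):

// Inside TaskSchedulerImpl.error(message: String), instead of:
//   logError("Exiting due to error from cluster scheduler: " + message)
//   System.exit(1)
// ...throw, so the driver JVM survives and the caller decides what to do:
logError("Error from cluster scheduler: " + message)
throw new SparkException("Fatal error from cluster scheduler: " + message)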

> System.exit() calls in SparkContext disrupt applications embedding Spark
> ------------------------------------------------------------------------
>
>                 Key: SPARK-4783
>                 URL: https://issues.apache.org/jira/browse/SPARK-4783
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: David Semeria
>
> A common architectural choice for integrating Spark within a larger 
> application is to employ a gateway to handle Spark jobs. The gateway is a 
> server which contains one or more long-running SparkContexts.
> A typical server is created with the following pseudocode:
> var keepRunning = true
> while (keepRunning) {
>   try {
>     server.run()
>   } catch (e) {
>     keepRunning = log_and_examine_error(e)
>   }
> }
> The problem is that SparkContext frequently calls System.exit when it 
> encounters a problem, which means the server can only be re-spawned at the 
> process level. That is much messier than the simple loop above.
> Therefore, I believe it makes sense to replace all System.exit calls in 
> SparkContext with the throwing of a fatal error.


