Aaron Davidson created SPARK-2949:
-------------------------------------

             Summary: SparkContext does not fate-share with ActorSystem
                 Key: SPARK-2949
                 URL: https://issues.apache.org/jira/browse/SPARK-2949
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Aaron Davidson


It appears that an uncaught fatal error in Spark's driver ActorSystem does not 
cause the SparkContext to terminate. We observed an issue in production that 
caused a PermGen error, but instead of shutting down, the driver just kept 
logging this error:

{code}
14/08/09 15:07:24 ERROR ActorSystemImpl: Uncaught fatal error from thread [spark-akka.actor.default-dispatcher-26] shutting down ActorSystem [spark]
java.lang.OutOfMemoryError: PermGen space
{code}

We should probably do something similar to what we did in the DAGScheduler and 
ensure that we call SparkContext#stop() if the entire ActorSystem dies with a 
fatal error.
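
As a rough illustration of the fate-sharing idea (not Spark's actual code), the 
sketch below uses two hypothetical stand-in classes, StubActorSystem and 
StubContext, built only on the Java standard library. It assumes the actor 
system can run a callback when it terminates, and that the context registers 
its own stop() there so both go down together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for an actor system that can die from a fatal error. */
final class StubActorSystem {
    private final List<Runnable> terminationHooks = new ArrayList<>();
    private final AtomicBoolean terminated = new AtomicBoolean(false);

    /** Registers a callback to run when the system terminates. */
    synchronized void registerOnTermination(Runnable hook) {
        terminationHooks.add(hook);
    }

    /** Simulates the system shutting itself down after an uncaught fatal error. */
    void failWith(Throwable fatal) {
        // Only the first failure triggers the hooks; later calls are no-ops.
        if (terminated.compareAndSet(false, true)) {
            List<Runnable> hooks;
            synchronized (this) {
                hooks = new ArrayList<>(terminationHooks);
            }
            for (Runnable hook : hooks) {
                hook.run();
            }
        }
    }

    boolean isTerminated() {
        return terminated.get();
    }
}

/** Hypothetical stand-in for the context that owns the actor system. */
final class StubContext {
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    StubContext(StubActorSystem system) {
        // Fate-sharing: if the actor system dies, stop the context as well.
        system.registerOnTermination(this::stop);
    }

    /** Idempotent stop, safe to call from the hook and from user code. */
    void stop() {
        if (stopped.compareAndSet(false, true)) {
            System.out.println("context stopped");
        }
    }

    boolean isStopped() {
        return stopped.get();
    }
}
```

Making stop() idempotent matters here because the hook and a normal user 
shutdown may both call it. Akka's ActorSystem offers a registerOnTermination 
callback that could play the role of the hook; whether that is the right place 
to wire this into Spark is an assumption of this sketch.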



--
This message was sent by Atlassian JIRA
(v6.2#6252)
