Herman Schistad created SPARK-13198:
---------------------------------------

             Summary: sc.stop() does not clean up on driver, causes Java heap OOM.
                 Key: SPARK-13198
                 URL: https://issues.apache.org/jira/browse/SPARK-13198
             Project: Spark
          Issue Type: Bug
          Components: Mesos
    Affects Versions: 1.6.0
            Reporter: Herman Schistad


When starting and stopping multiple SparkContexts sequentially, the driver eventually fails with an "io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Java heap space" error.

Reproduce by running the following code, loading ~7 MB of Parquet data on each iteration. The driver heap size is left unchanged and thus defaults to 1 GB:

{code:java}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object Repro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("MASTER_URL").setAppName("")
    conf.set("spark.mesos.coarse", "true")
    conf.set("spark.cores.max", "10")

    for (i <- 1 until 100) {
      // A fresh SparkContext (and SQLContext) on every iteration.
      val sc = new SparkContext(conf)
      val sqlContext = new SQLContext(sc)

      val events = sqlContext.read.parquet("hdfs://localhost/tmp/something")
      println(s"Context ($i), number of events: ${events.count}")
      sc.stop()
    }
  }
}
{code}
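
To make the per-iteration growth visible, one can log the approximate used driver heap after each sc.stop(). This is an optional instrumentation sketch (my addition, not part of the original run; `usedHeapMB` is a hypothetical helper):

{code:java}
// Hypothetical instrumentation, not part of the original repro:
// approximate used driver heap, logged after each sc.stop().
def usedHeapMB(): Long = {
  val rt = Runtime.getRuntime
  System.gc() // best effort, so the upward trend is easier to see
  (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024)
}

// Inside the loop, right after sc.stop():
// println(s"Context ($i), used heap after stop: ${usedHeapMB()} MB")
{code}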

The heap space fills up within 20 iterations on my cluster. Increasing the number of cores (spark.cores.max) to 50 in the above example results in the heap space error after only 12 contexts.
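
The 50-core variant differs from the reproduction code above only in this setting:

{code:java}
conf.set("spark.cores.max", "50") // was "10" in the repro above
{code}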

Dumping the heap reveals many equally sized `CoarseMesosSchedulerBackend` objects (see attachments). Digging into the inner objects shows that the `executorDataMap` holds ~99% of the data in each such object. That detail may be beside the point, though, as I'd expect the whole object to be garbage collected (or otherwise freed) on sc.stop().
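
To make the suspicion concrete, here is a minimal, hypothetical sketch of the retention pattern I have in mind (the names mirror `CoarseMesosSchedulerBackend`/`executorDataMap`, but this is NOT Spark source): if the backend registers itself with anything that outlives the context and stop() never deregisters it, one full-sized backend stays reachable per context.

{code:java}
import scala.collection.mutable

// Hypothetical sketch of the suspected retention pattern; NOT Spark code.
object Registry {
  // Anything reachable from here outlives every individual context.
  val liveBackends = mutable.ListBuffer.empty[Backend]
}

class Backend {
  // Stands in for executorDataMap: the per-executor bookkeeping that
  // dominates the retained size of each backend in the heap dump.
  val executorDataMap = mutable.HashMap.empty[String, Array[Byte]]

  def start(): Unit = {
    Registry.liveBackends += this
    // 64 KB per entry so the demo completes; scale up to force an OOM.
    for (i <- 1 to 10) executorDataMap(s"executor-$i") = new Array[Byte](1 << 16)
  }

  def stop(): Unit = {
    // Missing cleanup: neither executorDataMap.clear() nor
    // Registry.liveBackends -= this. The backend (and its map) stays
    // reachable, so one "equally sized" object accumulates per context.
  }
}

object LeakDemo extends App {
  for (i <- 1 to 100) { val b = new Backend; b.start(); b.stop() }
  println(s"backends still reachable: ${Registry.liveBackends.size}")
}
{code}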

Additionally, I can see in the Spark web UI that each time a new context is created the number in the "SQL" tab increments by one (i.e. the last iteration would show "SQL99"). After stopping and creating a completely new context I would expect this number to reset to "SQL" again.

I'm submitting the jar file with `spark-submit` and no special flags. The cluster is running Mesos 0.23 and Spark 1.6.0.


