[ https://issues.apache.org/jira/browse/SPARK-13198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136774#comment-15136774 ]
Herman Schistad commented on SPARK-13198:
-----------------------------------------

Digging further into the dumped heap and running a memory-leak report (using Eclipse Memory Analyzer) I'm seeing the following result: !Screen Shot 2016-02-08 at 10.03.04.png|width=400!

> sc.stop() does not clean up on driver, causes Java heap OOM.
> ------------------------------------------------------------
>
>                 Key: SPARK-13198
>                 URL: https://issues.apache.org/jira/browse/SPARK-13198
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 1.6.0
>            Reporter: Herman Schistad
>         Attachments: Screen Shot 2016-02-04 at 16.31.28.png, Screen Shot 2016-02-04 at 16.31.40.png, Screen Shot 2016-02-04 at 16.31.51.png, Screen Shot 2016-02-08 at 09.30.59.png, Screen Shot 2016-02-08 at 09.31.10.png, Screen Shot 2016-02-08 at 10.03.04.png, gc.log
>
> When starting and stopping multiple SparkContexts sequentially, the driver eventually fails with an "io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Java heap space" error.
> Reproduce by running the following code, loading ~7 MB of Parquet data on each iteration. The driver heap size is left unchanged and thus defaults to 1 GB:
> {code:java}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
>
> object Reproducer {
>   def main(args: Array[String]) {
>     val conf = new SparkConf().setMaster("MASTER_URL").setAppName("")
>     conf.set("spark.mesos.coarse", "true")
>     conf.set("spark.cores.max", "10")
>     for (i <- 1 until 100) {
>       // A fresh context per iteration, stopped again at the end of the loop body.
>       val sc = new SparkContext(conf)
>       val sqlContext = new SQLContext(sc)
>       val events = sqlContext.read.parquet("hdfs://localhost/tmp/something")
>       println(s"Context ($i), number of events: " + events.count)
>       sc.stop()
>     }
>   }
> }
> {code}
> The heap space fills up within 20 iterations on my cluster. Increasing spark.cores.max to 50 in the above example produces the heap space error after only 12 contexts.
> Dumping the heap reveals many equally sized "CoarseMesosSchedulerBackend" objects (see attachments). Digging into those objects shows that the `executorDataMap` holds 99% of each object's retained data. I believe that is beside the point, though, as I'd expect the whole object to be garbage collected or freed on sc.stop(). (Two sketches follow at the end of this description: one for capturing comparable heap dumps in-process, and one for a possible workaround.)
> Additionally, I can see in the Spark web UI that each time a new context is created, the number on the "SQL" tab increments by one (i.e. the last iteration would show "SQL99"). After stopping a context and creating a completely new one, I was expecting this number to reset to 1 (plain "SQL").
> I'm submitting the jar file with `spark-submit` and no special flags. The cluster is running Mesos 0.23, and I'm running Spark 1.6.0.
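> For reference, heap dumps comparable to the attached ones can also be captured in-process via the JVM's standard HotSpotDiagnosticMXBean, which makes it easy to snapshot the driver right after each sc.stop(). This is only a minimal sketch: the helper name HeapDumper and the output path are illustrative, not part of the original reproduction.
> {code:java}
> import java.lang.management.ManagementFactory
> import com.sun.management.HotSpotDiagnosticMXBean
>
> // Illustrative helper, not part of the original report.
> object HeapDumper {
>   // Writes a .hprof snapshot of the current JVM heap. With live = true,
>   // only strongly reachable objects are dumped, which is what a leak
>   // report in Eclipse Memory Analyzer wants to see.
>   def dump(path: String, live: Boolean = true): Unit = {
>     val mxBean = ManagementFactory.newPlatformMXBeanProxy(
>       ManagementFactory.getPlatformMBeanServer,
>       "com.sun.management:type=HotSpotDiagnostic",
>       classOf[HotSpotDiagnosticMXBean])
>     mxBean.dumpHeap(path, live) // fails if the target file already exists
>   }
> }
> {code}
> Calling HeapDumper.dump(s"/tmp/heap-after-$i.hprof") right after each sc.stop() in the loop above would show whether CoarseMesosSchedulerBackend instances accumulate across iterations.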
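> As a possible workaround until the underlying leak is fixed (a sketch only, and it assumes the application does not actually need a fresh context per iteration): reuse one long-lived SparkContext, so that only a single CoarseMesosSchedulerBackend ever exists on the driver.
> {code:java}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
>
> object SingleContextWorkaround {
>   def main(args: Array[String]) {
>     val conf = new SparkConf().setMaster("MASTER_URL").setAppName("")
>     conf.set("spark.mesos.coarse", "true")
>     conf.set("spark.cores.max", "10")
>     // One context for the whole run; only the per-iteration work repeats.
>     val sc = new SparkContext(conf)
>     val sqlContext = new SQLContext(sc)
>     for (i <- 1 until 100) {
>       val events = sqlContext.read.parquet("hdfs://localhost/tmp/something")
>       println(s"Iteration ($i), number of events: " + events.count)
>     }
>     sc.stop()
>   }
> }
> {code}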