[ https://issues.apache.org/jira/browse/SPARK-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012292#comment-15012292 ]
Yin Huai commented on SPARK-11700:
----------------------------------

[~zsxwing] take a look?

> Memory leak at SparkContext jobProgressListener stageIdToData map
> -----------------------------------------------------------------
>
>                 Key: SPARK-11700
>                 URL: https://issues.apache.org/jira/browse/SPARK-11700
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2
>         Environment: Ubuntu 14.04 LTS, Oracle JDK 1.8.51, Apache Tomcat 8.0.28, Spring 4
>            Reporter: Kostas papageorgopoulos
>            Priority: Critical
>              Labels: leak, memory-leak
>         Attachments: AbstractSparkJobRunner.java, SparkContextPossibleMemoryLeakIDEA_DEBUG.png, SparkHeapSpaceProgress.png, SparkMemoryAfterLotsOfConsecutiveRuns.png, SparkMemoryLeakAfterLotsOfRunsWithinTheSameContext.png
>
> It seems that there is a SparkContext jobProgressListener memory leak. Below I describe the steps I take to reproduce it.
> I have created a Java webapp that runs Spark SQL jobs which read data from HDFS, join it, and write the result to Elasticsearch using the ES-Hadoop connector. After a lot of consecutive runs I noticed that my heap space was full, so I got an out-of-heap-space error.
> In the attached file {code}AbstractSparkJobRunner{code}, the method {code}public final void run(T jobConfiguration, ExecutionLog executionLog) throws Exception{code} runs each time a Spark SQL job is triggered, so I tried to reuse the same SparkContext for a number of consecutive runs. If certain rules apply, I try to clean up the SparkContext by first calling {code}killSparkAndSqlContext{code}, which eventually runs:
> {code}
> synchronized (sparkContextThreadLock) {
>     if (javaSparkContext != null) {
>         LOGGER.info("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! CLEARING SPARK CONTEXT!!!!!!!!!!!!!!!!!!!!!!!!!!!");
>         javaSparkContext.stop();
>         javaSparkContext = null;
>         sqlContext = null;
>         System.gc();
>     }
>     numberOfRunningJobsForSparkContext.getAndSet(0);
> }
> {code}
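The cleanup above stops the context, but it does not bound what the UI's progress listener retains while a context is alive. One possible workaround (a sketch, assuming the standard {code}spark.ui.retainedJobs{code} and {code}spark.ui.retainedStages{code} properties, which in Spark 1.x cap how many completed jobs/stages the listener keeps; both default to 1000) is to lower the retention limits, e.g. in {code}spark-defaults.conf{code}:

{code}
# Cap how many finished jobs/stages the UI listener keeps in memory.
# Lower values shrink jobProgressListener's stageIdToData map at the
# cost of less history in the web UI. The values below are illustrative.
spark.ui.retainedJobs    100
spark.ui.retainedStages  100
{code}

This only limits growth within a live context; it would not explain retention after {code}stop(){code}, which is what this report is about.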
> So at some point in time, when no other Spark SQL job should run, I kill the SparkContext (AbstractSparkJobRunner.killSparkAndSqlContext runs) and expect it to be garbage collected. However, this is not the case, even though my debugger shows that my JavaSparkContext object is null; see the attached picture {code}SparkContextPossibleMemoryLeakIDEA_DEBUG.png{code}.
> JVisualVM shows heap usage growing incrementally even when the garbage collector is called; see the attached picture {code}SparkHeapSpaceProgress.png{code}.
> The Memory Analyzer Tool shows that a big part of the retained heap is assigned to _jobProgressListener; see the attached picture {code}SparkMemoryAfterLotsOfConsecutiveRuns.png{code} and the summary picture {code}SparkMemoryLeakAfterLotsOfRunsWithinTheSameContext.png{code}. Yet at the same time the JavaSparkContext in the singleton service is null.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org