Re: [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1
Thanks for reporting it, Terry. I submitted a PR to fix it: https://github.com/apache/spark/pull/9132

Best Regards,
Shixiong Zhu

2015-10-15 2:39 GMT+08:00 Reynold Xin:
> +dev list
>
> On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo wrote:
>> All,
>>
>> Does anyone meet a memory leak issue with Spark Streaming and Spark SQL
>> in Spark 1.5.1? [...]
Re: [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1
+dev list

On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo wrote:
> All,
>
> Does anyone meet a memory leak issue with Spark Streaming and Spark SQL
> in Spark 1.5.1? I can see the memory increasing all the time when running
> this simple sample: [...]
[SQL] Memory leak with spark streaming and spark sql in spark 1.5.1
All,

Does anyone meet a memory leak issue with Spark Streaming and Spark SQL in Spark 1.5.1? I can see the memory increasing all the time when running this simple sample:

    val sc = new SparkContext(conf)
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    val ssc = new StreamingContext(sc, Seconds(1))
    val s1 = ssc.socketTextStream("localhost", )  // port elided in the original post
      .map(x => (x, 1))
      .reduceByKey((x: Int, y: Int) => x + y)
    s1.print
    s1.foreachRDD(rdd => {
      rdd.foreach(_ => Unit)
      sqlContext.createDataFrame(rdd).registerTempTable("A")
      sqlContext.sql("""select * from A""").show(1)
    })

After dumping the Java heap, I can see about 22K entries in SQLListener._stageIdToStageMetrics after 2 hours of running (the other maps in this SQLListener have about 1K entries). Is this a leak in SQLListener?

Thanks!
Terry
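[Editor's note] The symptom above fits a common listener pitfall: a streaming job triggers one SQL execution per batch, so a per-stage metrics map that is never pruned grows without bound. The sketch below is plain Scala, not Spark's actual SQLListener code; the names (record, leaky, bounded, capacity) are hypothetical and illustrate only the general pattern of an ever-growing map versus one that evicts old entries, which is the kind of cleanup a fix would need to perform.

```scala
import scala.collection.mutable

object ListenerLeakSketch {
  // Hypothetical stand-in for a map like SQLListener._stageIdToStageMetrics
  // that only ever accumulates: one entry per stage, never removed.
  val leaky = mutable.HashMap[Int, String]()

  // Bounded variant: cap the map and evict the oldest stages once the
  // cap is exceeded, so long-running streaming jobs stay flat on memory.
  val capacity = 1000
  val bounded = mutable.LinkedHashMap[Int, String]()

  def record(stageId: Int, metrics: String): Unit = {
    leaky(stageId) = metrics        // grows forever in a streaming job
    bounded(stageId) = metrics
    while (bounded.size > capacity) // drop the oldest entry beyond the cap
      bounded -= bounded.head._1
  }

  def main(args: Array[String]): Unit = {
    // Simulate 5000 completed stages, as a 2-hour streaming job might produce.
    (1 to 5000).foreach(id => record(id, s"metrics-$id"))
    println(s"leaky=${leaky.size} bounded=${bounded.size}")
  }
}
```

Under this sketch, the leaky map ends up with one entry per stage ever seen, matching the ~22K entries Terry observed in the heap dump, while the bounded map stays at its cap.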