[jira] [Commented] (SPARK-6844) Memory leak occurs when register temp table with cache table on

2015-04-23 Thread Jack Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508597#comment-14508597
 ] 

Jack Hu commented on SPARK-6844:


Hi, [~marmbrus]
I mean a 1.3.x release, such as 1.3.2. Master does not seem to differ much 
from branch-1.3 (though I may be wrong).

> Memory leak occurs when register temp table with cache table on
> ---
>
> Key: SPARK-6844
> URL: https://issues.apache.org/jira/browse/SPARK-6844
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.3.0
>Reporter: Jack Hu
>  Labels: Memory, SQL
> Fix For: 1.4.0
>
>
> There is a memory leak when registering a temp table with caching enabled.
> Simple code to reproduce the issue:
> {code}
> val sparkConf = new SparkConf().setAppName("LeakTest")
> val sparkContext = new SparkContext(sparkConf)
> val sqlContext = new SQLContext(sparkContext)
> val tableName = "tmp"
> val jsonrdd = sparkContext.textFile("""sample.json""")
> var loopCount = 1L
> while(true) {
>   sqlContext.jsonRDD(jsonrdd).registerTempTable(tableName)
>   sqlContext.cacheTable(tableName)
>   println("L: " + loopCount + " R: " +
>     sqlContext.sql("""select count(*) from tmp""").count())
>   sqlContext.uncacheTable(tableName)
>   loopCount += 1
> }
> {code}
> The cause is that {{InMemoryRelation}} and {{InMemoryColumnarTableScan}} use 
> accumulators ({{InMemoryRelation.batchStats}}, 
> {{InMemoryColumnarTableScan.readPartitions}}, 
> {{InMemoryColumnarTableScan.readBatches}}) to collect information from 
> partitions or for tests. These accumulators register themselves into a static 
> map in {{Accumulators.originals}} and never get cleaned up.
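The leak described above can be sketched with a simplified, hypothetical registry (this is not the actual Spark code; names like `Registry` and `FakeAccumulator` are illustrative): each accumulator-like object registers itself into a static map on construction, so unless something removes the entry, the map grows on every loop iteration and the objects can never be garbage collected.

```scala
import scala.collection.mutable

// Hypothetical sketch of the failure mode, not the real Spark internals:
// a static registry holding a strong reference to every object ever created,
// mirroring how Accumulators.originals retained every accumulator.
object Registry {
  val originals = mutable.Map[Long, AnyRef]()
  private var nextId = 0L
  def register(a: AnyRef): Long = synchronized {
    nextId += 1
    originals(nextId) = a // strong reference: never GC'd unless removed
    nextId
  }
  def remove(id: Long): Unit = synchronized { originals -= id }
}

class FakeAccumulator {
  val id: Long = Registry.register(this)
}

// Each iteration creates a fresh accumulator, as each loop of the
// reproduction creates fresh cached-relation accumulators. Without an
// explicit Registry.remove, the static map would grow by one entry per
// iteration; with cleanup it stays empty.
for (_ <- 1 to 1000) {
  val acc = new FakeAccumulator
  Registry.remove(acc.id)
}
println(Registry.originals.size) // 0 with cleanup; 1000 without
```

The actual fix merged for 1.4.0 (see the pull request linked later in this thread) addresses this inside Spark's internals; the sketch only illustrates why a long-running register/cache/uncache loop leaks when registry entries are never removed.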



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6844) Memory leak occurs when register temp table with cache table on

2015-04-16 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498587#comment-14498587
 ] 

Michael Armbrust commented on SPARK-6844:
-

I was not planning to.  I do not think that it is a regression from 1.2 and it 
is a little risky to backport changes to the way we initialize cached relations.




[jira] [Commented] (SPARK-6844) Memory leak occurs when register temp table with cache table on

2015-04-15 Thread Jack Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497627#comment-14497627
 ] 

Jack Hu commented on SPARK-6844:


Hi, [~marmbrus]

Do we have a plan to port this fix to the 1.3.x branch?





[jira] [Commented] (SPARK-6844) Memory leak occurs when register temp table with cache table on

2015-04-11 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491166#comment-14491166
 ] 

Apache Spark commented on SPARK-6844:
-

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/5475
