Spark SQL question: is cached SchemaRDD storage controlled by spark.storage.memoryFraction?

2014-09-26 Thread Haopu Wang
Hi, I'm querying a big table using Spark SQL. I see very long GC times in some stages. I wonder if I can improve this by tuning the storage parameter. The question is: the SchemaRDD has been cached with the cacheTable() function. So is the cached SchemaRDD part of memory storage controlled by the

Re: Spark SQL question: is cached SchemaRDD storage controlled by spark.storage.memoryFraction?

2014-09-26 Thread Cheng Lian
Yes it is. The in-memory storage used with |SchemaRDD| also uses |RDD.cache()| under the hood.

On 9/26/14 4:04 PM, Haopu Wang wrote:
Hi, I'm querying a big table using Spark SQL. I see very long GC time in some stages. I wonder if I can improve it by tuning the storage parameter. The
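
Since the cached table goes through the same block storage as |RDD.cache()|, its memory budget can be estimated from the storage settings. The sketch below is only back-of-the-envelope arithmetic in Python, not Spark code. It assumes the Spark 1.x defaults of 0.6 for spark.storage.memoryFraction and 0.9 for spark.storage.safetyFraction, and the 8 GiB executor heap and the function name storage_budget_bytes are hypothetical values chosen for illustration. The heap the JVM actually reports is usually a little smaller than the configured size, so treat the result as an upper bound.

```python
# Rough estimate of how much executor heap Spark 1.x reserves for cached blocks
# (including tables cached with cacheTable()). Illustration only, not Spark code.

GIB = 1024 ** 3


def storage_budget_bytes(executor_heap_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Approximate storage budget: heap * memoryFraction * safetyFraction.

    The defaults mirror the Spark 1.x defaults for spark.storage.memoryFraction
    and spark.storage.safetyFraction (an assumption stated in the text above).
    """
    return int(executor_heap_bytes * memory_fraction * safety_fraction)


heap = 8 * GIB  # hypothetical executor heap size

default_budget = storage_budget_bytes(heap)
reduced_budget = storage_budget_bytes(heap, memory_fraction=0.4)

print("default budget (GiB):", round(default_budget / GIB, 2))
print("reduced budget (GiB):", round(reduced_budget / GIB, 2))
```

Lowering spark.storage.memoryFraction leaves more heap for task execution, which may reduce GC pressure, but it also means fewer cached blocks fit in memory. Whether that trade-off helps depends on the workload.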

Fwd: Spark SQL question: is cached SchemaRDD storage controlled by spark.storage.memoryFraction?

2014-09-26 Thread Liquan Pei
-- Forwarded message --
From: Liquan Pei liquan...@gmail.com
Date: Fri, Sep 26, 2014 at 1:33 AM
Subject: Re: Spark SQL question: is cached SchemaRDD storage controlled by spark.storage.memoryFraction?
To: Haopu Wang hw...@qilinsoft.com

Hi Haopu, Internally, cacheTable