---------- Forwarded message ----------
From: Liquan Pei <liquan...@gmail.com>
Date: Fri, Sep 26, 2014 at 1:33 AM
Subject: Re: Spark SQL question: is cached SchemaRDD storage controlled by "spark.storage.memoryFraction"?
To: Haopu Wang <hw...@qilinsoft.com>
Hi Haopu,

Internally, cacheTable on a SchemaRDD is implemented as a cache() on a MapPartitionsRDD. Because the memory reserved for caching RDDs is controlled by spark.storage.memoryFraction, the in-memory storage of a cached SchemaRDD is controlled by that parameter as well.

Hope this helps!
Liquan

On Fri, Sep 26, 2014 at 1:04 AM, Haopu Wang <hw...@qilinsoft.com> wrote:

> Hi, I'm querying a big table using Spark SQL. I see very long GC time in
> some stages. I wonder if I can improve it by tuning the storage
> parameter.
>
> The question is: the schemaRDD has been cached with "cacheTable()"
> function. So is the cached schemaRDD part of memory storage controlled
> by the "spark.storage.memoryFraction" parameter?
>
> Thanks!

--
Liquan Pei
Department of Physics
University of Massachusetts Amherst
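
To make the effect of the parameter concrete, here is a rough back-of-the-envelope sketch of how much executor heap ends up available for cached data, including a cached SchemaRDD. It assumes the Spark 1.x static memory model, in which the storage region is roughly heap * spark.storage.memoryFraction * spark.storage.safetyFraction. The safety fraction and the default values (0.6 and 0.9) are not mentioned in the thread and are stated here as assumptions, and the function name is made up for illustration.

```python
# Rough estimate of the storage region under the Spark 1.x static memory model.
# Assumption (not from the thread): storage = heap * memoryFraction * safetyFraction,
# with assumed defaults of 0.6 and 0.9 respectively.

GIB = 1024 ** 3


def storage_memory_bytes(heap_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Approximate bytes of executor heap usable for cached RDD blocks."""
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    if not 0.0 <= safety_fraction <= 1.0:
        raise ValueError("safety_fraction must be between 0 and 1")
    return int(heap_bytes * memory_fraction * safety_fraction)


if __name__ == "__main__":
    heap = 4 * GIB  # hypothetical executor heap size
    for fraction in (0.6, 0.4, 0.2):
        usable = storage_memory_bytes(heap, memory_fraction=fraction)
        print("memoryFraction=%.1f -> about %.2f GiB for cached data"
              % (fraction, usable / float(GIB)))
```

Lowering the fraction leaves more of the heap for task execution and user objects, which may help with long GC pauses, at the cost of less room for the cached table. Whether that trade-off helps in a given job is something to measure rather than assume.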