It won't be transparent, but you can do something like:

CACHE TABLE newData AS SELECT * FROM allData WHERE date > "..."

and then query newData.
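
For example, here is a rough sketch from the spark-shell (Spark 1.1+, where sc is already defined); allData, newData, the date column, and the date literal are all placeholders, not real names:

import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)

// Materialize the recent slice once and pin it in memory.
sqlContext.sql(
  "CACHE TABLE newData AS SELECT * FROM allData WHERE date > '2014-10-01'")

// Ad hoc queries now hit the cached table instead of allData.
sqlContext.sql("SELECT count(*) FROM newData").collect().foreach(println)

// Free the memory when the slice is no longer needed.
sqlContext.sql("UNCACHE TABLE newData")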

On Fri, Oct 24, 2014 at 12:06 PM, Sadhan Sood <sadhan.s...@gmail.com> wrote:

> Is there a way to cache certain (or the most recent) partitions of certain tables?
>
> On Fri, Oct 24, 2014 at 2:35 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> It does have support for caching using either CACHE TABLE <tablename> or
>> CACHE TABLE <tablename> AS SELECT ....
>>
>> On Fri, Oct 24, 2014 at 1:05 AM, ankits <ankitso...@gmail.com> wrote:
>>
>>> I want to set up Spark SQL to allow ad hoc querying over the last X days of
>>> processed data, where the data is processed through Spark. This would also
>>> have to cache the data (in memory only), so the approach I was thinking of
>>> was to build a layer that persists the appropriate RDDs and stores them in
>>> memory.
>>>
>>> I see Spark SQL allows ad hoc querying through JDBC, though I have never
>>> used that before. Will using JDBC offer any advantages (e.g. does it have
>>> built-in support for caching?) over rolling my own solution for this use
>>> case?
>>>
>>> Thanks!
>>>
>>>
>>
>
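
On the JDBC question quoted above, a rough sketch of going through the Thrift JDBC server instead (assuming it has been started with sbin/start-thriftserver.sh, is listening on localhost:10000, and the Hive JDBC driver is on the classpath; table names and the date literal are placeholders). A table cached this way lives in the long-running server process, so it can be shared across JDBC clients:

import java.sql.DriverManager

Class.forName("org.apache.hive.jdbc.HiveDriver")

val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
val stmt = conn.createStatement()

// Cache the recent slice inside the server process.
stmt.execute("CACHE TABLE newData AS SELECT * FROM allData WHERE date > '2014-10-01'")

// Ad hoc queries from any JDBC client now read the cached data.
val rs = stmt.executeQuery("SELECT count(*) FROM newData")
while (rs.next()) println(rs.getLong(1))

rs.close(); stmt.close(); conn.close()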
