It won't be transparent, but you can do something like:

CACHE TABLE newData AS SELECT * FROM allData WHERE date > "..."

and then query newData.

On Fri, Oct 24, 2014 at 12:06 PM, Sadhan Sood <sadhan.s...@gmail.com> wrote:

> Is there a way to cache certain (or most latest) partitions of certain
> tables ?
>
> On Fri, Oct 24, 2014 at 2:35 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> It does have support for caching using either CACHE TABLE <tablename> or
>> CACHE TABLE <tablename> AS SELECT ....
>>
>> On Fri, Oct 24, 2014 at 1:05 AM, ankits <ankitso...@gmail.com> wrote:
>>
>>> I want to set up spark SQL to allow ad hoc querying over the last X days
>>> of processed data, where the data is processed through spark. This would
>>> also have to cache data (in memory only), so the approach I was thinking
>>> of was to build a layer that persists the appropriate RDDs and stores
>>> them in memory.
>>>
>>> I see spark sql allows ad hoc querying through JDBC though I have never
>>> used that before. Will using JDBC offer any advantages (e.g does it have
>>> built in support for caching?) over rolling my own solution for this use
>>> case?
>>>
>>> Thanks!
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Is-SparkSQL-JDBC-server-a-good-approach-for-caching-tp17196.html
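
For the "last X days" case in the original question, the CACHE TABLE ... AS SELECT suggestion at the top of this message could be generated along these lines. This is only a sketch: the helper name cache_statement and its parameters are hypothetical, and it assumes the date column holds ISO yyyy-MM-dd strings, which the thread does not confirm. The table names newData and allData and the date column are taken from the example statement. The helper only builds the statement text and does not talk to Spark.

```python
from datetime import date, timedelta


def cache_statement(today, days, source="allData", cached="newData", column="date"):
    """Build a CACHE TABLE ... AS SELECT statement covering the last `days` days.

    Assumes `column` stores ISO-formatted date strings (yyyy-MM-dd), so a plain
    string comparison against the cutoff orders rows chronologically.
    """
    cutoff = (today - timedelta(days=days)).isoformat()
    return (
        f"CACHE TABLE {cached} AS "
        f"SELECT * FROM {source} WHERE {column} > '{cutoff}'"
    )


if __name__ == "__main__":
    # Example: cache the week leading up to the date of this thread.
    print(cache_statement(date(2014, 10, 24), days=7))
```

The printed statement can then be submitted through whatever JDBC client you already use, after which queries go against newData instead of allData.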