From the Spark documentation
(http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence):
yes, you can use persist on a DataFrame instead of cache. cache is just
shorthand for persist with the default storage level (MEMORY_ONLY for RDDs;
note that for DataFrames the default is MEMORY_AND_DISK). If you want to
persist the DataFrame to disk only, do
dataframe.persist(StorageLevel.DISK_ONLY).

IMO, if reads against the DB are expensive and you're afraid of failure, why
not just save the data as Parquet on your cluster (e.g. as a Hive table) and
read from there?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-DF-CacheTable-method-Will-it-save-data-to-disk-tp27533p27551.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
