AFAIK, cache() is just a shortcut for the persist() method with the default
storage level, MEMORY_ONLY.

From the source code of RDD:

>   /** Persist this RDD with the default storage level (`MEMORY_ONLY`). */
>   def persist(): RDD[T] = persist(StorageLevel.MEMORY_ONLY)
>
>   /** Persist this RDD with the default storage level (`MEMORY_ONLY`). */
>   def cache(): RDD[T] = persist()
>
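
In practice that means you would only reach for persist(level) when you want something other than MEMORY_ONLY. Here is a minimal sketch of the difference, assuming a local master; the object name, the storageLevels helper, the app name and the sample data are all made up for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheVsPersist {
  // Hypothetical helper: returns the storage level reported after cache()
  // and after an explicit persist(MEMORY_AND_DISK).
  def storageLevels(sc: SparkContext): (StorageLevel, StorageLevel) = {
    // cache() takes no arguments and delegates to persist()
    val cached = sc.parallelize(1 to 100).cache()
    // persist(level) lets you pick a non-default storage level
    val persisted = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_AND_DISK)
    (cached.getStorageLevel, persisted.getStorageLevel)
  }

  def main(args: Array[String]): Unit = {
    // "local[*]" and the app name are assumptions for a quick local run
    val conf = new SparkConf().setAppName("cache-vs-persist").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val (afterCache, afterPersist) = storageLevels(sc)
      println(s"after cache():   $afterCache")
      println(s"after persist(): $afterPersist")
    } finally {
      sc.stop()
    }
  }
}
```

Both calls only mark the RDD for persistence; nothing is actually stored until an action runs on it.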


2014-04-13 16:26 GMT+02:00 Joe L <selme...@yahoo.com>:

> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/what-is-the-difference-between-persist-and-cache-tp4181.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
