You should use `df.cache()`.
`df.rdd.cache()` won't work, because `df.rdd` generates a new RDD from the
original `df` and then caches that new RDD, not the DataFrame itself.
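
To illustrate the point, here is a minimal plain-Python sketch. It does not use Spark at all: `ToyDataFrame` and `ToyRDD` are hypothetical stand-ins, and the assumption that every `.rdd` access builds a new object is taken from the explanation above, not from Spark's source.

```python
# Toy model only: these classes are hypothetical stand-ins, not Spark's API.

class ToyRDD:
    def __init__(self, rows):
        self.rows = rows
        self.is_cached = False

    def cache(self):
        # Marks this RDD object only; it knows nothing about its parent.
        self.is_cached = True
        return self


class ToyDataFrame:
    def __init__(self, rows):
        self._rows = rows
        self.is_cached = False

    def cache(self):
        # Marks the DataFrame itself as cached.
        self.is_cached = True
        return self

    @property
    def rdd(self):
        # Assumption: each access derives a fresh RDD from the DataFrame.
        return ToyRDD(list(self._rows))


df = ToyDataFrame([1, 2, 3])

derived = df.rdd.cache()            # caches the derived object only
rdd_call_cached_df = df.is_cached   # False: the DataFrame is untouched

df.cache()                          # caches the DataFrame itself
df_call_cached_df = df.is_cached    # True
```

In this sketch the cache flag set through `df.rdd.cache()` lives on the derived object, so later operations on `df` never see it.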

On Fri, Oct 13, 2017 at 3:35 PM, Supun Nakandala <supun.nakand...@gmail.com>
wrote:

> Hi all,
>
> I have been experimenting with the cache/persist/unpersist methods with
> respect to both the DataFrame and RDD APIs. However, I am seeing
> different behavior in the DataFrame API compared to the RDD API, such as
> DataFrames not getting cached when count() is called.
>
> Is there a difference between how these operations behave in the DataFrame
> and RDD APIs?
>
> Thank You.
> -Supun
>
