Re: Difference between Dataframe and RDD Persisting

2016-06-27 Thread Jörn Franke
Dataframe uses a more efficient binary representation to store and persist data. You should go for that one in most of the cases. Rdd is slower. > On 27 Jun 2016, at 07:54, Brandon White wrote: > > What is the difference between persisting a dataframe and a rdd? When I

Difference between Dataframe and RDD Persisting

2016-06-26 Thread Brandon White
What is the difference between persisting a dataframe and a rdd? When I persist my RDD, the UI says it takes 50G or more of memory. When I persist my dataframe, the UI says it takes 9G or less of memory. Does the dataframe not persist the actual content? Is it better / faster to persist a RDD