Here's a talk I gave on the topic:

https://www.youtube.com/watch?v=i7l3JQRx7Qw
http://www.slideshare.net/SparkSummit/structuring-spark-dataframes-datasets-and-streaming-by-michael-armbrust

On Mon, Jun 13, 2016 at 4:01 AM, Arun Patel <arunp.bigd...@gmail.com> wrote:

> In Spark 2.0, DataFrames and Datasets are unified. DataFrame is simply an
> alias for a Dataset of type Row. I have a few questions.
>
> 1) What does this really mean to an application developer?
> 2) Why was this unification needed in Spark 2.0?
> 3) What changes can be observed in Spark 2.0 vs Spark 1.6?
> 4) Will compile-time safety be there for DataFrames too?
> 5) Is the Python API supported for Datasets in 2.0?
>
> Thanks
> Arun
>
