Thanks Michael. I went through these slides already and could not find answers to these specific questions.
I created a Dataset and converted it to a DataFrame in both 1.6 and 2.0, and I don't see any difference between them. That's why I got confused and asked these questions about the unification. I'd appreciate it if you could answer these specific questions. Thank you very much!

On Mon, Jun 13, 2016 at 2:55 PM, Michael Armbrust <mich...@databricks.com> wrote:

> Here's a talk I gave on the topic:
>
> https://www.youtube.com/watch?v=i7l3JQRx7Qw
>
> http://www.slideshare.net/SparkSummit/structuring-spark-dataframes-datasets-and-streaming-by-michael-armbrust
>
> On Mon, Jun 13, 2016 at 4:01 AM, Arun Patel <arunp.bigd...@gmail.com> wrote:
>
>> In Spark 2.0, DataFrames and Datasets are unified. DataFrame is simply an
>> alias for a Dataset of type Row. I have a few questions.
>>
>> 1) What does this really mean for an application developer?
>> 2) Why was this unification needed in Spark 2.0?
>> 3) What changes can be observed in Spark 2.0 vs Spark 1.6?
>> 4) Will compile-time safety be there for DataFrames too?
>> 5) Is the Python API supported for Datasets in 2.0?
>>
>> Thanks
>> Arun
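For context, the alias the original question refers to is real: in Spark 2.0, `DataFrame` is declared as `type DataFrame = Dataset[Row]` in the `org.apache.spark.sql` package object. A minimal Scala sketch of what that means in practice (the `Person` case class and app name are illustrative; this assumes a local Spark 2.x session):

```scala
import org.apache.spark.sql.{Dataset, Row, SparkSession}

// Illustrative case class for a typed Dataset.
case class Person(name: String, age: Long)

object UnificationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unification-sketch") // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A Dataset[Person] is statically typed: field access such as
    // ds.map(_.age) is checked at compile time.
    val ds: Dataset[Person] = Seq(Person("Arun", 30)).toDS()

    // Converting to a DataFrame only forgets the static element type.
    // Because DataFrame is an alias for Dataset[Row], df IS a Dataset;
    // rows are accessed generically (by name or position) at runtime.
    val df: Dataset[Row] = ds.toDF()
    df.select("name", "age").show()

    spark.stop()
  }
}
```

So in 2.0 there is one Dataset abstraction with two usage styles: typed (`Dataset[Person]`) and untyped (`Dataset[Row]`, i.e. `DataFrame`), rather than two separate classes as in 1.6.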