>
> From: Chandeep Singh [mailto:c...@chandeep.com]
> Sent: 16 February 2016 18:17
> To: Mich Talebzadeh <m...@peridale.co.uk>
> Cc: Ashok Kumar <ashok34...@yahoo.com>; User <user@spark.apache.org>
> Subject: Re: Use case for RDD and Data Frame
>
Here is another interesting post.
http://www.kdnuggets.com/2016/02/apache-spark-rdd-dataframe-dataset.html?utm_content=buffer31ce5_medium=social_source=twitter.com_campaign=buffer
Hi,
A Resilient Distributed Dataset (RDD) is a collection of data distributed across
the nodes of a cluster. It is basically raw data, and that is all there is to it,
with little optimization applied. Remember that data is not of much value until
it is turned into information.
On the other hand a DataFrame
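To make the RDD description above a bit more concrete, here is a rough toy sketch in plain Python. It is not the Spark API: ordinary lists stand in for the cluster nodes, and the log lines, `partition`, and `count_actions` are hypothetical names made up for the illustration. It only shows the idea of raw, schema-less records spread over partitions, and the parsing and aggregation needed to turn them into information.

```python
# Toy model only: plain Python lists stand in for cluster nodes.
# Nothing here is the Spark API; all names and data are hypothetical.
from collections import Counter

# Raw records with no schema attached, similar in spirit to what an RDD holds.
raw_lines = [
    "2016-02-16,alice,login",
    "2016-02-16,bob,login",
    "2016-02-16,alice,purchase",
    "2016-02-16,carol,login",
    "2016-02-16,bob,logout",
]


def partition(records, num_partitions):
    """Split records round-robin into num_partitions chunks, one per pretend node."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_actions(chunk):
    """Per-partition work: parse each raw line by hand and count the action field."""
    return Counter(line.split(",")[2] for line in chunk)


partitions = partition(raw_lines, 3)

# Each partition is processed independently, then the partial results are merged.
partial_counts = [count_actions(chunk) for chunk in partitions]
action_counts = sum(partial_counts, Counter())

print(dict(action_counts))
```

Note that the code has to know by itself that the third comma-separated field is the action. The raw records carry no column names or types, which is the "raw data" point made above.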
This blog post should be helpful
http://www.agildata.com/apache-spark-rdd-vs-dataframe-vs-dataset/
Thanks,
Andy.
--
Andy Grove
Chief Architect
AgilData - Simple Streaming SQL that Scales
www.agildata.com
On Tue, Feb 16, 2016 at 9:05 AM, Ashok Kumar wrote:
>