Re: Use case for RDD and Data Frame

2016-02-16 Thread Chandeep Singh
> From: Chandeep Singh [mailto:c...@chandeep.com]
> Sent: 16 February 2016 18:17
> To: Mich Talebzadeh <m...@peridale.co.uk>
> Cc: Ashok Kumar <ashok34...@yahoo.com>; User <user@spark.apache.org>
> Subject: Re: Use case for RDD and Data Frame
>
> He

RE: Use case for RDD and Data Frame

2016-02-16 Thread Mich Talebzadeh
Singh [mailto:c...@chandeep.com]
Sent: 16 February 2016 18:17
To: Mich Talebzadeh <m...@peridale.co.uk>
Cc: Ashok Kumar <ashok34...@yahoo.com>; User <user@spark.apache.org>
Subject: Re: Use case for RDD and Data Frame

Here is another interesting post. http://www.kdnuggets.com/

Re: Use case for RDD and Data Frame

2016-02-16 Thread Chandeep Singh
Here is another interesting post: http://www.kdnuggets.com/2016/02/apache-spark-rdd-dataframe-dataset.html

RE: Use case for RDD and Data Frame

2016-02-16 Thread Mich Talebzadeh
Hi, A Resilient Distributed Dataset (RDD) is a heap of data distributed among all the nodes of a cluster. It is basically raw data, and that is all there is to it, with little optimization applied to it. Remember that data is not of much value until it is turned into information. On the other hand a DataFrame

Re: Use case for RDD and Data Frame

2016-02-16 Thread Andy Grove
This blog post should be helpful: http://www.agildata.com/apache-spark-rdd-vs-dataframe-vs-dataset/

Thanks, Andy.

--
Andy Grove
Chief Architect
AgilData - Simple Streaming SQL that Scales
www.agildata.com

On Tue, Feb 16, 2016 at 9:05 AM, Ashok Kumar wrote:
>