Re: RDD and Dataframes

2016-07-15 Thread Taotao.Li
f its structure. It doesn't convert to RDD but uses RDD partitions to produce logical plan. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-and-Dataframes-tp27306p27346.html Sent from the Apache Spark Us

Re: RDD and Dataframes

2016-07-15 Thread RK Aduri
DataFrames use RDDs as the internal implementation of their structure. A DataFrame doesn't convert to an RDD but uses RDD partitions to produce the logical plan.

Re: RDD and Dataframes

2016-07-07 Thread Bruno Costa
>> step it will be transformed into an RDD to be executed in Spark? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-and-Dataframes-tp27306.html Sent from the Apache Spark User List mail

Re: RDD and Dataframes

2016-07-07 Thread Rishi Mishra

RDD and Dataframes

2016-07-07 Thread brccosta
, in the final step it will be transformed into an RDD to be executed in Spark?

Re: is there any significant performance issue converting between rdd and dataframes in pyspark?

2015-07-02 Thread Davies Liu
On Mon, Jun 29, 2015 at 1:27 PM, Axel Dahl a...@whisperstream.com wrote:
> In pyspark, when I convert from rdds to dataframes it looks like the rdd is being materialized/collected/repartitioned before it's converted to a dataframe.
It's not true. When converting an RDD to a dataframe, it only take

is there any significant performance issue converting between rdd and dataframes in pyspark?

2015-06-29 Thread Axel Dahl
In pyspark, when I convert from rdds to dataframes it looks like the rdd is being materialized/collected/repartitioned before it's converted to a dataframe. Just wondering if there are any guidelines for doing this conversion and whether it's best to do it early to get the performance benefits of