Off the top of my head, I can think of the zip operation that RDDs provide. 
For example, if you have two DataFrames, df1 and df2, you could do something 
like this:

val newDF = df1.rdd.zip(df2.rdd).map { case(rowFromDf1, rowFromDf2) => 
(....)}.toDF(...)

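A couple of things to keep in mind:

1)      Both df1 and df2 should have the same number of rows. RDD zip also 
requires the same number of partitions and the same number of elements in each 
partition.

2)      You are assuming that row N from df1 is related to row N from df2.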
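
To make the zip idea above concrete, here is a minimal sketch rather than a definitive implementation. It assumes Spark 2.x or later (SparkSession) and a local master. The object name ZipDataFrames, the helper zipColumns, and the sample columns (id, name, city, score) are made up for illustration. Both inputs are built with the same explicit partition count so that the zip preconditions hold.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical helper object, not part of Spark.
object ZipDataFrames {

  // Pairs row N of `left` with row N of `right` and appends the columns.
  // Assumes both sides have the same number of partitions and the same
  // number of rows in each partition; RDD.zip fails at runtime otherwise.
  // Also assumes the two schemas have no column names in common.
  def zipColumns(spark: SparkSession, left: DataFrame, right: DataFrame): DataFrame = {
    val rows = left.rdd.zip(right.rdd).map {
      case (l, r) => Row.fromSeq(l.toSeq ++ r.toSeq)
    }
    val schema = StructType(left.schema.fields ++ right.schema.fields)
    spark.createDataFrame(rows, schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("zip-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same row count and same explicit partition count (2) on both sides.
    val df1 = spark.sparkContext
      .parallelize(Seq((1, "a"), (2, "b"), (3, "c")), 2)
      .toDF("id", "name")
    val df2 = spark.sparkContext
      .parallelize(Seq(("x", 1.0), ("y", 2.0), ("z", 3.0)), 2)
      .toDF("city", "score")

    zipColumns(spark, df1, df2).show()
    spark.stop()
  }
}
```

If the two DataFrames come from different sources, their partitioning usually will not line up, so this only works when you control how both are built.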

Mohammed
Author: Big Data Analytics with 
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: spR [mailto:data.smar...@gmail.com]
Sent: Wednesday, June 15, 2016 4:08 PM
To: Mohammed Guller
Cc: Natu Lauchande; user
Subject: Re: concat spark dataframes

Hey,

There are quite a lot of fields. But, there are no common fields between the 2 
dataframes. Can I not concatenate the 2 frames like we can do in pandas such 
that the resulting dataframe has columns from both the dataframes?

Thank you.

Regards,
Misha



On Wed, Jun 15, 2016 at 3:44 PM, Mohammed Guller 
<moham...@glassbeam.com> wrote:
Hi Misha,
What is the schema for both the DataFrames? And what is the expected schema of 
the resulting DataFrame?

Mohammed
Author: Big Data Analytics with 
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Natu Lauchande [mailto:nlaucha...@gmail.com]
Sent: Wednesday, June 15, 2016 2:07 PM
To: spR
Cc: user
Subject: Re: concat spark dataframes

Hi,
You can select the common columns and use DataFrame.unionAll.
Regards,
Natu
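
A rough sketch of the suggestion above, under some assumptions: it targets Spark 2.x or later, where the method is called union (unionAll is the Spark 1.x name used in this thread). The object name UnionCommonColumns, the helper unionOnCommon, and the sample columns are hypothetical. It also assumes the shared columns have compatible types in both DataFrames. Note that this stacks rows; it does not put the columns of the two frames side by side.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of Spark.
object UnionCommonColumns {

  // Keeps only the columns present in both inputs (in the order they
  // appear in `left`) and appends the rows of `right` to those of `left`.
  // union matches columns by position, so both sides select the same
  // column list in the same order.
  def unionOnCommon(left: DataFrame, right: DataFrame): DataFrame = {
    val common = left.columns.filter(right.columns.contains(_))
    left.select(common.map(col): _*).union(right.select(common.map(col): _*))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("union-sketch")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq((1, "a", true), (2, "b", false)).toDF("id", "name", "active")
    val df2 = Seq(("c", 3, 0.5)).toDF("name", "id", "score")

    // Only id and name are shared, so the result has those two columns.
    unionOnCommon(df1, df2).show()
    spark.stop()
  }
}
```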

On Wed, Jun 15, 2016 at 8:57 PM, spR 
<data.smar...@gmail.com> wrote:
Hi,

How can I concatenate Spark DataFrames? I have 2 frames with certain columns. I 
want to get a DataFrame with the columns from both frames.

Regards,
Misha

