Date: Thursday, April 5, 2018 at 11:23 AM
>> To: Cesar <ces...@gmail.com>, "user @spark" <user@spark.apache.org>
>> Subject: Re: Union of multiple data frames
>>
>> Maybe something like
>>
>>
>>
>> var finalDF = spark.sqlContex
il.com>, "user @spark" <user@spark.apache.org>
> Subject: Re: Union of multiple data frames
>
> Maybe something like
>
> var finalDF = spark.sqlContext.emptyDataFrame
> for (df <- dfs) {
>   finalDF = finalDF.union(df)
> }
>
Hi Cesar,
I have used Brandon's approach in the past without any problems.
Andy
From: Brandon Geise <brandonge...@gmail.com>
Date: Thursday, April 5, 2018 at 11:23 AM
To: Cesar <ces...@gmail.com>, "user @spark" <user@spark.apache.org>
Subject: Re: Union of m
mal way to perform a union of multiple data frames?
thanks
--
Cesar Flores
The following code works for small n, but not for large n (>20):
val dfUnion = Seq(df1,df2,df3,...dfn).reduce(_ union _)
dfUnion.show()
By not working, I mean that Spark takes a lot of time to create the
execution plan.
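For reference, a left-to-right reduce builds the result as ((df1 union df2) union df3) and so on, so every intermediate result already contains all earlier frames. One commonly suggested workaround is to combine the frames pairwise instead. The sketch below only illustrates that shape: balancedReduce is a hypothetical helper, not a Spark API, and the demo uses strings in place of DataFrames so it runs without a Spark session. Whether this actually shortens planning time is an assumption that depends on the Spark version and has not been measured here.

```scala
object BalancedUnion {
  // Hypothetical helper (not part of Spark): splits the sequence in half,
  // reduces each half recursively, then combines the two results, so the
  // nesting depth is about log2(n) instead of n - 1.
  def balancedReduce[A](items: Seq[A])(combine: (A, A) => A): A = {
    require(items.nonEmpty, "need at least one element")
    if (items.length == 1) items.head
    else {
      val (left, right) = items.splitAt(items.length / 2)
      combine(balancedReduce(left)(combine), balancedReduce(right)(combine))
    }
  }

  def main(args: Array[String]): Unit = {
    // Strings stand in for DataFrames so the nesting is visible.
    val nested = balancedReduce(Seq("a", "b", "c", "d"))((l, r) => s"($l U $r)")
    println(nested) // prints ((a U b) U (c U d))
  }
}
```

With DataFrames that share a schema, the call would presumably look like BalancedUnion.balancedReduce(Seq(df1, df2, df3))(_ union _), using the same frames as in the reduce example above.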
*Is there a more optimal way to perform a union of multiple data fra