Re: Union of multiple data frames

2018-04-06 Thread Alessandro Solimando
Date: Thursday, April 5, 2018 at 11:23 AM
>> To: Cesar <ces...@gmail.com>, "user @spark" <user@spark.apache.org>
>> Subject: Re: Union of multiple data frames
>>
>> Maybe something like
>>
>> var finalDF = spark.sqlContex

Re: Union of multiple data frames

2018-04-05 Thread Cesar
il.com>, "user @spark" <user@spark.apache.org>
> Subject: Re: Union of multiple data frames
>
> Maybe something like
>
> var finalDF = spark.sqlContext.emptyDataFrame
> for (df <- dfs){
>   finalDF = finalDF.union(df)
> }
> &

Re: Union of multiple data frames

2018-04-05 Thread Andy Davidson
Hi Cesar,
I have used Brandon's approach in the past without any problem.
Andy
From: Brandon Geise <brandonge...@gmail.com>
Date: Thursday, April 5, 2018 at 11:23 AM
To: Cesar <ces...@gmail.com>, "user @spark" <user@spark.apache.org>
Subject: Re: Union of m

Re: Union of multiple data frames

2018-04-05 Thread Brandon Geise
mal way to perform a union of multiple data frames?
thanks
--
Cesar Flores

Union of multiple data frames

2018-04-05 Thread Cesar
The following code works for small n, but not for large n (>20):
val dfUnion = Seq(df1,df2,df3,...dfn).reduce(_ union _)
dfUnion.show()
By not working, I mean that Spark takes a lot of time to create the execution plan. *Is there a more optimal way to perform a union of multiple data fra
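
One workaround sometimes used when plan creation slows down with many chained unions is to union the underlying RDDs in a single call and rebuild a DataFrame from the result, so the logical plan does not grow with every `union`. The following is a minimal sketch, not a method taken from this thread. It assumes `dfs` is non-empty and that every frame has the same schema. `UnionMany` and `unionAll` are hypothetical names chosen for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object UnionMany {
  // Hypothetical helper (not from the thread).
  // Assumption: dfs is non-empty and all inputs share one schema;
  // columns are matched by position, as with DataFrame.union.
  def unionAll(spark: SparkSession, dfs: Seq[DataFrame]): DataFrame = {
    require(dfs.nonEmpty, "need at least one DataFrame")
    // Union all row RDDs in one step instead of nesting n - 1 unions.
    val rows: RDD[Row] = spark.sparkContext.union(dfs.map(_.rdd))
    // Rebuild a DataFrame using the schema of the first input.
    spark.createDataFrame(rows, dfs.head.schema)
  }
}
```

With this helper the call becomes `UnionMany.unionAll(spark, Seq(df1, df2, df3))`. The trade-off is that going through `.rdd` hides the inputs from the query optimizer and converts rows out of Spark's internal format, so it may be slower at execution time even if planning is faster. Whether it helps in a given case is worth measuring.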