Try grouping sets.
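Something like the sketch below. This is a minimal sketch, assuming Spark 2.x SQL syntax and the table/column names from your mail; GROUPING SETS computes all four (col1, colN) aggregations in one pass and returns them already unioned:

// Sketch only -- assumes the parquet data is registered as "table",
// with columns col1..col5 as in the question.
// Each output row has NULLs in the columns outside its grouping set;
// grouping_id() (Spark 2.0+) tells you which set produced the row.
val grouped = sqlContext.sql("""
  SELECT col1, col2, col3, col4, col5, count(*) AS cnt, grouping_id() AS gid
  FROM table
  GROUP BY col1, col2, col3, col4, col5
  GROUPING SETS ((col1, col2), (col1, col3), (col1, col4), (col1, col5))
""")

Compared with four separate group-bys plus a union, this should need only a single scan of the parquet data. The output shape differs a bit from your union, though: rows carry NULLs in the columns outside their grouping set, so you may want to filter on gid if you need the exact 3-column layout.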
On Sun, Feb 19, 2017 at 8:23 AM, Patrick <titlibat...@gmail.com> wrote:

> Hi,
>
> I have read 5 columns from parquet into a data frame. My queries on the
> parquet table are of the following form:
>
> val df1 = sqlContext.sql("select col1, col2, count(*) from table group by col1, col2")
> val df2 = sqlContext.sql("select col1, col3, count(*) from table group by col1, col3")
> val df3 = sqlContext.sql("select col1, col4, count(*) from table group by col1, col4")
> val df4 = sqlContext.sql("select col1, col5, count(*) from table group by col1, col5")
>
> And then I need to union the results from df1 to df4 into a single df.
>
> So basically, only the second column is changing. Is there an efficient
> way to write the above queries in Spark SQL instead of writing 4 different
> queries (or a loop) and doing a union to get the result?
>
> Thanks

--
Best Regards,
Ayan Guha