In both cases, I am trying to create a Hive table from a UNION of the same two queries.
I am not sure how the two approaches differ internally when the Hive table is created.

Regards,
Neeraj

On Sun, Mar 31, 2019 at 1:29 PM Jörn Franke <jornfra...@gmail.com> wrote:

> Is the select taking longer, or the saving to a file? You seem to save
> to a file only in the second case.
>
> On 29.03.2019 at 15:10, neeraj bhadani <bhadani.neeraj...@gmail.com> wrote:
>
> > Hi Team,
> > I am executing the same Spark code using the Spark SQL API and the
> > DataFrame API; however, Spark SQL is taking longer than expected.
> >
> > Please find the pseudocode below.
> >
> > -----------------------------------------------------------------
> > Case 1 : Spark SQL
> > -----------------------------------------------------------------
> > %sql
> > CREATE TABLE <tbl_name>
> > AS
> > WITH <table_1> AS (
> >   <qry1>
> > )
> > ,<table_2> AS (
> >   <qry2>
> > )
> > SELECT * FROM <table_1>
> > UNION ALL
> > SELECT * FROM <table_2>
> >
> > -----------------------------------------------------------------
> > Case 2 : DataFrame API
> > -----------------------------------------------------------------
> > df1 = spark.sql(<qry1>)
> > df2 = spark.sql(<qry2>)
> > df3 = df1.union(df2)
> > df3.write.saveAsTable(<table_name>)
> > -----------------------------------------------------------------
> >
> > As per my understanding, both Spark SQL and the DataFrame API generate
> > the same code under the hood, so the execution times should be similar.
> >
> > Regards,
> > Neeraj
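
One way to narrow this down is to time the two cases separately and print the physical plan for the DataFrame version. The sketch below is only an illustration: `timed` and `compare_cases` are hypothetical helper names, `spark` is assumed to be an existing SparkSession, and `ctas_sql`, `qry1`, `qry2` and `df_table` stand in for the placeholders used in the thread.

```python
import time


def timed(label, action):
    """Run action() and return (result, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = action()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


def compare_cases(spark, ctas_sql, qry1, qry2, df_table):
    """Time Case 1 (CTAS) and Case 2 (DataFrame write) separately.

    Assumes `spark` is an existing SparkSession and that the target
    tables named in `ctas_sql` and `df_table` do not exist yet.
    """
    # Case 1: a CREATE TABLE ... AS SELECT command is executed
    # when spark.sql() is called, so the call itself can be timed.
    _, t_sql = timed("Case 1 CTAS", lambda: spark.sql(ctas_sql))

    # Case 2: building the DataFrames is lazy; the write triggers the job.
    df3 = spark.sql(qry1).union(spark.sql(qry2))
    df3.explain()  # print the physical plan of the unioned query
    _, t_df = timed("Case 2 saveAsTable",
                    lambda: df3.write.saveAsTable(df_table))
    return t_sql, t_df
```

Running `EXPLAIN` on the Case 1 statement and comparing it with the `df3.explain()` output should show whether the two plans differ on the read side or only in how the table is written (for example, in the storage format used for the new table).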