Hi All,

Can anyone help me here with my query?

Regards,
Neeraj

On Mon, Apr 1, 2019 at 9:44 AM neeraj bhadani <bhadani.neeraj...@gmail.com> wrote:

> In both cases, I am trying to create a Hive table from a union of the
> same two queries.
>
> I am not sure how the two approaches differ internally in the process of
> creating the Hive table.
>
> Regards,
> Neeraj
>
> On Sun, Mar 31, 2019 at 1:29 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Is the select taking longer, or the saving to a file? You seem to save
>> to a file only in the second case.
>>
>> On 29.03.2019 at 15:10, neeraj bhadani <bhadani.neeraj...@gmail.com> wrote:
>>
>> Hi Team,
>> I am executing the same Spark code using the Spark SQL API and the
>> DataFrame API; however, Spark SQL is taking longer than expected.
>>
>> Please find the pseudocode below.
>>
>> -----------------------------------------------------------------------
>> Case 1 : Spark SQL
>> -----------------------------------------------------------------------
>>
>> %sql
>> CREATE TABLE <tbl_name>
>> AS
>> WITH <table_1> AS (
>>   <qry1>
>> )
>> ,<table_2> AS (
>>   <qry2>
>> )
>> SELECT * FROM <table_1>
>> UNION ALL
>> SELECT * FROM <table_2>
>>
>> -----------------------------------------------------------------------
>> Case 2 : DataFrame API
>> -----------------------------------------------------------------------
>>
>> df1 = spark.sql(<qry1>)
>> df2 = spark.sql(<qry2>)
>> df3 = df1.union(df2)
>> df3.write.saveAsTable(<table_name>)
>>
>> -----------------------------------------------------------------------
>>
>> As per my understanding, both Spark SQL and the DataFrame API generate
>> the same code under the hood, so the execution time should be similar.
>>
>> Regards,
>> Neeraj