Hi Team,

I am running the same Spark logic through both the Spark SQL API and the
DataFrame API; however, the Spark SQL version is taking longer than expected.

Please find the pseudocode below.
-----------------------------------------------------------------------------------------------

Case 1 : Spark SQL

-----------------------------------------------------------------------------------------------

%sql

CREATE TABLE <tbl_name>

AS


WITH <table_1> AS (

     <qry1>

)

,<table_2> AS (

     <qry2>

)


SELECT * FROM <table_1>

UNION ALL

SELECT * FROM <table_2>


-----------------------------------------------------------------------------------------------

Case 2 : DataFrame API

-----------------------------------------------------------------------------------------------


df1 = spark.sql(<qry1>)

df2 = spark.sql(<qry2>)

df3 = df1.union(df2)

df3.write.saveAsTable(<tbl_name>)

-----------------------------------------------------------------------------------------------


As per my understanding, both Spark SQL and the DataFrame API generate the
same execution plan under the hood, so the execution time should be similar.
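
One way this assumption could be checked is to print the plans for both variants and compare them. The following is only a minimal sketch, assuming Spark 3.0 or later (for `explain(mode="formatted")`) and an already created `spark` session; the function name `explain_both` and the CTE aliases `t1` and `t2` are placeholders made up for illustration and are not part of the actual job.

```python
def explain_both(spark, qry1, qry2):
    """Print the formatted plans of the SQL and DataFrame variants.

    `spark` is assumed to be an existing SparkSession; `qry1` and `qry2`
    are the two SELECT statements as plain strings.
    """
    # Case 1: the SELECT part of the CREATE TABLE ... AS statement,
    # expressed as one SQL string with two CTEs (t1/t2 are made-up aliases).
    sql_text = (
        f"WITH t1 AS ({qry1}), t2 AS ({qry2}) "
        "SELECT * FROM t1 UNION ALL SELECT * FROM t2"
    )
    spark.sql(sql_text).explain(mode="formatted")

    # Case 2: the same two queries combined through the DataFrame API.
    spark.sql(qry1).union(spark.sql(qry2)).explain(mode="formatted")

    # Return the generated SQL so it can be inspected or reused.
    return sql_text
```

It would be called as `explain_both(spark, "<qry1>", "<qry2>")`. Note that this compares only the query part of the two cases; it does not cover the write step (CREATE TABLE AS versus `saveAsTable`).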


Regards,

Neeraj
