Hi Everyone,

I have a complex SQL query of approximately 2000 lines that works with 50+
tables, 50+ left joins, and transformations. All the tables are fully cached
in memory, with sufficient storage memory and working memory. The issue is
that after the query is launched for execution, it takes approximately
40 seconds to appear under Jobs/SQL in the application UI.

While the execution itself takes only 25 seconds, it is delayed by 40
seconds by the scheduler, so the total runtime of the query becomes 65
seconds (40s + 25s). Also, there are enough cores available during this wait
time. I couldn't figure out why the DAG scheduler is delaying the execution
by 40 seconds. Is this due to the time taken for query parsing and query
planning for the complex SQL? If that's the case, how do we optimize this
query parsing and query planning time in Spark? Any help would be
appreciated.
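
To check whether the 40 seconds go into parsing and planning rather than
scheduling, a rough sketch like the one below could time the two phases
separately. The helper name time_phases and the variables spark and query
are hypothetical and only for illustration. The PySpark part is left in
comments and assumes an existing SparkSession; spark.sql and
DataFrame.explain are standard PySpark calls, but whether collect() reuses
the plan built by explain() may depend on the Spark version, so the split
should be read as approximate.

```python
import time


def time_phases(plan_fn, run_fn):
    """Time a planning step and an execution step separately.

    plan_fn: zero-argument callable that builds and plans the query.
    run_fn:  callable that takes plan_fn's result and executes it.
    Returns (timings, result), where timings holds seconds per phase.
    """
    start = time.perf_counter()
    planned = plan_fn()
    planned_at = time.perf_counter()
    result = run_fn(planned)
    done_at = time.perf_counter()
    timings = {"plan_s": planned_at - start, "run_s": done_at - planned_at}
    return timings, result


# Hypothetical PySpark usage (assumes a SparkSession named `spark` and the
# SQL text in `query`; neither is defined in this sketch):
#
#   def plan():
#       df = spark.sql(query)  # parsing and analysis happen here
#       df.explain()           # forces optimization and physical planning
#       return df
#
#   timings, rows = time_phases(plan, lambda df: df.collect())
#   print(timings)
```

If plan_s comes out close to 40 seconds, the delay would be in query
compilation on the driver and not in the DAG scheduler.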


Thanks

Sathish
