Hi everyone, I have a complex SQL query of approximately 2,000 lines that works with 50+ tables, using 50+ left joins and transformations. All the tables are fully cached in memory, with sufficient storage memory and working memory. The issue is that after I launch the query, it takes approximately 40 seconds to appear under Jobs/SQL in the application UI.
The execution itself takes only 25 seconds, but it is delayed by 40 seconds by the scheduler, so the total runtime of the query becomes 65 seconds (40 s + 25 s). There are enough cores available during this wait time. I couldn't figure out why the DAG scheduler is delaying the execution by 40 seconds. Is this due to the time taken for query parsing and query planning for the complex SQL? If that's the case, how do we optimize the query parsing and query planning time in Spark? Any help would be appreciated. Thanks, Sathish
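
P.S. To check whether the 40 seconds are really spent in parsing and planning on the driver, one option might be to time those phases separately before any job is submitted. Below is a minimal PySpark sketch, not something I have confirmed on this workload. The function name `time_planning` is made up for illustration, and it assumes that `spark.sql()` performs parsing and analysis eagerly and that `DataFrame.explain(True)` forces optimization and physical planning without running the query.

```python
import contextlib
import io
import time


def time_planning(spark, query):
    """Time the driver-side phases that happen before a job is submitted.

    `spark` is an existing SparkSession; `query` is the SQL text.
    Returns a dict of phase name -> elapsed seconds.
    """
    timings = {}

    # spark.sql() parses the text and resolves it against the catalog
    # (analysis). For a plain SELECT this should not launch a job.
    start = time.perf_counter()
    df = spark.sql(query)
    timings["parse_and_analyze"] = time.perf_counter() - start

    # explain(True) builds the optimized and physical plans without
    # executing them. It prints the plans, so the output is discarded here.
    start = time.perf_counter()
    with contextlib.redirect_stdout(io.StringIO()):
        df.explain(True)
    timings["optimize_and_plan"] = time.perf_counter() - start

    return timings
```

Calling `time_planning(spark, sql_text)`, where `sql_text` holds the 2,000-line query, would return the two durations. If they add up to roughly 40 seconds, the delay is most likely in analysis and planning rather than in the DAG scheduler.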