Hi All,

We are facing an issue and would be thankful if anyone could help us with it.

Environment: Spark, Kubernetes, and Airflow. Airflow is used to schedule Spark jobs on Kubernetes, and we use a bash script that runs the spark-submit command to submit the jobs.

Issue: We are submitting a Spark job through Airflow in *cluster mode*. However, when the job completes and the executors are shut down, Airflow is unable to schedule another job.
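For reference, our submission looks roughly like the sketch below (the master URL, namespace, image name, and jar path are placeholders, not our actual values). Note that in cluster mode on Kubernetes, spark-submit stays attached and watches the driver pod's status by default (spark.kubernetes.submission.waitAppCompletion=true), which is where the LoggingPodStatusWatcherImpl messages come from:

```shell
# Sketch of our submit script; all names and paths below are placeholders.
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:443 \
  --deploy-mode cluster \
  --name example-job \
  --conf spark.kubernetes.namespace=spark-jobs \
  --conf spark.kubernetes.container.image=<our-spark-image> \
  --conf spark.executor.instances=2 \
  local:///opt/spark/app/example-job.jar
```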
From our investigation, we found that the job itself completes, but the spark-submit command in our script never exits and keeps running with the following log output:

21/10/12 08:54:26 INFO LoggingPodStatusWatcherImpl: Application status for spark-3f914f93ad684743b1a7b17aa26b4329 (phase: Running)

To confirm that the issue is not on the Airflow side, we killed the spark-submit process manually, after which Airflow was able to schedule the next job. So our observation is that, after the job completes, the spark-submit process is somehow unable to exit and keeps running. FYI, we already close the Spark session in our code. One odd observation: everything runs completely fine in local mode, and we are currently testing client mode as well.

Would be thankful if you can guide us on this.

Thanks,
Shishir