Youngkwang (YK) Lee created SPARK-49061:
-------------------------------------------

             Summary: Emit Kubernetes events when driver fails to request executor
                 Key: SPARK-49061
                 URL: https://issues.apache.org/jira/browse/SPARK-49061
             Project: Spark
          Issue Type: Improvement
          Components: Kubernetes
    Affects Versions: 3.5.3
            Reporter: Youngkwang (YK) Lee


In Kubernetes, when a driver pod fails to request executor pods (e.g., because 
the resource quota is exhausted), the failure is visible only in the driver 
logs.

We would like to surface this failure as a Kubernetes event on the driver pod 
to make debugging easier. A possible solution is to add event emission logic in 
ExecutorPodsAllocator.scala at the point where the executor request fails:
[https://bbgithub.dev.bloomberg.com/dnaspark/apache-spark-internal/blob/develop-3.4/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L439-L463]
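
A minimal sketch of the idea follows. It is not the actual ExecutorPodsAllocator code. The names DriverEvent, EventSink, InMemoryEventSink, requestWithEvent and the reason string "FailedExecutorRequest" are all hypothetical, and the sink stands in for whatever Kubernetes client call would actually post the event. The Boolean return value is also an assumption made to keep the example small.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.control.NonFatal

// Hypothetical, simplified stand-in for a Kubernetes Event attached to the driver pod.
final case class DriverEvent(
    eventType: String,      // "Warning" or "Normal", mirroring Kubernetes event types
    reason: String,         // short machine-readable reason (assumed name)
    message: String,        // human-readable detail
    driverPodName: String)  // pod the event would be attached to

// Hypothetical sink; a real implementation would post the event via the Kubernetes client.
trait EventSink {
  def emit(event: DriverEvent): Unit
}

// In-memory sink used here only so the sketch runs without a cluster.
final class InMemoryEventSink extends EventSink {
  val events: ListBuffer[DriverEvent] = ListBuffer.empty
  override def emit(event: DriverEvent): Unit = events += event
}

object ExecutorRequestEvents {
  // Runs createPod; on a non-fatal failure, emits a Warning event and returns false.
  def requestWithEvent(driverPodName: String, executorId: Long, sink: EventSink)(
      createPod: => Unit): Boolean = {
    try {
      createPod
      true
    } catch {
      case NonFatal(e) =>
        val event = DriverEvent(
          eventType = "Warning",
          reason = "FailedExecutorRequest",
          message = s"Failed to request executor $executorId: ${e.getMessage}",
          driverPodName = driverPodName)
        // Emitting the event is best effort and must not raise a second failure.
        try sink.emit(event) catch { case NonFatal(_) => () }
        false
    }
  }
}
```

With a sink backed by the Kubernetes API, a failure such as an exceeded quota would then show up in the driver pod's events as well as in the driver logs.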


