[ https://issues.apache.org/jira/browse/SPARK-40379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-40379:
------------------------------------

    Assignee: Holden Karau  (was: Apache Spark)

> Propagate decommission executor loss reason during onDisconnect in K8s
> ----------------------------------------------------------------------
>
>                 Key: SPARK-40379
>                 URL: https://issues.apache.org/jira/browse/SPARK-40379
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Holden Karau
>            Assignee: Holden Karau
>            Priority: Minor
>
> Currently, if an executor has been sent a decommission message and then
> disconnects from the scheduler, we only disable the executor and depend on
> the K8s status events to drive the rest of the state transitions. However,
> the K8s status events can become overwhelmed on large clusters, so we should
> check whether an executor is in a decommissioning state when it disconnects
> and use that reason instead of waiting on the K8s status events. This gives
> us more accurate logging information.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
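
The behavior proposed in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's implementation: `ToySchedulerBackend`, `ExecutorLossReason`, `decommission_executor`, and `on_disconnect` are hypothetical names made up for illustration, and the real scheduler backend is written in Scala with different types. The idea shown is that a disconnect handler first checks whether the executor was already told to decommission and, if so, records that reason immediately rather than only disabling the executor and waiting for a later cluster-manager status event.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class ExecutorLossReason:
    """Hypothetical stand-in for a scheduler's loss-reason record."""
    kind: str
    message: str


@dataclass
class ToySchedulerBackend:
    """Toy model of a scheduler backend (assumption: not Spark's real classes)."""
    # executor id -> message recorded when the decommission request was sent
    decommissioning: Dict[str, str] = field(default_factory=dict)
    # executor id -> loss reason decided at disconnect time
    removed: Dict[str, ExecutorLossReason] = field(default_factory=dict)
    # executors disabled while waiting for an external status event
    disabled: Set[str] = field(default_factory=set)

    def decommission_executor(self, executor_id: str, message: str) -> None:
        """Remember that this executor has been asked to decommission."""
        self.decommissioning[executor_id] = message

    def on_disconnect(self, executor_id: str) -> Optional[ExecutorLossReason]:
        """Handle a lost connection to an executor.

        If the executor was decommissioning, use that as the loss reason
        right away. Otherwise only disable it and return None, leaving the
        final reason to a later status event.
        """
        message = self.decommissioning.pop(executor_id, None)
        if message is not None:
            reason = ExecutorLossReason("decommissioned", message)
            self.removed[executor_id] = reason
            return reason
        self.disabled.add(executor_id)
        return None
```

In this sketch, an executor that disconnects after `decommission_executor` was called ends up in `removed` with a "decommissioned" reason, while any other disconnecting executor is only added to `disabled`.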