[ https://issues.apache.org/jira/browse/SPARK-30821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Hogeland updated SPARK-30821:
-----------------------------------
    Summary: Executor pods with multiple containers will not be rescheduled unless all containers fail  (was: Sidecar containers in executor/driver may fail silently)

> Executor pods with multiple containers will not be rescheduled unless all containers fail
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-30821
>                 URL: https://issues.apache.org/jira/browse/SPARK-30821
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.0
>            Reporter: Kevin Hogeland
>            Priority: Major
>
> Since the restart policy of launched pods is Never, additional handling is
> required for pods that may have sidecar containers that need to restart on
> failure. Kubernetes sidecar support in 1.18/1.19 does _not_ address this
> situation (unlike
> [SPARK-28887|https://issues.apache.org/jira/projects/SPARK/issues/SPARK-28887]),
> as sidecar containers are excluded from pod phase calculation.
> The pod snapshot should be considered "PodFailed" if the restart policy is
> Never and any container has a non-zero exit code.
> (This is arguably a duplicate of SPARK-28887, but that issue is specifically
> for when the executor process fails)
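
The rule proposed in the issue description (treat the pod as failed when the restart policy is Never and any container has exited with a non-zero code) could be sketched roughly as follows. This is only an illustration of the check, not Spark's actual implementation: `ContainerStatus`, `PodSnapshot`, and `is_pod_failed` are hypothetical names, and the fields are simplified stand-ins for what the Kubernetes pod status reports.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerStatus:
    # Hypothetical, simplified view of one container in a pod.
    name: str
    exit_code: Optional[int]  # None while the container has not terminated


@dataclass
class PodSnapshot:
    # Hypothetical, simplified view of a pod; not Spark's snapshot class.
    restart_policy: str  # e.g. "Never", "OnFailure", "Always"
    phase: str           # pod phase as reported by Kubernetes, e.g. "Running"
    containers: List[ContainerStatus]


def is_pod_failed(pod: PodSnapshot) -> bool:
    """Return True if the pod should be treated as failed under the proposed rule."""
    # A pod that Kubernetes already reports as Failed stays failed.
    if pod.phase == "Failed":
        return True
    # With any other restart policy, Kubernetes may restart the container itself.
    if pod.restart_policy != "Never":
        return False
    # Proposed rule: any terminated container with a non-zero exit code fails the pod.
    return any(
        c.exit_code is not None and c.exit_code != 0 for c in pod.containers
    )
```

Under this sketch, a pod whose executor container is still running but whose sidecar exited with code 1 would be reported as failed, which is the case the issue says is currently missed.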