[ https://issues.apache.org/jira/browse/SPARK-30821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Hogeland updated SPARK-30821:
-----------------------------------
    Description:
Since the restart policy of launched pods is Never, additional handling is required for pods that may have sidecar containers that need to restart on failure. Kubernetes sidecar support in 1.18/1.19 does _not_ address this situation (unlike [SPARK-28887|https://issues.apache.org/jira/projects/SPARK/issues/SPARK-28887]), as sidecar containers are excluded from pod phase calculation.

The pod snapshot should be considered "PodFailed" if the restart policy is Never and any container has a non-zero exit code.

(This is arguably a duplicate of SPARK-28887, but that issue is specifically for when the executor process fails)

  was:
Since the restart policy of launched pods is Never, additional handling is required for pods that may have sidecar containers that need to restart on failure. Kubernetes sidecar support in 1.18/1.19 does _not_ address this situation (unlike [SPARK-28887|https://issues.apache.org/jira/projects/SPARK/issues/SPARK-28887]), as sidecar containers are excluded from pod phase calculation.

The pod snapshot should be considered "PodFailed" if the restart policy is Never and any container has a non-zero exit code.

> Sidecar containers in executor/driver may fail silently
> -------------------------------------------------------
>
>                 Key: SPARK-30821
>                 URL: https://issues.apache.org/jira/browse/SPARK-30821
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.0
>            Reporter: Kevin Hogeland
>            Priority: Major
>
> Since the restart policy of launched pods is Never, additional handling is
> required for pods that may have sidecar containers that need to restart on
> failure. Kubernetes sidecar support in 1.18/1.19 does _not_ address this
> situation (unlike
> [SPARK-28887|https://issues.apache.org/jira/projects/SPARK/issues/SPARK-28887]),
> as sidecar containers are excluded from pod phase calculation.
>
> The pod snapshot should be considered "PodFailed" if the restart policy is
> Never and any container has a non-zero exit code.
>
> (This is arguably a duplicate of SPARK-28887, but that issue is specifically
> for when the executor process fails)
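
For illustration only, the rule proposed in the description (treat the pod as failed when the restart policy is Never and any container has a non-zero exit code) might be sketched as below. This is not Spark's implementation. The names `ContainerStatus`, `PodSnapshot`, and `should_treat_as_failed` are hypothetical stand-ins invented for this sketch, and the container names are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerStatus:
    """Hypothetical stand-in for a container's observed status."""
    name: str
    exit_code: Optional[int] = None  # None while the container has not exited


@dataclass
class PodSnapshot:
    """Hypothetical stand-in for the pod state observed at one point in time."""
    restart_policy: str
    containers: List[ContainerStatus]


def should_treat_as_failed(pod: PodSnapshot) -> bool:
    """Apply the proposed rule: restart policy Never plus any non-zero exit."""
    any_nonzero_exit = any(
        c.exit_code is not None and c.exit_code != 0 for c in pod.containers
    )
    return pod.restart_policy == "Never" and any_nonzero_exit


# The main container is still running, but the sidecar has exited with code 1.
pod = PodSnapshot(
    restart_policy="Never",
    containers=[
        ContainerStatus("spark-executor"),
        ContainerStatus("sidecar", exit_code=1),
    ],
)
print(should_treat_as_failed(pod))  # True
```

The check looks at every container rather than only the main one, which is the point of the proposal: a crashed sidecar is surfaced even while the executor or driver container keeps running.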