johnhoran commented on code in PR #61778:
URL: https://github.com/apache/airflow/pull/61778#discussion_r2815981332
##########
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/triggers/pod.py:
##########
@@ -183,7 +184,7 @@ async def run(self) -> AsyncIterator[TriggerEvent]:
event = await self._wait_for_container_completion()
yield event
return
- except PodLaunchTimeoutException as e:
+ except (PodLaunchTimeoutException, PodLaunchFailedException) as e:
Review Comment:
My point is that there will always be a relatively large time gap between the
triggerer reporting the pod as timed out and the operator resuming the task to
delete the pod. The question is what should happen if the pod reaches a running
state in that gap. My view is that the operator should hand the task back to the
triggerer rather than accept the timeout and delete the pod.
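As a rough sketch of the decision I have in mind when the operator resumes with
a timeout event: re-read the pod's current phase first, and only delete the pod
if it still has not started. Everything here (`PodPhase`, `Action`,
`decide_after_timeout_event`) is a hypothetical name for illustration, not the
provider's actual API, and the handling of already-finished pods is my own
assumption.

```python
from enum import Enum


class PodPhase(str, Enum):
    """Subset of Kubernetes pod phases relevant to this sketch."""

    PENDING = "Pending"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


class Action(str, Enum):
    """Hypothetical outcomes for the resumed operator."""

    DEFER_AGAIN = "defer_again"  # hand the task back to the triggerer
    FINALIZE = "finalize"  # pod already finished; handle its result normally
    DELETE_POD = "delete_pod"  # accept the timeout and clean up


def decide_after_timeout_event(current_phase: PodPhase) -> Action:
    """Pick an action from the pod phase observed *now*, not at timeout time.

    Hypothetical helper: the timeout event only describes the pod as the
    triggerer last saw it, so the phase is checked again before acting on it.
    """
    if current_phase is PodPhase.RUNNING:
        # The pod started in the gap between the timeout and the resume.
        return Action.DEFER_AGAIN
    if current_phase in (PodPhase.SUCCEEDED, PodPhase.FAILED):
        # Assumption: a pod that already finished goes through normal cleanup.
        return Action.FINALIZE
    # Still pending (or otherwise not started): the timeout stands.
    return Action.DELETE_POD
```

The point of keeping this as a pure function of the freshly observed phase is
that the stale timeout event never decides on its own whether the pod is deleted.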
I'm happy to work on another PR to try to pick up a transient
`ErrImagePull`.