johnhoran commented on code in PR #61778:
URL: https://github.com/apache/airflow/pull/61778#discussion_r2815981332


##########
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/triggers/pod.py:
##########
@@ -183,7 +184,7 @@ async def run(self) -> AsyncIterator[TriggerEvent]:
                 event = await self._wait_for_container_completion()
             yield event
             return
-        except PodLaunchTimeoutException as e:
+        except (PodLaunchTimeoutException, PodLaunchFailedException) as e:

Review Comment:
   The point I'd make is that there will always be a relatively large time 
gap between the pod reaching the timeout state in the triggerer and the 
operator picking the task back up to delete the pod. The question is what 
should happen if the pod reaches a running state in that gap. My view is that 
the operator should hand the pod back to the triggerer rather than accept the 
timeout and delete the pod.  
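   
   As a rough sketch of that hand-back (not code from this PR): this assumes 
the operator re-reads the pod phase when it resumes after a timeout event. 
`ResumeAction` and `action_after_launch_timeout` are hypothetical names used 
only for illustration; the phase strings are the standard Kubernetes pod 
phases.

```python
from enum import Enum


class ResumeAction(Enum):
    """Hypothetical outcomes for the operator when it resumes after a timeout event."""

    DEFER_TO_TRIGGERER = "defer_to_triggerer"
    DELETE_POD = "delete_pod"


def action_after_launch_timeout(current_phase: str) -> ResumeAction:
    """Decide what to do with a pod whose launch timed out in the triggerer.

    `current_phase` is the pod phase as re-read by the operator at resume
    time (assumption: the operator refreshes it before acting).
    """
    # The pod started in the gap between the trigger firing and the operator
    # resuming, so keep it and go back to waiting instead of deleting it.
    if current_phase == "Running":
        return ResumeAction.DEFER_TO_TRIGGERER
    # Still not running: accept the timeout and clean up.
    return ResumeAction.DELETE_POD
```

   With this shape, only a pod that is actually running at resume time is 
kept; anything else still follows the existing timeout path.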
   
   I'm happy to work on a separate PR that tries to pick up a transient 
`ErrImagePull`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
