dirrao opened a new pull request, #37670:
URL: https://github.com/apache/airflow/pull/37670

   What happened
   When a worker pod's init or base containers are stuck in a pending state
   for a fatal container state reason, the task eventually fails and the pod
   is deleted. Currently this happens only after worker_pods_pending_timeout
   elapses, even though the worker pod will never recover.
   
   What do you think should happen instead
   A worker pod whose init or base containers are pending for a fatal
   container state reason does not recover, so there is no point in waiting
   for worker_pods_pending_timeout. Instead, mark the task as failed and
   delete the worker pod right away.
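
   The proposed check can be illustrated with a minimal sketch. It is not the
   code from this pull request: the ContainerStatus class, the
   find_fatal_reason helper, and the contents of FATAL_WAITING_REASONS are
   hypothetical names and values chosen for illustration. It only assumes
   that each init/base container exposes the reason it is waiting, if any.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Illustrative set of waiting reasons treated as unrecoverable.
# This is an assumption for the sketch, not a list taken from the PR.
FATAL_WAITING_REASONS = frozenset({"CreateContainerConfigError", "InvalidImageName"})


@dataclass
class ContainerStatus:
    """Hypothetical, simplified view of one container's status."""

    name: str
    waiting_reason: Optional[str] = None  # None when the container is not waiting


def find_fatal_reason(
    phase: str,
    init_statuses: Iterable[ContainerStatus],
    base_statuses: Iterable[ContainerStatus],
    fatal_reasons: frozenset = FATAL_WAITING_REASONS,
) -> Optional[str]:
    """Return "<container>: <reason>" for the first fatal waiting reason, else None.

    Only pods in the Pending phase are inspected; init containers are
    checked before base containers.
    """
    if phase != "Pending":
        return None
    for status in (*init_statuses, *base_statuses):
        if status.waiting_reason in fatal_reasons:
            return f"{status.name}: {status.waiting_reason}"
    return None
```

   A caller could fail the task and delete the pod as soon as
   find_fatal_reason returns a value, and fall back to the existing
   worker_pods_pending_timeout handling when it returns None.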


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
