jscheffl commented on issue #57210: URL: https://github.com/apache/airflow/issues/57210#issuecomment-3939162864
Oh, I had not seen this bug before. We actually face the same issue, in two ways:

1) We have limited pools, and when a task is deferred its slot is handed back to the pool. There might then not be enough free pool slots, so the deferred execution does not become active again.
2) When the task returns to the worker, the pool limit keeps the Dag from being scheduled and the "finalization" is delayed. In the case of KPO this means the resources on the node are still allocated, and if the node is evicted the XCom data might be lost because it is read only by the worker.

I would also _like_ it very much if (1) an increased/elevated priority were used (the task was already running, so in my view it does not make any sense to delay re-execution), and (2) the pool slots were _not_ returned between deferred and running on a worker, so that a pool limitation does not "interrupt" execution and the task continues until it has completed or failed.

FYI @AutomationDev85
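To make the pool-slot point concrete, here is a minimal toy model in plain Python. It is only a sketch under stated assumptions: `ToyPool`, `keep_slot_while_deferred`, and `simulate` are hypothetical names made up for illustration, and none of this is Airflow's scheduler or pool implementation. It just counts slots for a pool of size 1 where task "a" defers and task "b" is waiting.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Toy slot counter (hypothetical; not Airflow's Pool model)."""

    slots: int
    # Assumption for illustration: whether a deferred task keeps its slot.
    keep_slot_while_deferred: bool = False
    running: set = field(default_factory=set)
    deferred: set = field(default_factory=set)

    def occupied(self) -> int:
        # Deferred tasks only count if they keep their slot.
        held = len(self.deferred) if self.keep_slot_while_deferred else 0
        return len(self.running) + held

    def start(self, task: str) -> bool:
        # A new task needs a free slot.
        if self.occupied() >= self.slots:
            return False
        self.running.add(task)
        return True

    def defer(self, task: str) -> None:
        # Move a running task to the deferred set.
        self.running.remove(task)
        self.deferred.add(task)

    def resume(self, task: str) -> bool:
        # A resuming task needs a free slot unless it kept one while deferred.
        if not self.keep_slot_while_deferred and self.occupied() >= self.slots:
            return False
        self.deferred.remove(task)
        self.running.add(task)
        return True


def simulate(keep_slot_while_deferred: bool) -> tuple[bool, bool]:
    """Return (b_started, a_resumed) for a one-slot pool."""
    pool = ToyPool(slots=1, keep_slot_while_deferred=keep_slot_while_deferred)
    pool.start("a")
    pool.defer("a")
    b_started = pool.start("b")   # "b" tries to take the slot while "a" is deferred
    a_resumed = pool.resume("a")  # "a" wants to finish its work on a worker
    return b_started, a_resumed
```

In this toy model, `simulate(False)` returns `(True, False)`: "b" takes the freed slot and "a" cannot resume. `simulate(True)` returns `(False, True)`: "b" waits and "a" continues until done, which is the behaviour asked for in (2).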
