potiuk commented on issue #35474:
URL: https://github.com/apache/airflow/issues/35474#issuecomment-1801908221

   Yep, @Taragolis. That would be my idea. It comes from the assumption that 
in order to REALLY be able to handle all timeouts you need to do it from a 
separate process - because, as you rightly explained, trying to handle 
things "in-process" is not always applicable. My idea is to add an extra 
layer of "what to do if the actual task process is not responding" - and I 
think utilising the parent process (which is already there) to apply such a 
hard timeout is simplest - without modifying states and without adding yet 
another layer of monitoring processes or overloading the scheduler.
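
   To make the idea concrete, here is a minimal sketch of a parent process 
enforcing a hard timeout on a child. It is not Airflow code: the function 
name `run_with_hard_timeout` and the `grace_s` parameter are made up for 
illustration, and it only uses the standard-library `subprocess` module.

```python
import subprocess
import sys


def run_with_hard_timeout(cmd, timeout_s, grace_s=1.0):
    """Run cmd in a child process and return (exit_code, timed_out).

    Hypothetical helper for illustration only, not an Airflow API.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Normal path: the child finishes on its own within the limit.
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        # Ask politely first (SIGTERM on POSIX).
        proc.terminate()
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # The child ignored the request: force it (SIGKILL on POSIX).
            proc.kill()
            proc.wait()
        return proc.returncode, True


if __name__ == "__main__":
    # A stand-in for a task that never returns on its own.
    hung_task = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_hard_timeout(hung_task, timeout_s=0.5))
```

   The escalation to `kill()` is the important part: a Python-level signal 
handler only runs once the interpreter regains control, so a child stuck 
inside C code may never react to it, while SIGKILL cannot be caught or 
ignored by the child.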
   
   I think that, other than the occasional "whole machine stops working" 
case, this would handle most cases where the task is not timing out but still 
continues to do stuff because of a badly written low-level C implementation 
of the library that is used.
   
   And the "whole machine hangs" case should be handled at the deployment 
level anyway (for example, K8S should kill it). Also, in this case we will 
stop receiving heartbeats, and ultimately the Scheduler should handle it, 
even today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
