potiuk commented on issue #53337:
URL: https://github.com/apache/airflow/issues/53337#issuecomment-3070686001

   Another option is to make a second fork:
   
   ```
   supervisor -> timeout fork -> task fork
   ```
   
   And have the timeout fork only handle the timeout via the SIGALRM signal. The timeout fork could also do the DAG parsing and decide what to run before forking the task fork. This way it can retrieve the timeout from the DAG definition, and (because it forks afterwards) the resulting task fork will already have the task parsed in memory.
   
   While forks are not as "good" for the Python interpreter memory-wise as for regular C code (because of reference counting, almost all memory is very quickly copied-on-write), this is much less of an issue when the parent fork merely waits for the forked process and reacts to the SIGALRM signal - it runs hardly any Python code, so it touches very few objects.
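
   As an aside, CPython ships `gc.freeze()` (3.7+) for exactly this fork-heavy pattern: it moves all tracked objects into a permanent generation so the cyclic collector stops writing to their headers, which keeps more pages shared after `fork()` (reference-count writes still dirty pages, but GC traversal no longer does). A hedged sketch of the recommended usage:
   
   ```python
   import gc
   import os
   
   gc.disable()   # avoid a collection between freeze() and fork()
   gc.freeze()    # move all tracked objects to the permanent generation
   
   pid = os.fork()
   if pid == 0:
       gc.enable()        # the child resumes normal collection
       # ... run the task here ...
       os._exit(0)
   
   os.waitpid(pid, 0)     # the parent just waits, touching few Python objects
   ```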
   
   It would be worth checking whether this approach is actually good for memory usage - but it has the "isolation" property that is rather nice. We can have a single SIGALRM handler in the "timeout" fork rather than handling the more complex case of the supervisor additionally waiting for the timeout. But it might also be overkill - I thought I would just mention it, as we discussed that option as a potential implementation in 2.0.

