Bisk1 commented on issue #34816:
URL: https://github.com/apache/airflow/issues/34816#issuecomment-1762156063

   After debugging I think I understand the problem fully. Here is what happens 
when we run triggerer in daemon mode on current main.
   
   1. Triggerer's main thread creates 
[triggerer_job_runner](https://github.com/apache/airflow/blob/e9987d50598f70d84cbb2a5d964e21020e81c080/airflow/cli/commands/triggerer_command.py#L61C5-L61C25),
 which internally creates TriggerRunner, an async thread, but doesn't start it 
yet.
   2. Triggerer's main thread enters daemon context 
https://github.com/apache/airflow/blob/e9987d50598f70d84cbb2a5d964e21020e81c080/airflow/cli/commands/triggerer_command.py#L72
 which internally forks itself 
https://pagure.io/python-daemon/blob/main/f/daemon/daemon.py#_683
   3. After the fork, only the calling thread stays alive in the child process, 
as POSIX requires. Python reflects this by marking all other threads as stopped 
https://github.com/python/cpython/blob/b2ab210aaefb3b0e39f28e7946b7a531d7b2ab17/Lib/threading.py#L1690
   4. This affects the TriggerRunner thread, which is already marked `stopped` 
by the time it reaches the `is_alive()` check in 
https://github.com/apache/airflow/commit/d6cc9e4bb1efe9713eccd8e62e46f11bad294a36#diff-9cea7921268261a177e82c16fd5111f8d3252e3ca0267bdfb397c379c5d70857R353-R355
, so `is_alive()` returns false.
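
   The sequence in steps 1 to 4 can be reproduced outside Airflow with a small 
script. This is only a sketch: it calls `os.fork()` directly instead of going 
through python-daemon, and `check_in_forked_child` is a made-up helper name, 
not anything from Airflow. It is POSIX-only. What the pre-fork case prints 
depends on the CPython version, since it relies on the `threading` internals 
linked in step 3.

```python
import os
import threading


def check_in_forked_child(create_before_fork):
    """Fork, start a thread in the child, and report the child's is_alive()."""
    stop = threading.Event()  # never set: keeps the thread's target blocked

    def make_thread():
        return threading.Thread(target=stop.wait, daemon=True)

    # Mirrors step 1: the thread object exists before the fork but is not started.
    thread = make_thread() if create_before_fork else None

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process (what the daemonized triggerer would be).
        try:
            os.close(read_fd)
            if thread is None:
                # Proposed ordering: create the thread only after the fork.
                thread = make_thread()
            thread.start()
            os.write(write_fd, b"1" if thread.is_alive() else b"0")
        finally:
            os._exit(0)  # leave without running the parent's cleanup code

    # Parent process: collect the child's answer.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"


if __name__ == "__main__":
    print("created before fork:", check_in_forked_child(True))
    print("created after fork: ", check_in_forked_child(False))
```

   A thread created after the fork should report itself alive. For a thread 
created before the fork, the behavior described above corresponds to `False`.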
   
   Before the change https://github.com/apache/airflow/pull/32092 it wasn't a 
problem because apparently a thread in stopped state can still be started and 
runs fine.
   
   My proposal to fix it: 
   1) Switch the order of operations when running the triggerer command in 
daemon mode so that the async thread is created after entering the daemon 
context, e.g. move the thread initialization 
https://github.com/apache/airflow/blob/e9987d50598f70d84cbb2a5d964e21020e81c080/airflow/cli/commands/triggerer_command.py#L61C5-L61C25
 from line 61 to lines 80 and 86 (this worked when I tested it locally).
   2) Refactor all the commands that have daemon mode (scheduler, webserver, 
etc.) to reuse the same pattern.
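
   A rough sketch of the shared pattern from 2), under the assumption that each 
command passes a callback which builds and runs its job only once the daemon 
context is active. The names `run_with_optional_daemon` and 
`triggerer_like_callback` are hypothetical and do not exist in Airflow. 
`contextlib.nullcontext()` stands in for the non-daemon case, and in daemon 
mode the caller would pass something like a `daemon.DaemonContext` instance.

```python
import contextlib
import threading


def run_with_optional_daemon(callback, daemon_context=None):
    """Enter the (optional) daemon context first, then run the callback.

    Anything that creates threads belongs inside `callback`, so the thread
    objects are only constructed after the daemon context has forked.
    """
    ctx = daemon_context if daemon_context is not None else contextlib.nullcontext()
    with ctx:
        return callback()


def triggerer_like_callback():
    """Stand-in for 'create the job runner and run it' in the triggerer."""
    stop = threading.Event()
    runner = threading.Thread(target=stop.wait, daemon=True)
    runner.start()
    alive = runner.is_alive()  # the kind of check that failed in daemon mode
    stop.set()                 # let the thread finish
    runner.join()
    return alive


if __name__ == "__main__":
    # Non-daemon mode: no context is passed, so the callback runs directly.
    print(run_with_optional_daemon(triggerer_like_callback))
```

   Keeping the thread creation inside the callback means the helper alone 
decides the ordering, so each command cannot accidentally build its runner 
before the fork.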


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
