Hello Airflow community,

While working on resolving a memory leak issue in the LocalExecutor [1], I observed that garbage collection (GC) in forked subprocesses was triggering copy-on-write (COW) on shared memory, which significantly increased each process's proportional set size (PSS). By using gc.freeze to move the objects that already exist at subprocess startup into GC's permanent generation, I was able to mitigate this issue effectively.
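For reference, the general pattern is the one recommended in the CPython gc module documentation; the code below is only a minimal stand-in (the list of objects represents whatever state the parent builds before forking), not the actual change from [1]:

    import gc
    import os

    gc.disable()                                   # avoid leaving freed "holes" in pages the children will share
    shared = [object() for _ in range(1_000_000)]  # stand-in for state every child inherits
    gc.freeze()                                    # move everything created so far to the permanent generation

    pid = os.fork()
    if pid == 0:                                   # child process
        gc.enable()                                # GC now only touches objects the child creates itself
        ...                                        # child workload, e.g. parsing one DAG file
        os._exit(0)
    os.waitpid(pid, 0)

Because the frozen objects are skipped by the collector, the child never writes to their GC bookkeeping and the parent's pages stay shared.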
I would like to propose applying the same approach to the DAG processor to address GC-related performance issues and improve stability in its subprocesses. Below are the expected benefits.

Preventing COW on Shared Memory
Unlike the LocalExecutor, where subprocesses are long-lived, DAG processor subprocesses are short-lived. However, with the increasing adoption of dynamic DAGs, parsing now takes longer in many deployments. GC activity during parsing can trigger COW on shared memory, leading to memory spikes, and in containerized environments these spikes can result in OOM events.

Improving GC Performance
Applying gc.freeze marks all existing objects as non-GC targets. This greatly lowers the frequency of threshold-based GC runs and makes GC much faster when it does occur. In a simple experiment, I observed GC time dropping from roughly 1 second to about 1 microsecond (with GC forced via gc.collect).

Eliminating GC-Related Issues in Child Processes
Similar to the issue in [2], GC triggered at arbitrary points in a child process can affect shared objects inherited from the parent. By ensuring that parent-owned objects are not subject to GC in children, this class of issue can be avoided entirely.

Beyond the immediate performance and stability improvements, better memory stability also enables further optimizations. For example, preloading heavy modules in the parent process can eliminate repeated imports, and the memory they consume, in each child process. This approach has been discussed previously in [3], and preloading Airflow modules is already partially implemented today. While [3] focused primarily on parsing time, the broader benefit is reduced CPU and memory usage overall.

Extending this idea beyond Airflow modules, allowing users to pre-import libraries used in their DAG files could provide significant performance gains. That said, it is also clear why this has not been broadly adopted so far: persistently importing problematic libraries defined in DAG files could introduce side effects, and unloading a module once it is loaded is difficult. In environments with frequent DAG changes, this can become a burden.

For this reason, I believe the optimal approach is to allow pre-importing only explicitly user-approved libraries. Users would define which libraries to preload via configuration. These libraries would be loaded lazily: only after a library is successfully imported in a child process would it also be imported in the parent. The pre-import mechanism I proposed recently in [4] may be helpful here.

In summary, I am proposing two items:
1. Apply gc.freeze to the DAG processor.
2. Then, allow user-aware and intentional preloading of libraries.

Thank you for taking the time to read this. If this proposal requires an AIP, I would be happy to prepare one.

[1] https://github.com/apache/airflow/pull/58365
[2] https://github.com/apache/airflow/issues/56879
[3] https://github.com/apache/airflow/pull/30495
[4] https://github.com/apache/airflow/pull/58890

Best Regards,
Jeongwoo Do
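P.S. To make the second item a bit more concrete, here is a rough sketch of what configuration-driven preloading could look like. The option name (preload_modules) and the module list are purely illustrative, not an existing Airflow setting, and the "import in a child first, then in the parent" handshake described above is left out for brevity:

    import gc
    import importlib
    import logging

    log = logging.getLogger(__name__)

    # Illustrative only: in practice the list would come from a user-facing config option.
    preload_modules = ["numpy", "pandas"]

    for name in preload_modules:
        try:
            importlib.import_module(name)      # import once in the parent ...
        except Exception:
            log.warning("Skipping preload of %s; children will import it themselves", name)

    gc.freeze()                                # ... then freeze, so forked children share the pages
    # fork DAG-parsing children here; they inherit the already-imported modules via shared memory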
