Arunodoy18 opened a new pull request, #61489: URL: https://github.com/apache/airflow/pull/61489
### SUMMARY

Fix GitDagBundle performing a full git clone into a new directory for each task execution when used with LocalExecutor, which can lead to inode changes and result in FileNotFoundError during task runtime.

This change ensures that an existing valid bundle clone is reused when safe, while preserving executor isolation, bundle lifecycle guarantees, and correctness across parallel task execution.

---

### ROOT CAUSE

The current GitDagBundle behavior triggers a fresh git clone into a new directory for each task execution. This results in inode changes between DAG parsing and task execution phases, which can cause FileNotFoundError when tasks reference paths from the previously resolved bundle location.

---

### SOLUTION

Introduce safe bundle reuse logic by:

- Detecting existing valid bundle clones
- Reusing bundle directories when repository state matches expected revision
- Preserving executor safety and preventing shared mutable state
- Maintaining compatibility with parallel LocalExecutor task execution

The implementation ensures no partial clone states are exposed and preserves cleanup lifecycle behavior.

---

### TESTING

Added tests covering:

Functional:

- Multiple tasks using the same DAG bundle under LocalExecutor
- DAG parse phase and task execution phase consistency
- No FileNotFoundError during task runtime

Regression:

- Bundle re-clone still happens when repository state changes
- Existing bundle lifecycle and cleanup behavior remains unchanged

Edge Cases:

- Partial clone failure handling
- Bundle reuse under parallel task execution

All existing tests pass, and new tests validate bundle reuse correctness.

---

### PERFORMANCE IMPACT

Reduces unnecessary full git clones per task execution and avoids redundant filesystem operations while maintaining correctness guarantees.

---

closes: #61396
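
For illustration only, the reuse logic described under SOLUTION could look roughly like the sketch below. It is not the code in this PR and does not use any Airflow API: `ensure_bundle_dir`, `COMPLETE_MARKER`, and the `populate` callback (a stand-in for the actual git clone and checkout) are hypothetical names. It assumes one directory per revision and that the staging and target directories are on the same filesystem, so the final rename is atomic on POSIX.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical marker file written only after a clone has fully finished.
COMPLETE_MARKER = ".bundle-complete"


def ensure_bundle_dir(
    base_dir: Path, revision: str, populate: Callable[[Path], None]
) -> Path:
    """Return a directory holding `revision`, reusing a finished clone if present."""
    target = base_dir / revision
    if (target / COMPLETE_MARKER).is_file():
        # A valid clone for this revision already exists: reuse it, so the
        # path and inode seen at parse time stay the same at task run time.
        return target

    base_dir.mkdir(parents=True, exist_ok=True)
    # Build in a private staging directory so no reader sees a partial clone.
    staging = Path(tempfile.mkdtemp(prefix=f".{revision}-", dir=base_dir))
    try:
        populate(staging)  # stand-in for `git clone` plus checkout of `revision`
        (staging / COMPLETE_MARKER).touch()
        try:
            # Publish the finished clone in a single rename.
            os.rename(staging, target)
        except OSError:
            # A parallel worker may have published the same revision first;
            # use its copy if it is complete, otherwise surface the error.
            if not (target / COMPLETE_MARKER).is_file():
                raise
    finally:
        # Removes the staging directory after a failure or a lost race;
        # after a successful rename it no longer exists and this is a no-op.
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

With this shape, a second call for the same revision returns the existing directory without invoking `populate`, while a new revision gets its own directory, which mirrors the "re-clone still happens when repository state changes" regression case listed above.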
