GitHub user rt849122 created a discussion: Large DAG with conditional subset execution: Is DAG Versioning + Dynamic DAG Generation the right approach?
Hi Airflow community,

We're facing a challenge with a large DAG (150+ tasks) and would appreciate your advice on best practices.

Current Setup:
• We have a complex DAG with 150+ tasks
• Each DAG run only needs to execute a subset of these tasks
• Currently we use conditional logic to skip tasks that aren't needed
• The problem: When we want to run very downstream tasks, we have to wait for all upstream tasks to be evaluated and skipped, which takes 10+ minutes due to DAG complexity

Proposed Solution:
We're considering using DAG Versioning combined with Dynamic DAG Generation:
1. Generate a new DAG version for each run
2. In each version, only include the tasks that need to run (removing unnecessary tasks entirely)
3. This would eliminate the need for skip logic and waiting

Is this an appropriate use case for DAG Versioning? Is there a better pattern for this use case?

Thanks in advance!

GitHub link: https://github.com/apache/airflow/discussions/64344
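To make step 2 of the proposed solution concrete, here is a minimal sketch of the pruning we have in mind. It is plain Python with no Airflow imports. The task names, the `upstream` mapping, and the `prune_dependencies` helper are all hypothetical and only illustrate the idea: keep the tasks a run needs and reconnect each kept task to its nearest kept ancestors so that ordering survives the removal of the tasks in between.

```python
from typing import Dict, Set


def prune_dependencies(
    upstream: Dict[str, Set[str]], keep: Set[str]
) -> Dict[str, Set[str]]:
    """Restrict an upstream-dependency map to the tasks in `keep`.

    When a removed task sat between two kept tasks, the kept tasks are
    linked directly so their relative order is preserved.
    """

    def nearest_kept(task: str, seen: Set[str]) -> Set[str]:
        # Walk upstream until a kept task is reached on each path.
        found: Set[str] = set()
        for parent in upstream.get(task, set()):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in keep:
                found.add(parent)
            else:
                found |= nearest_kept(parent, seen)
        return found

    return {task: nearest_kept(task, set()) for task in keep}


# Hypothetical five-task chain standing in for the real 150+ task DAG.
upstream = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"validate"},
    "report": {"transform"},
    "publish": {"report"},
}

# Tasks this particular run actually needs (an assumption for the example).
keep = {"extract", "report", "publish"}

pruned = prune_dependencies(upstream, keep)
for task in sorted(pruned):
    print(task, "<-", sorted(pruned[task]))
```

In a generated DAG file, a mapping like `pruned` could then drive task creation and dependency wiring (for example with `set_upstream` or `>>`). Whether regenerating the DAG this way for every run fits well with DAG Versioning is exactly the part we are unsure about.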
