GitHub user rt849122 created a discussion: Large DAG with conditional subset 
execution: Is DAG Versioning + Dynamic DAG Generation the right approach?

Hi Airflow community,

 We're facing a challenge with a large DAG (150+ tasks) and would appreciate 
your advice on best practices.

  Current Setup:

  • We have a complex DAG with 150+ tasks
  • Each DAG run only needs to execute a subset of these tasks
  • Currently we use conditional logic to skip tasks that aren't needed
  • The problem: to run tasks far downstream, we must first wait for every 
upstream task to be evaluated and skipped, which takes 10+ minutes because of 
the DAG's complexity

  Proposed Solution: We're considering using DAG Versioning combined with 
Dynamic DAG Generation:

  1. Generate a new DAG version for each run
  2. In each version, only include the tasks that need to run (removing 
unnecessary tasks entirely)
  3. This would eliminate the need for skip logic and waiting
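
  To make step 2 concrete, here is a minimal sketch of the pruning we have in 
mind, in plain Python with no Airflow imports. The task names, the FULL_DEPS 
mapping and the prune_dependencies helper are all hypothetical and only 
illustrate the idea; they are not taken from our real DAG or from any Airflow 
API. It assumes every task appears as a key in the dependency map.

```python
from typing import Dict, Set

# Hypothetical full dependency map: task_id -> direct upstream task_ids.
FULL_DEPS: Dict[str, Set[str]] = {
    "extract": set(),
    "clean": {"extract"},
    "enrich": {"clean"},
    "aggregate": {"enrich"},
    "report": {"aggregate"},
    "audit": {"clean"},
}


def prune_dependencies(
    full_deps: Dict[str, Set[str]], selected: Set[str]
) -> Dict[str, Set[str]]:
    """Return a dependency map that contains only the selected tasks.

    An unselected upstream task is bypassed: the selected task is wired to
    its nearest selected ancestors instead, so the relative ordering of the
    remaining tasks is preserved.
    """
    unknown = selected - set(full_deps)
    if unknown:
        raise ValueError(f"Unknown task ids: {sorted(unknown)}")

    def nearest_selected(task: str, seen: Set[str]) -> Set[str]:
        # Walk upstream until a selected task is reached on each path.
        result: Set[str] = set()
        for upstream in full_deps[task]:
            if upstream in seen:
                continue
            seen.add(upstream)
            if upstream in selected:
                result.add(upstream)
            else:
                result |= nearest_selected(upstream, seen)
        return result

    return {task: nearest_selected(task, set()) for task in sorted(selected)}
```

  For example, prune_dependencies(FULL_DEPS, {"extract", "report"}) leaves two 
tasks, with "report" depending directly on "extract". A DAG factory could then 
create operators only for the keys of the returned map, so nothing would need 
to be skipped at run time.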


  Is this an appropriate use case for DAG Versioning? Is there a better 
pattern for this use case?

Thanks in advance!

GitHub link: https://github.com/apache/airflow/discussions/64344
