Hello Team,

We are currently using Airflow version 1.10.10 for data ingestion.

In our DAG, we create tasks dynamically based on data volume: if the data
volume is high, the number of parallel tasks increases in the next run, and
if the volume drops, the number of parallel tasks decreases.
Since every DAG run updates the same table, we set 'wait_for_downstream'
to True to maintain data consistency and to ensure that the next run does
not start while the previous run is still in progress or has failed.
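For context, the setup looks roughly like the sketch below (a minimal illustration, not our production DAG; the dag_id, task_ids, and the get_task_count() helper are made up for this example, and in the real DAG the count comes from the measured data volume):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def get_task_count():
    # Hypothetical helper: in the real DAG this is derived from the
    # current data volume, so the number of tasks changes between runs.
    return 3


default_args = {
    "owner": "airflow",
    "start_date": datetime(2020, 1, 1),
    # Each task instance waits for its previous-run instance (and that
    # instance's direct downstream tasks) to succeed before starting.
    "wait_for_downstream": True,
}

with DAG(
    dag_id="dynamic_ingest",
    default_args=default_args,
    schedule_interval="@daily",
    max_active_runs=1,
) as dag:
    # Tasks are created dynamically, so the set of task_ids can differ
    # from one run to the next.
    for i in range(get_task_count()):
        PythonOperator(
            task_id="ingest_part_{}".format(i),
            python_callable=lambda: None,
        )
```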

In this scenario we are seeing one issue: if the previous DAG run has fewer
tasks than the current one (because of the dynamic task creation), the
current run stays in a waiting state indefinitely. The current run waits for
the newly generated tasks, which do not exist in the previous run, to reach
a completed state in that previous run. As soon as we manually mark those
tasks as successful in the previous DAG run, the current run starts.
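The behaviour can be reproduced with a simplified model of the gating check (plain Python, not Airflow's actual scheduler code; the task names, states, and the blocked_tasks() helper are illustrative only):

```python
def blocked_tasks(current_tasks, previous_run_states):
    """Return the current-run tasks that a naive
    wait-for-previous-instance check would block on.

    previous_run_states maps task_id -> state for the previous DAG
    run. A task that was only created in the current run has no
    entry at all, so it can never reach 'success' there and the
    check blocks forever.
    """
    return [
        task for task in current_tasks
        if previous_run_states.get(task) != "success"
    ]


# Previous run generated 2 tasks; the current run generated 3.
previous = {"ingest_part_0": "success", "ingest_part_1": "success"}
current = ["ingest_part_0", "ingest_part_1", "ingest_part_2"]

print(blocked_tasks(current, previous))  # → ['ingest_part_2']

# Manually marking the missing task as success unblocks the run:
previous["ingest_part_2"] = "success"
print(blocked_tasks(current, previous))  # → []
```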

Let me know if you have any workaround for this scenario.

Thanks
Sibabrata Pattanaik
--------------------------
spatt...@cisco.com
VOIP 84260416
+91 80  44260416
--------------------------