[ 
https://issues.apache.org/jira/browse/AIRFLOW-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893124#comment-15893124
 ] 

Bolke de Bruin commented on AIRFLOW-932:
----------------------------------------

The issue resides in cli.py and only occurs when a specific task is selected
(args.task_regex is set):

{code}
    if args.task_regex:
        dag = dag.sub_dag(
            task_regex=args.task_regex,
            include_upstream=not args.ignore_dependencies)
{code}

This creates a dag holding only a subset of the tasks, with the same name as
the original. Hence, verifying the integrity of a dag run against it sets every
task outside that subset to removed.

get_task_instances picks up all instances, including "removed" ones, from the
original (whole) dag, and this is not filtered in the backfill. Hence the lists
mismatch and an AirflowException is thrown.
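
A minimal sketch of the mismatch described above, as a toy model rather than
Airflow code. The functions only borrow the names verify_integrity and
get_task_instances; their signatures, the task ids and the dict-based state
store are assumptions made for illustration.

```python
# Toy model, not Airflow code: state of each task instance in one dag run,
# keyed by task_id. The task ids are hypothetical.
dag_run_states = {"extract": "success", "transform": "success", "load": "success"}


def verify_integrity(states, dag_task_ids):
    """Stand-in for the integrity check: mark every instance whose task is
    missing from the given dag as 'removed'."""
    for task_id in states:
        if task_id not in dag_task_ids:
            states[task_id] = "removed"


def get_task_instances(states):
    """Stand-in that returns every instance of the run, 'removed' included."""
    return sorted(states)


# A sub-dag with the same name but only one task, as a task regex would select.
sub_dag_task_ids = {"transform"}

verify_integrity(dag_run_states, sub_dag_task_ids)
instances = get_task_instances(dag_run_states)

# The backfill expects one instance per sub-dag task, but it sees all three.
lists_match = len(instances) == len(sub_dag_task_ids)
```

In this model lists_match ends up False, which stands in for the condition
under which the backfill raises the AirflowException.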

The quick and dirty fix is to not mark tasks as removed when run from a
backfill, and to filter the list returned by get_task_instances.
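
One way to read that quick fix, again as a toy model rather than a patch
against Airflow. The from_backfill flag and the dag_task_ids filter are
hypothetical parameters invented for this sketch, as are the task ids.

```python
# Toy model, not Airflow code. Task ids and states are hypothetical.
def verify_integrity(states, dag_task_ids, from_backfill=False):
    """Mark instances missing from the dag as 'removed', unless the
    (hypothetical) from_backfill flag is set."""
    if from_backfill:
        return
    for task_id in states:
        if task_id not in dag_task_ids:
            states[task_id] = "removed"


def get_task_instances(states, dag_task_ids=None):
    """Return the run's task ids, optionally restricted to tasks in the dag."""
    return sorted(
        task_id
        for task_id in states
        if dag_task_ids is None or task_id in dag_task_ids
    )


states = {"extract": "success", "transform": "success", "load": "success"}
sub_dag_task_ids = {"transform"}

# Part 1 of the quick fix: do not mark anything removed during a backfill.
verify_integrity(states, sub_dag_task_ids, from_backfill=True)

# Part 2: only consider instances that belong to the sub-dag being backfilled.
instances = get_task_instances(states, sub_dag_task_ids)
```

With both parts applied, the existing instances keep their states and the
filtered list lines up with the tasks of the sub-dag.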

However, the functionality of sub_dag is awkward imho and might need a real fix.
What do you think?





> Backfills delete existing task instances and mark them as removed
> -----------------------------------------------------------------
>
>                 Key: AIRFLOW-932
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-932
>             Project: Apache Airflow
>          Issue Type: Sub-task
>          Components: backfill
>            Reporter: Dan Davydov
>            Priority: Blocker
>
> I'm still investigating.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
