[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603302#comment-16603302 ]
Trevor Edwards edited comment on AIRFLOW-2319 at 9/4/18 9:25 PM:
-----------------------------------------------------------------

+1 to this issue. There is an id column, but aside from this, it seems like only the pair (dag_id, [run_id|https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/models.py#L4384]) should be enforced as unique. The current behavior feels like a bug.

This issue becomes problematic if you have event-driven DAGs (e.g. [https://cloud.google.com/composer/docs/how-to/using/triggering-with-gcf]), where runs with different parameters may execute simultaneously, causing an execution_date collision.

Andreas, are you working on a fix for this?


> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-2319
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun
>    Affects Versions: 1.9.0
>            Reporter: Andreas Költringer
>            Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}}
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as the {{session.add()}} and {{session.commit()}} calls are right
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with a
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL,
>     dag_id VARCHAR(250),
>     execution_date DATETIME,
>     state VARCHAR(50),
>     run_id VARCHAR(250),
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date DATETIME,
>     PRIMARY KEY (id),
>     UNIQUE (dag_id, execution_date),
>     UNIQUE (dag_id, run_id),
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42]
> I looked for other migrations correcting this, but could not find any. As this
> is not reflected in the model, I guess this is a bug?
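
The collision described above can be reproduced outside Airflow. What follows is a minimal sketch, assuming only Python's standard sqlite3 module and the schema quoted in the issue. The function name and the row values (dag id, timestamp, run ids) are hypothetical and chosen for illustration; this is not Airflow's own code path.

```python
import sqlite3

# Schema copied from the `.schema dag_run` output quoted in the issue.
SCHEMA = """
CREATE TABLE dag_run (
    id INTEGER NOT NULL,
    dag_id VARCHAR(250),
    execution_date DATETIME,
    state VARCHAR(50),
    run_id VARCHAR(250),
    external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date DATETIME,
    PRIMARY KEY (id),
    UNIQUE (dag_id, execution_date),
    UNIQUE (dag_id, run_id),
    CHECK (external_trigger IN (0, 1))
);
"""

INSERT = (
    "INSERT INTO dag_run (dag_id, execution_date, state, run_id, external_trigger) "
    "VALUES (?, ?, ?, ?, ?)"
)


def insert_colliding_runs():
    """Insert two runs that differ only in run_id.

    Returns the database error message if the second insert is rejected,
    or None if both inserts succeed.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        # Hypothetical values: two externally triggered runs of the same DAG
        # that share an execution_date but have distinct run_ids.
        conn.execute(INSERT, ("example_dag", "2018-09-04 21:25:00", "running", "trigger_a", 1))
        try:
            conn.execute(INSERT, ("example_dag", "2018-09-04 21:25:00", "running", "trigger_b", 1))
        except sqlite3.IntegrityError as exc:
            return str(exc)
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    print(insert_colliding_runs())
```

The second insert is rejected even though its (dag_id, run_id) pair is new, which is the behavior the comment argues against: only the UNIQUE (dag_id, execution_date) constraint stands in the way.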