[jira] [Assigned] (AIRFLOW-2344) Fix `airflow connections -l` to work with pipe and redirect
[ https://issues.apache.org/jira/browse/AIRFLOW-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kengo Seki reassigned AIRFLOW-2344:
-----------------------------------

    Assignee: Kengo Seki

> Fix `airflow connections -l` to work with pipe and redirect
> -----------------------------------------------------------
>
>                 Key: AIRFLOW-2344
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2344
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: cli
>    Affects Versions: 1.9.0
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Minor
>
> {{airflow connections -l}} fails with pipe or redirect e.g.:
> {code}
> $ airflow connections -l > foo
> Traceback (most recent call last):
>   File "/home/sekikn/.virtualenvs/a/bin/airflow", line 6, in <module>
>     exec(compile(open(__file__).read(), __file__, 'exec'))
>   File "/home/sekikn/dev/incubator-airflow/airflow/bin/airflow", line 32, in <module>
>     args.func(args)
>   File "/home/sekikn/dev/incubator-airflow/airflow/utils/cli.py", line 77, in wrapper
>     raise e
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-141: ordinal not in range(128)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
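The traceback above is consistent with the output stream using an ASCII codec once stdout is no longer a terminal, so any non-ASCII character in the listing cannot be encoded. The following is a minimal sketch of that failure mode and of one possible workaround, namely writing through a stream with an explicit UTF-8 encoding. The helper name `write_table` and the sample table are hypothetical, and this is not necessarily the change that was made for AIRFLOW-2344.

```python
import io


def write_table(text, raw_stream, encoding="utf-8"):
    """Write text to a binary stream using an explicit encoding.

    Hypothetical helper: it encodes explicitly instead of relying on the
    stream's default codec, which may be ASCII when output is piped.
    """
    writer = io.TextIOWrapper(
        raw_stream, encoding=encoding, newline="", write_through=True
    )
    writer.write(text)
    writer.flush()
    writer.detach()  # leave raw_stream open for the caller


# Sample output containing box-drawing characters (an assumption about
# what a grid-style table listing might contain).
table = "\u2552\u2550\u2550\u2555\n\u2502ok\u2502\n"

# Simulate a pipe whose codec is ASCII: encoding fails as in the traceback.
try:
    table.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# With an explicit UTF-8 writer the same text is written without error.
buffer = io.BytesIO()
write_table(table, buffer)
```

In a real CLI the binary stream would be something like `sys.stdout.buffer` rather than an in-memory buffer; the in-memory buffer is used here only so the sketch is self-contained.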
incubator-airflow git commit: closes apache/incubator-airflow#3225 *Closed for inactivity*
Repository: incubator-airflow Updated Branches: refs/heads/master f1e65c489 -> e6145784e closes apache/incubator-airflow#3225 *Closed for inactivity* Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/e6145784 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/e6145784 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/e6145784 Branch: refs/heads/master Commit: e6145784e64c6160e302417fed4474fd580e9a8b Parents: f1e65c4 Author: r39132 Authored: Thu Apr 19 18:37:34 2018 -0700 Committer: r39132 Committed: Thu Apr 19 18:37:34 2018 -0700 -- --
[jira] [Assigned] (AIRFLOW-2300) Add S3 Select functionarity to S3ToHiveTransfer
[ https://issues.apache.org/jira/browse/AIRFLOW-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kengo Seki reassigned AIRFLOW-2300: --- Assignee: Kengo Seki > Add S3 Select functionarity to S3ToHiveTransfer > --- > > Key: AIRFLOW-2300 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2300 > Project: Apache Airflow > Issue Type: Improvement > Components: aws, operators >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Major > > For the same reason as AIRFLOW-2299, S3ToHiveTransfer should leverage S3 > Select. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand closed AIRFLOW-2347. Resolution: Fixed > Add Banco de Formaturas new officially using Airflow > - > > Key: AIRFLOW-2347 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2347 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16445090#comment-16445090 ] ASF subversion and git services commented on AIRFLOW-2347: -- Commit f1e65c4897535aa9b97f2ce1ae628eddc6a4a6e5 in incubator-airflow's branch refs/heads/master from Sid Anand [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=f1e65c4 ] [AIRFLOW-2347] Add Banco de Formaturas to Readme Make sure you have checked _all_ steps below. ### JIRA - [x] My PR addresses the following [Airflow JIRA] (https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-2347 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: Add a company to the README ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: N/A -- documentation update only ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git- commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. 
### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` Closes #3242 from r39132/Add_banco_Formaturas_to_readme > Add Banco de Formaturas new officially using Airflow > - > > Key: AIRFLOW-2347 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2347 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2347] Add Banco de Formaturas to Readme
Repository: incubator-airflow Updated Branches: refs/heads/master c208a5668 -> f1e65c489 [AIRFLOW-2347] Add Banco de Formaturas to Readme Make sure you have checked _all_ steps below. ### JIRA - [x] My PR addresses the following [Airflow JIRA] (https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-2347 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: Add a company to the README ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: N/A -- documentation update only ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git- commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. 
### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` Closes #3242 from r39132/Add_banco_Formaturas_to_readme Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/f1e65c48 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/f1e65c48 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/f1e65c48 Branch: refs/heads/master Commit: f1e65c4897535aa9b97f2ce1ae628eddc6a4a6e5 Parents: c208a56 Author: Sid Anand Authored: Thu Apr 19 18:33:33 2018 -0700 Committer: r39132 Committed: Thu Apr 19 18:33:33 2018 -0700 -- README.md | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/f1e65c48/README.md -- diff --git a/README.md b/README.md index 943919b..5e5193c 100644 --- a/README.md +++ b/README.md @@ -93,6 +93,7 @@ Currently **officially** using Airflow: 1. [Auth0](https://auth0.com) [[@sicarul](https://github.com/sicarul)] 1. [Away](https://awaytravel.com) [[@trunsky](https://github.com/trunsky)] 1. [BalanceHero](http://truebalance.io/) [[@swalloow](https://github.com/swalloow)] +1. [Banco de Formaturas](https://www.bancodeformaturas.com.br) [[@guiligan](https://github.com/guiligan)] 1. [Azri Solutions](http://www.azrisolutions.com/) [[@userimack](https://github.com/userimack)] 1. [BandwidthX](http://www.bandwidthx.com) [[@dineshdsharma](https://github.com/dineshdsharma)] 1. [Bellhops](https://github.com/bellhops)
[jira] [Updated] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand updated AIRFLOW-2347: - External issue URL: https://github.com/apache/incubator-airflow/pull/3242 > Add Banco de Formaturas new officially using Airflow > - > > Key: AIRFLOW-2347 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2347 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow
Siddharth Anand created AIRFLOW-2347: Summary: Add Banco de Formaturas new officially using Airflow Key: AIRFLOW-2347 URL: https://issues.apache.org/jira/browse/AIRFLOW-2347 Project: Apache Airflow Issue Type: Bug Reporter: Siddharth Anand Assignee: Siddharth Anand -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (AIRFLOW-2346) Add Investorise as official user of Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand closed AIRFLOW-2346. Resolution: Fixed > Add Investorise as official user of Airflow > --- > > Key: AIRFLOW-2346 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2346 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2346) Add Investorise as official user of Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16445079#comment-16445079 ] ASF subversion and git services commented on AIRFLOW-2346: -- Commit c208a5668285a4cbd5e1073535f30774e942eac1 in incubator-airflow's branch refs/heads/master from Sven Varkel [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=c208a56 ] [AIRFLOW-2346] Add Investorise as official user of Airflow Closes #3238 from svenvarkel/master > Add Investorise as official user of Airflow > --- > > Key: AIRFLOW-2346 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2346 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2346] Add Investorise as official user of Airflow
Repository: incubator-airflow Updated Branches: refs/heads/master 17d3d1d9d -> c208a5668 [AIRFLOW-2346] Add Investorise as official user of Airflow Closes #3238 from svenvarkel/master Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/c208a566 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/c208a566 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/c208a566 Branch: refs/heads/master Commit: c208a5668285a4cbd5e1073535f30774e942eac1 Parents: 17d3d1d Author: Sven Varkel Authored: Thu Apr 19 18:20:31 2018 -0700 Committer: r39132 Committed: Thu Apr 19 18:20:37 2018 -0700 -- README.md | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/c208a566/README.md -- diff --git a/README.md b/README.md index 6911e32..943919b 100644 --- a/README.md +++ b/README.md @@ -155,6 +155,7 @@ Currently **officially** using Airflow: 1. [imgix](https://www.imgix.com/) [[@dclubb](https://github.com/dclubb)] 1. [ING](http://www.ing.com/) 1. [Intercom](http://www.intercom.com/) [[@fox](https://github.com/fox) & [@paulvic](https://github.com/paulvic)] +1. [Investorise](https://investorise.com/) [[@svenvarkel](https://github.com/svenvarkel)] 1. [Jampp](https://github.com/jampp) 1. [JobTeaser](https://www.jobteaser.com) [[@stefani75](https://github.com/stefani75) & [@knil-sama](https://github.com/knil-sama)] 1. [Kalibrr](https://www.kalibrr.com/) [[@charlesverdad](https://github.com/charlesverdad)]
[jira] [Updated] (AIRFLOW-2346) Add Investorise as official user of Airflow
[ https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand updated AIRFLOW-2346: - External issue URL: https://github.com/apache/incubator-airflow/pull/3238 > Add Investorise as official user of Airflow > --- > > Key: AIRFLOW-2346 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2346 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand >Priority: Trivial > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2346) Add Investorise as official user of Airflow
Siddharth Anand created AIRFLOW-2346: Summary: Add Investorise as official user of Airflow Key: AIRFLOW-2346 URL: https://issues.apache.org/jira/browse/AIRFLOW-2346 Project: Apache Airflow Issue Type: Bug Reporter: Siddharth Anand Assignee: Siddharth Anand -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2342) DAG in running state but tasks not running
[ https://issues.apache.org/jira/browse/AIRFLOW-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chidrup jhanjhari reassigned AIRFLOW-2342:
------------------------------------------

    Assignee:     (was: chidrup jhanjhari)

> DAG in running state but tasks not running
> ------------------------------------------
>
>                 Key: AIRFLOW-2342
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2342
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun
>    Affects Versions: Airflow 1.8
>         Environment: Redhat
>            Reporter: chidrup jhanjhari
>            Priority: Major
>         Attachments: job1.py.log
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Hi, we are on Airflow 1.8.0. Our Airflow production environment has been running well for 8 months, and there has been no change to the configuration. For the past 2 days, DAGs have been showing in the running state but their tasks are not getting triggered. After the default start task, the DAG run does not move on to the next task. The attached scheduler log shows the following error:
> [2018-04-19 01:32:22,586] \{jobs.py:354} DagFileProcessor17 ERROR - Got an exception! Propagating...
> Traceback (most recent call last):
> Any help will be greatly appreciated.
[jira] [Assigned] (AIRFLOW-2342) DAG in running state but tasks not running
[ https://issues.apache.org/jira/browse/AIRFLOW-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chidrup jhanjhari reassigned AIRFLOW-2342:
------------------------------------------

    Assignee: chidrup jhanjhari
[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)
[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444778#comment-16444778 ]

Andreas Költringer commented on AIRFLOW-2319:
---------------------------------------------

* if the index was needed for performance, creating it with a unique constraint would not be necessary -> it could be an index without uniqueness.
* regarding entropy: there is an "id" column which is the primary key
* my understanding is that [the DagRun table was first created with (dag_id, execution_date) as primary key|https://github.com/apache/incubator-airflow/commit/58519878bba9cf39f9abaf9a2cb016aa1b8f683e], and was later refactored. This makes me think that the uniqueness constraint on (dag_id, execution_date) is there by accident

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-2319
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun
>    Affects Versions: 1.9.0
>            Reporter: Andreas Költringer
>            Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} (multiple rows with the same {{(dag_id, execution_date)}}) raised the following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird as the {{session.add()}} and {{session.commit()}} are right before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with a {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL,
>     dag_id VARCHAR(250),
>     execution_date DATETIME,
>     state VARCHAR(50),
>     run_id VARCHAR(250),
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date DATETIME,
>     PRIMARY KEY (id),
>     UNIQUE (dag_id, execution_date),
>     UNIQUE (dag_id, run_id),
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint, on MariaDB it's also an index)
> The {{DagRun}} class in {{models.py}} does not reflect this, however it is in [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42]
> I looked for other migrations correcting this, but could not find any. As this is not reflected in the model, I guess this is a bug?
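The effect of the `UNIQUE (dag_id, execution_date)` constraint described in this issue can be reproduced outside Airflow. The sketch below uses Python's standard `sqlite3` module with a trimmed-down copy of the schema quoted above; the DAG id, dates and run ids are made-up values. It only shows that the database rejects a second row with the same pair and makes no claim about how Airflow itself surfaces that failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down version of the dag_run schema from the issue; only the
# columns involved in the two UNIQUE constraints are kept.
conn.execute(
    """
    CREATE TABLE dag_run (
        id INTEGER NOT NULL,
        dag_id VARCHAR(250),
        execution_date DATETIME,
        run_id VARCHAR(250),
        PRIMARY KEY (id),
        UNIQUE (dag_id, execution_date),
        UNIQUE (dag_id, run_id)
    )
    """
)

insert = "INSERT INTO dag_run (dag_id, execution_date, run_id) VALUES (?, ?, ?)"

# First run for this (dag_id, execution_date) pair is accepted.
conn.execute(insert, ("example_dag", "2018-04-19 00:00:00", "manual_1"))

# Second run with the same pair but a different run_id violates the
# UNIQUE (dag_id, execution_date) constraint.
try:
    conn.execute(insert, ("example_dag", "2018-04-19 00:00:00", "manual_2"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
```

Dropping the `UNIQUE (dag_id, execution_date)` line from the table definition, or replacing it with a plain `CREATE INDEX`, would let both rows be stored while still indexing the two columns, which is the alternative raised in the comment above.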
[jira] [Comment Edited] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)
[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444711#comment-16444711 ]

John Arnold edited comment on AIRFLOW-2319 at 4/19/18 8:14 PM:
---------------------------------------------------------------

Also, if you truly need "duplicate" dag runs with the same dag_id and execution date, some additional entropy will be needed for uniqueness – eg. a uuid or table id

was (Author: johnarnold):
Also, if you truly need "duplicate" dag runs with the same dag_id and execution date, some additional entropy will be needed for uniqueness – eg. a dag run id
[jira] [Comment Edited] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)
[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444711#comment-16444711 ]

John Arnold edited comment on AIRFLOW-2319 at 4/19/18 8:14 PM:
---------------------------------------------------------------

Also, if you truly need "duplicate" dag runs with the same dag_id and execution date, some additional entropy will be needed for uniqueness – eg. a dag run id

was (Author: johnarnold):
Also, if you truly need "duplicate" dag runs with the same dag_id and execution date, some additional entropy will be needed for uniqueness – eg. a dag run uuid.
[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)
[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444711#comment-16444711 ]

John Arnold commented on AIRFLOW-2319:
--------------------------------------

Also, if you truly need "duplicate" dag runs with the same dag_id and execution date, some additional entropy will be needed for uniqueness – eg. a dag run uuid.
[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)
[ https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444653#comment-16444653 ]

John Arnold commented on AIRFLOW-2319:
--------------------------------------

IMO, the index is probably needed for performance, as those are the most common lookup fields etc. I would add the index to the model as a bugfix.
[jira] [Commented] (AIRFLOW-2345) pip import is unused in setup.py
[ https://issues.apache.org/jira/browse/AIRFLOW-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1603#comment-1603 ] Sam Garrett commented on AIRFLOW-2345: -- I have created a PR for this here: https://github.com/apache/incubator-airflow/pull/3241 > pip import is unused in setup.py > > > Key: AIRFLOW-2345 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2345 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Sam Garrett >Assignee: Sam Garrett >Priority: Minor > > pip is unnecessarily imported here in the current master branch of airflow: > [https://github.com/apache/incubator-airflow/blob/master/setup.py#L26] > > It should be removed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (AIRFLOW-2345) pip import is unused in setup.py
[ https://issues.apache.org/jira/browse/AIRFLOW-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-2345 started by Sam Garrett. > pip import is unused in setup.py > > > Key: AIRFLOW-2345 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2345 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Sam Garrett >Assignee: Sam Garrett >Priority: Minor > > pip is unnecessarily imported here in the current master branch of airflow: > [https://github.com/apache/incubator-airflow/blob/master/setup.py#L26] > > It should be removed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2345) pip import is unused in setup.py
Sam Garrett created AIRFLOW-2345: Summary: pip import is unused in setup.py Key: AIRFLOW-2345 URL: https://issues.apache.org/jira/browse/AIRFLOW-2345 Project: Apache Airflow Issue Type: Improvement Reporter: Sam Garrett Assignee: Sam Garrett pip is unnecessarily imported here in the current master branch of airflow: [https://github.com/apache/incubator-airflow/blob/master/setup.py#L26] It should be removed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-630) Airflow worker is not working with Celery 4.0.0
[ https://issues.apache.org/jira/browse/AIRFLOW-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444204#comment-16444204 ] Luke Bodeen commented on AIRFLOW-630: - This should be closed, as many people now run Celery 4 with Airflow 1.9. > Airflow worker is not working with Celery 4.0.0 > --- > > Key: AIRFLOW-630 > URL: https://issues.apache.org/jira/browse/AIRFLOW-630 > Project: Apache Airflow > Issue Type: Bug > Components: celery > Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3 > Reporter: Hafiz Badrie Lubis > Priority: Major > > As soon as the Celery version is upgraded to 4.0.0, the Airflow worker stops working > because the loglevel value is None. The error log is shown in this > image: http://imgur.com/JHedHeN. > The loglevel value assignment should be made more flexible. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1268) Celery bug can cause tasks to be delayed indefinitely
[ https://issues.apache.org/jira/browse/AIRFLOW-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444196#comment-16444196 ] Luke Bodeen commented on AIRFLOW-1268: -- That Celery issue now shows as fixed in 4.2. > Celery bug can cause tasks to be delayed indefinitely > - > > Key: AIRFLOW-1268 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1268 > Project: Apache Airflow > Issue Type: Bug > Components: celery > Environment: With celery_executor with redis > Reporter: Alex Guziel > Priority: Critical > > With Celery, tasks can be delayed indefinitely (or for the default of 1 hour) due to a > bug in Celery; see https://github.com/celery/celery/issues/3765 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2330) GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends destination_object even when not given
[ https://issues.apache.org/jira/browse/AIRFLOW-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443726#comment-16443726 ] ASF subversion and git services commented on AIRFLOW-2330: -- Commit 17d3d1d9dc87c0bbb03de049607c2ad76a4fd747 in incubator-airflow's branch refs/heads/master from [~b11c] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=17d3d1d ] [AIRFLOW-2330] Do not append destination prefix if not given Closes #3233 from berislavlopac/AIRFLOW-2330 > GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends > destination_object even when not given > - > > Key: AIRFLOW-2330 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2330 > Project: Apache Airflow > Issue Type: Bug >Reporter: Berislav Lopac >Assignee: Berislav Lopac >Priority: Major > Fix For: 2.0.0 > > > Currently, the operator builds the destination like this: > {code} > hook.copy(self.source_bucket, source_object, > self.destination_bucket, "{}/{}".format(self.destination_object, > source_object)) > {code} > If destination is {{None}} (the default) the file will land in > {{None/\{source_object\}}}, and if it's an empty string it goes to > {{/\{source_object\}}}. Basically, it should not prepend > {{destination_object}} if it's empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
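The behaviour described in this issue follows from how str.format renders None and the empty string. Below is a minimal sketch, assuming a hypothetical helper named build_destination that is not part of the operator; it only mirrors the idea stated in the commit message, namely that the prefix is prepended only when one is given.

```python
def build_destination(destination_object, source_object):
    """Hypothetical helper: prepend the prefix only when it is non-empty."""
    if destination_object:
        return "{}/{}".format(destination_object, source_object)
    return source_object


# Behaviour of the unconditional format call quoted in the issue.
old_none = "{}/{}".format(None, "data/file.csv")
old_empty = "{}/{}".format("", "data/file.csv")

# Behaviour with the conditional prefix.
new_none = build_destination(None, "data/file.csv")
new_empty = build_destination("", "data/file.csv")
new_prefixed = build_destination("backup", "data/file.csv")

print(old_none, old_empty, new_none, new_empty, new_prefixed)
```

Because both None and the empty string are falsy, a single truthiness check covers the two cases named in the report.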
[jira] [Resolved] (AIRFLOW-2330) GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends destination_object even when not given
[ https://issues.apache.org/jira/browse/AIRFLOW-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2330. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request #3233 [https://github.com/apache/incubator-airflow/pull/3233] > GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends > destination_object even when not given > - > > Key: AIRFLOW-2330 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2330 > Project: Apache Airflow > Issue Type: Bug >Reporter: Berislav Lopac >Assignee: Berislav Lopac >Priority: Major > Fix For: 2.0.0 > > > Currently, the operator builds the destination like this: > {code} > hook.copy(self.source_bucket, source_object, > self.destination_bucket, "{}/{}".format(self.destination_object, > source_object)) > {code} > If destination is {{None}} (the default) the file will land in > {{None/\{source_object\}}}, and if it's an empty string it goes to > {{/\{source_object\}}}. Basically, it should not prepend > {{destination_object}} if it's empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2330] Do not append destination prefix if not given
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e95a1251b -> 17d3d1d9d

[AIRFLOW-2330] Do not append destination prefix if not given

Closes #3233 from berislavlopac/AIRFLOW-2330

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/17d3d1d9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/17d3d1d9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/17d3d1d9

Branch: refs/heads/master
Commit: 17d3d1d9dc87c0bbb03de049607c2ad76a4fd747
Parents: e95a125
Author: Berislav Lopac
Authored: Thu Apr 19 10:26:23 2018 +0200
Committer: Fokko Driesprong
Committed: Thu Apr 19 10:26:23 2018 +0200

--
 airflow/contrib/operators/gcs_to_gcs.py | 41 --
 .../operators/test_gcs_to_gcs_operator.py | 58 +++-
 2 files changed, 81 insertions(+), 18 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/17d3d1d9/airflow/contrib/operators/gcs_to_gcs.py
--
diff --git a/airflow/contrib/operators/gcs_to_gcs.py b/airflow/contrib/operators/gcs_to_gcs.py
index dc67ddc..6acc517 100644
--- a/airflow/contrib/operators/gcs_to_gcs.py
+++ b/airflow/contrib/operators/gcs_to_gcs.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License. You may obtain a copy of the License at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -24,7 +24,7 @@ from airflow.utils.decorators import apply_defaults
 class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
     """
-    Copies an object from a bucket to another, with renaming if requested.
+    Copies objects from a bucket to another, with renaming if requested.
     :param source_bucket: The source Google cloud storage bucket where the object is.
     :type source_bucket: string
@@ -43,8 +43,7 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
         destination Google cloud storage bucket.
         If a wildcard is supplied in the source_object argument, this is the
-        folder that the files will be
-        copied to in the destination bucket.
+        prefix that will be prepended to the final destination objects' paths.
     :type destination_object: string
     :param move_object: When move object is True, the object is moved instead
         of copied to the new location.
@@ -96,24 +95,34 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
             objects = hook.list(self.source_bucket,
                                 prefix=self.source_object[:wildcard_position],
                                 delimiter=self.source_object[wildcard_position + 1:])
+
             for source_object in objects:
+                if self.destination_object:
+                    destination_object = "{}/{}".format(self.destination_object,
+                                                        source_object)
+                else:
+                    destination_object = source_object
                 self.log.info('Executing copy of gs://{0}/{1} to '
-                              'gs://{2}/{3}/{1}'.format(self.source_bucket,
-                                                        source_object,
-                                                        self.destination_bucket,
-                                                        self.destination_object,
-                                                        source_object))
+                              'gs://{2}/{3}'.format(self.source_bucket,
+                                                    source_object,
+                                                    self.destination_bucket,
+                                                    destination_object))
+
                 hook.copy(self.source_bucket, source_object,
-                          self.destination_bucket, "{}/{}".format(self.destination_object,
-                                                                  source_object))
+                          self.destination_bucket, destination_object)
                 if self.move_object:
                     hook.delete(self.source_bucket, source_object)
         else:
-            self.log.info('Executing copy: %s, %s, %s, %s', self.source_bucket,
-                          self.source_object,
-                          self.destination_bucket or self.source_bucket,
-                          self.destination_object or self.source_object)
+            self.log.info(