[jira] [Commented] (AIRFLOW-1) Migrate GitHub code to Apache git
[ https://issues.apache.org/jira/browse/AIRFLOW-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111974#comment-16111974 ] ASF subversion and git services commented on AIRFLOW-1: --- Commit 84215b36773fd6b3062dbbdec2e7c21067dedda6 in incubator-airflow's branch refs/heads/master from Preete Dixit [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=84215b3 ] [AIRFLOW-1397][AIRFLOW-1] No Last Run column data displyed in Airflow UI 1.8.1 Closes #2430 from preete-dixit-ck/master > Migrate GitHub code to Apache git > - > > Key: AIRFLOW-1 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1 > Project: Apache Airflow > Issue Type: Improvement > Components: project-management >Reporter: Maxime Beauchemin >Assignee: Maxime Beauchemin > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-1397) Airflow 1.8.1 - No data displays in Last Run Column in Airflow UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111973#comment-16111973 ] ASF subversion and git services commented on AIRFLOW-1397: -- Commit 84215b36773fd6b3062dbbdec2e7c21067dedda6 in incubator-airflow's branch refs/heads/master from Preete Dixit [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=84215b3 ] [AIRFLOW-1397][AIRFLOW-1] No Last Run column data displyed in Airflow UI 1.8.1 Closes #2430 from preete-dixit-ck/master > Airflow 1.8.1 - No data displays in Last Run Column in Airflow UI > - > > Key: AIRFLOW-1397 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1397 > Project: Apache Airflow > Issue Type: Bug > Components: DAG, ui >Affects Versions: 1.8.1 >Reporter: user_airflow >Assignee: user_airflow >Priority: Critical > > Recently upgraded Airflow version from 1.8.0 to 1.8.1. After upgrading, the > Last Run column in Airflow UI started showing as Blank for all the existing > dags. > Created a pr for this bug: > https://github.com/apache/incubator-airflow/pull/2430 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
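The commit referenced in this notification removes the `State.RUNNING` filter from `DagRun.get_latest_runs`. The sketch below is not Airflow code. It uses an in-memory SQLite table with an assumed `dag_run` layout and made-up rows to illustrate why filtering on running state can leave the Last Run column empty for DAGs that have no run in progress.

```python
import sqlite3

# Hypothetical, simplified stand-in for the dag_run table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    [
        ("etl", "2017-08-01", "success"),
        ("etl", "2017-08-02", "success"),
        ("report", "2017-08-01", "failed"),
        ("report", "2017-08-02", "running"),
    ],
)


def latest_runs(only_running):
    """Return {dag_id: latest execution_date}, optionally limited to running runs."""
    where = "WHERE state = 'running' " if only_running else ""
    rows = conn.execute(
        "SELECT dag_id, MAX(execution_date) FROM dag_run "
        + where
        + "GROUP BY dag_id ORDER BY dag_id"
    ).fetchall()
    return dict(rows)


before = latest_runs(only_running=True)   # behaviour with the state filter
after = latest_runs(only_running=False)   # behaviour once the filter is dropped
print(before)
print(after)
```

With the filter in place, the `etl` DAG has no entry at all because none of its runs is currently running, which would match the blank column reported in the issue.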
incubator-airflow git commit: [AIRFLOW-1397][AIRFLOW-1] No Last Run column data displyed in Airflow UI 1.8.1
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 18f8498a7 -> 84215b367

[AIRFLOW-1397][AIRFLOW-1] No Last Run column data displyed in Airflow UI 1.8.1

Closes #2430 from preete-dixit-ck/master

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/84215b36
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/84215b36
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/84215b36

Branch: refs/heads/master
Commit: 84215b36773fd6b3062dbbdec2e7c21067dedda6
Parents: 18f8498
Author: Preete Dixit
Authored: Wed Aug 2 17:31:57 2017 -0700
Committer: Alex Guziel
Committed: Wed Aug 2 17:31:57 2017 -0700

----------------------------------------------------------------------
 airflow/models.py                         | 3 +--
 airflow/www/api/experimental/endpoints.py | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/84215b36/airflow/models.py
----------------------------------------------------------------------
diff --git a/airflow/models.py b/airflow/models.py
index 6e423dc..959d475 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -4454,13 +4454,12 @@ class DagRun(Base):
     @classmethod
     @provide_session
     def get_latest_runs(cls, session):
-        """Returns the latest running DagRun for each DAG. """
+        """Returns the latest DagRun for each DAG. """
         subquery = (
             session
             .query(
                 cls.dag_id,
                 func.max(cls.execution_date).label('execution_date'))
-            .filter(cls.state == State.RUNNING)
             .group_by(cls.dag_id)
             .subquery()
         )

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/84215b36/airflow/www/api/experimental/endpoints.py
----------------------------------------------------------------------
diff --git a/airflow/www/api/experimental/endpoints.py b/airflow/www/api/experimental/endpoints.py
index a8d7f5c..3e0ff46 100644
--- a/airflow/www/api/experimental/endpoints.py
+++ b/airflow/www/api/experimental/endpoints.py
@@ -156,7 +156,7 @@ def task_instance_info(dag_id, execution_date, task_id):
 @api_experimental.route('/latest_runs', methods=['GET'])
 @requires_authentication
 def latest_dag_runs():
-    """Returns the latest running DagRun for each DAG formatted for the UI. """
+    """Returns the latest DagRun for each DAG formatted for the UI. """
     from airflow.models import DagRun
     dagruns = DagRun.get_latest_runs()
     payload = []
[jira] [Updated] (AIRFLOW-1483) Page size on model views is to large to render quickly
[ https://issues.apache.org/jira/browse/AIRFLOW-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edgar Rodriguez updated AIRFLOW-1483: - Description: The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is set to {{500}} rows, which is usually too large to render in less than 1-2 secs in modern browsers. Also, in some endpoints it is also taking a long time to render server-side the HTML content for 500 rows, taking around 1-2 secs (on the server) or sometimes more. Simple approach is to reduce this value to something more sensible (100 maybe?). Probably making it a configurable value would be a good option too in case the default is not good enough. See attachment for a profiled sample of a page loading time. was: The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is set to {{500}} rows, which is usually too large to render in less than 1-2 secs in modern browsers. Also, in some endpoints it is also taking a long time to render server-side the HTML content for 500 rows, taking around 1-2 secs (on the server) or sometimes more. Simple approach is to reduce this value to something more sensible (100 maybe?). Probably making it a configurable value would be a good option too in case the default is not good enough. > Page size on model views is to large to render quickly > -- > > Key: AIRFLOW-1483 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1483 > Project: Apache Airflow > Issue Type: Bug >Reporter: Edgar Rodriguez >Assignee: Edgar Rodriguez > Attachments: taskinstance_page_loading_breakdown.png > > > The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is > set to {{500}} rows, which is usually too large to render in less than 1-2 > secs in modern browsers. > Also, in some endpoints it is also taking a long time to render server-side > the HTML content for 500 rows, taking around 1-2 secs (on the server) or > sometimes more. 
> Simple approach is to reduce this value to something more sensible (100 > maybe?). Probably making it a configurable value would be a good option too > in case the default is not good enough. > See attachment for a profiled sample of a page loading time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
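The description proposes lowering the hardcoded page size and possibly making it configurable. A minimal sketch of that idea follows, assuming a hypothetical `[webserver]` section and `page_size` key (neither name comes from the issue) and using the value of 100 that the reporter floats as a fallback. It relies only on the standard-library `configparser`, not on Airflow's own configuration module.

```python
import configparser

# Fallback suggested in the issue ("100 maybe?"); not a confirmed Airflow default.
DEFAULT_PAGE_SIZE = 100


def get_page_size(cfg_text, section="webserver", key="page_size"):
    """Read the page size from INI-style text, falling back to the default.

    The section and key names are hypothetical and chosen for illustration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    size = parser.getint(section, key, fallback=DEFAULT_PAGE_SIZE)
    if size <= 0:
        raise ValueError("page_size must be a positive integer")
    return size


default_size = get_page_size("")  # no setting present, so the fallback applies
custom_size = get_page_size("[webserver]\npage_size = 50\n")
print(default_size, custom_size)
```

A model view could then read this value once at start-up instead of carrying a literal `500`, so operators with slow pages could tune it without a code change.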
[jira] [Updated] (AIRFLOW-1483) Page size on model views is to large to render quickly
[ https://issues.apache.org/jira/browse/AIRFLOW-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edgar Rodriguez updated AIRFLOW-1483: - Attachment: taskinstance_page_loading_breakdown.png Loading page /admin/taskinstance breakdown: Web app, DOM processing and Page rendering. > Page size on model views is to large to render quickly > -- > > Key: AIRFLOW-1483 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1483 > Project: Apache Airflow > Issue Type: Bug >Reporter: Edgar Rodriguez >Assignee: Edgar Rodriguez > Attachments: taskinstance_page_loading_breakdown.png > > > The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is > set to {{500}} rows, which is usually too large to render in less than 1-2 > secs in modern browsers. > Also, in some endpoints it is also taking a long time to render server-side > the HTML content for 500 rows, taking around 1-2 secs (on the server) or > sometimes more. > Simple approach is to reduce this value to something more sensible (100 > maybe?). Probably making it a configurable value would be a good option too > in case the default is not good enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-774) dagbag_size/collect_dags/dagbag_import_errors stats incorrect
[ https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111850#comment-16111850 ] Ace Haidrey commented on AIRFLOW-774: - Got it. That's why I'm not seeing the errors that you saw, since my version is 1.8.1. But I may try taking a stab at correcting these stats. We are using the [statsd_exporter|https://github.com/prometheus/statsd_exporter] project with Prometheus to monitor these instances. Another issue is that the labels for the stats in Airflow are split up with underscores rather than dots, but they need to be dot-separated, so we will try changing that as well. The statsd_exporter can be configured to translate specific dot-separated StatsD metrics into labeled Prometheus metrics via a simple mapping language. That's an aside, but it gives some background on what we're trying to accomplish! > dagbag_size/collect_dags/dagbag_import_errors stats incorrect > - > > Key: AIRFLOW-774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-774 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Dan Davydov > > After the multiprocessor change was made (dag folders are processed in > parallel), the number of dags reported by airflow is for each of these > subprocesses which is inaccurate, and potentially orders of magnitude less > than the actual number of dags. These individual processes stats should be > aggregated. The collect_dags/dagbag_import_errors stats should also be fixed > (time it takes to parse the dags). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (AIRFLOW-1483) Page size on model views is to large to render quickly
[ https://issues.apache.org/jira/browse/AIRFLOW-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edgar Rodriguez reassigned AIRFLOW-1483: Assignee: Edgar Rodriguez > Page size on model views is to large to render quickly > -- > > Key: AIRFLOW-1483 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1483 > Project: Apache Airflow > Issue Type: Bug >Reporter: Edgar Rodriguez >Assignee: Edgar Rodriguez > > The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is > set to {{500}} rows, which is usually too large to render in less than 1-2 > secs in modern browsers. > Also, in some endpoints it is also taking a long time to render server-side > the HTML content for 500 rows, taking around 1-2 secs (on the server) or > sometimes more. > Simple approach is to reduce this value to something more sensible (100 > maybe?). Probably making it a configurable value would be a good option too > in case the default is not good enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (AIRFLOW-1483) Page size on model views is to large to render quickly
Edgar Rodriguez created AIRFLOW-1483: Summary: Page size on model views is to large to render quickly Key: AIRFLOW-1483 URL: https://issues.apache.org/jira/browse/AIRFLOW-1483 Project: Apache Airflow Issue Type: Bug Reporter: Edgar Rodriguez The current hardcoded values for the {{page_size}} on {{AirflowModelView}} is set to {{500}} rows, which is usually too large to render in less than 1-2 secs in modern browsers. Also, in some endpoints it is also taking a long time to render server-side the HTML content for 500 rows, taking around 1-2 secs (on the server) or sometimes more. Simple approach is to reduce this value to something more sensible (100 maybe?). Probably making it a configurable value would be a good option too in case the default is not good enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-774) dagbag_size/collect_dags/dagbag_import_errors stats incorrect
[ https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111825#comment-16111825 ] Dan Davydov commented on AIRFLOW-774: - Yes, to master; IIRC they should appear starting from release 1.8.1. This is the JIRA: https://issues.apache.org/jira/browse/AIRFLOW-780 Yep, I only fixed the import errors, not the stats. The changes should be similar though (pull the stats logic out to the top level, which aggregates each individual parsing subprocess's errors). > dagbag_size/collect_dags/dagbag_import_errors stats incorrect > - > > Key: AIRFLOW-774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-774 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Dan Davydov > > After the multiprocessor change was made (dag folders are processed in > parallel), the number of dags reported by airflow is for each of these > subprocesses which is inaccurate, and potentially orders of magnitude less > than the actual number of dags. These individual processes stats should be > aggregated. The collect_dags/dagbag_import_errors stats should also be fixed > (time it takes to parse the dags). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
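The approach described in this comment (move the stats logic to the top level and aggregate what each parsing subprocess reports) could look roughly like the sketch below. It is only an illustration: `aggregate_stats` and the per-process dictionaries are hypothetical, and the sketch covers count-style stats only, not the `collect_dags` timing.

```python
from collections import Counter


def aggregate_stats(per_process_stats):
    """Sum count-style stats reported by each parsing subprocess.

    `per_process_stats` is a hypothetical list of dicts, one per subprocess.
    """
    total = Counter()
    for stats in per_process_stats:
        total.update(stats)  # adds the counts key by key
    return dict(total)


# Made-up results from three parsing subprocesses.
results = [
    {"dagbag_size": 3, "dagbag_import_errors": 0},
    {"dagbag_size": 5, "dagbag_import_errors": 1},
    {"dagbag_size": 2, "dagbag_import_errors": 0},
]

totals = aggregate_stats(results)
print(totals)
```

The top-level process would then emit one gauge per stat from `totals`, rather than each subprocess emitting its own partial value.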
[jira] [Assigned] (AIRFLOW-1397) Airflow 1.8.1 - No data displays in Last Run Column in Airflow UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ace Haidrey reassigned AIRFLOW-1397: Assignee: user_airflow I think once you squash your commits in that PR it can finally be merged! > Airflow 1.8.1 - No data displays in Last Run Column in Airflow UI > - > > Key: AIRFLOW-1397 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1397 > Project: Apache Airflow > Issue Type: Bug > Components: DAG, ui >Affects Versions: 1.8.1 >Reporter: user_airflow >Assignee: user_airflow >Priority: Critical > > Recently upgraded Airflow version from 1.8.0 to 1.8.1. After upgrading, the > Last Run column in Airflow UI started showing as Blank for all the existing > dags. > Created a pr for this bug: > https://github.com/apache/incubator-airflow/pull/2430 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-774) dagbag_size/collect_dags/dagbag_import_errors stats incorrect
[ https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111805#comment-16111805 ] Ace Haidrey commented on AIRFLOW-774: - Are those changes merged in then? To the master branch or anywhere? And to clarify: the import errors are fixed, but the stats have not been corrected yet, is that right? > dagbag_size/collect_dags/dagbag_import_errors stats incorrect > - > > Key: AIRFLOW-774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-774 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Dan Davydov > > After the multiprocessor change was made (dag folders are processed in > parallel), the number of dags reported by airflow is for each of these > subprocesses which is inaccurate, and potentially orders of magnitude less > than the actual number of dags. These individual processes stats should be > aggregated. The collect_dags/dagbag_import_errors stats should also be fixed > (time it takes to parse the dags). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-774) dagbag_size/collect_dags/dagbag_import_errors stats incorrect
[ https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111797#comment-16111797 ] Dan Davydov commented on AIRFLOW-774: - I fixed the import errors in a separate JIRA and it was maybe 1-2 days of work end-to-end. Maybe 2-4 days for someone not as familiar with the codebase? > dagbag_size/collect_dags/dagbag_import_errors stats incorrect > - > > Key: AIRFLOW-774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-774 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Dan Davydov > > After the multiprocessor change was made (dag folders are processed in > parallel), the number of dags reported by airflow is for each of these > subprocesses which is inaccurate, and potentially orders of magnitude less > than the actual number of dags. These individual processes stats should be > aggregated. The collect_dags/dagbag_import_errors stats should also be fixed > (time it takes to parse the dags). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-774) dagbag_size/collect_dags/dagbag_import_errors stats incorrect
[ https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111775#comment-16111775 ] Ace Haidrey commented on AIRFLOW-774: - Does anyone have an idea of how involved a fix for this would be, or how to go about it? Our team is currently running into the same issues with stats collection: counts are off, and more. > dagbag_size/collect_dags/dagbag_import_errors stats incorrect > - > > Key: AIRFLOW-774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-774 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Dan Davydov > > After the multiprocessor change was made (dag folders are processed in > parallel), the number of dags reported by airflow is for each of these > subprocesses which is inaccurate, and potentially orders of magnitude less > than the actual number of dags. These individual processes stats should be > aggregated. The collect_dags/dagbag_import_errors stats should also be fixed > (time it takes to parse the dags). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (AIRFLOW-1452) "airflow initdb" stuck forever on upgrade
[ https://issues.apache.org/jira/browse/AIRFLOW-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111697#comment-16111697 ] Allison Wang edited comment on AIRFLOW-1452 at 8/2/17 8:57 PM: --- Then there must be locks in the database when you run {{airflow initdb}}. I am not familiar with MSSQL but the SQL in posted error message post is {{UPDATE alembic_version SET version_num='cc1e65623dc7' WHERE alembic_version.version_num = '127d2bf2dfa7'}} This is the error of updating alembic_version, not any particular operation related to adding max_tries column. Please look into what exactly causes this error in MSSQL: {{[Microsoft][ODBC Driver 13 for SQL Server]TCP Provider: Error code 0x2746 (10054)}} Please make sure there is no lock before and during the migration. MSSQL is not officially supported DB. This migration script is tested against MySQL, Postgres and SQLite. We recommend using MySQL and Postgres as we can provide more support for issues with these databases. was (Author: allisonwang): Then there must be locks in the database when you run {{airflow initdb}}. I am not familiar with MSSQL but the SQL in posted error message post is {{UPDATE alembic_version SET version_num='cc1e65623dc7' WHERE alembic_version.version_num = '127d2bf2dfa7'}} This is the error of updating alembic_version, not any particular operation related to adding max_tries column. Please look into what exactly causes this error in MSSQL: {{[Microsoft][ODBC Driver 13 for SQL Server]TCP Provider: Error code 0x2746 (10054)}} Please make sure there is no lock before and during the migration. This migration script is tested against MySQL, Postgres and SQLite. We recommend using MySQL and Postgres as we can provide more support for issues with these databases. 
> "airflow initdb" stuck forever on upgrade > - > > Key: AIRFLOW-1452 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1452 > Project: Apache Airflow > Issue Type: Bug > Components: db >Reporter: Pavel Martynov > Attachments: docker-compose.yml, Dockerfile, run-initdb.sh > > > I install airflow from the current master branch > (426b6a65f6ec142449893e36fcd677941bdad879 when I write this issue) and run > "airflow initdb" against MS SQL and it stuck forever with that output: > {noformat} > [2017-07-25 07:30:12,458] {db.py:307} INFO - Creating tables > INFO [alembic.runtime.migration] Context impl MSSQLImpl. > INFO [alembic.runtime.migration] Will assume transactional DDL. > INFO [alembic.runtime.migration] Running upgrade -> e3a246e0dc1, current > schema > INFO [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> > 1507a7289a2f, create is_encrypted > INFO [alembic.runtime.migration] Running upgrade 1507a7289a2f -> > 13eb55f81627, maintain history for compatibility with earlier migrations > INFO [alembic.runtime.migration] Running upgrade 13eb55f81627 -> > 338e90f54d61, More logging into task_isntance > INFO [alembic.runtime.migration] Running upgrade 338e90f54d61 -> > 52d714495f0, job_id indices > INFO [alembic.runtime.migration] Running upgrade 52d714495f0 -> > 502898887f84, Adding extra to Log > INFO [alembic.runtime.migration] Running upgrade 502898887f84 -> > 1b38cef5b76e, add dagrun > INFO [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> > 2e541a1dcfed, task_duration > INFO [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> > 40e67319e3a9, dagrun_config > INFO [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> > 561833c1c74b, add password column to user > INFO [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, > dagrun start end > INFO [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, > Add notification_sent column to sla_miss > INFO [alembic.runtime.migration] Running upgrade 
bbc73705a13e -> > bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field > in connection > INFO [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> > 1968acfc09e3, add is_encrypted column to variable table > INFO [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> > 2e82aab8ef20, rename user table > INFO [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> > 211e584da130, add TI state index > INFO [alembic.runtime.migration] Running upgrade 211e584da130 -> > 64de9cddf6c9, add task fails journal table > INFO [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> > f2ca10b85618, add dag_stats table > INFO [alembic.runtime.migration] Running upgrade f2ca10b85618 -> > 4addfa1236f1, Add fractional seconds to mysql tables > INFO [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> > 8504051e801b, xcom dag task indices > INFO [alembic.runtime.migration] Running upgrade 850405
[jira] [Comment Edited] (AIRFLOW-1452) "airflow initdb" stuck forever on upgrade
[ https://issues.apache.org/jira/browse/AIRFLOW-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111697#comment-16111697 ] Allison Wang edited comment on AIRFLOW-1452 at 8/2/17 8:53 PM: --- Then there must be locks in the database when you run {{airflow initdb}}. I am not familiar with MSSQL but the SQL in posted error message post is {{UPDATE alembic_version SET version_num='cc1e65623dc7' WHERE alembic_version.version_num = '127d2bf2dfa7'}} This is the error of updating alembic_version, not any particular operation related to adding max_tries column. Please look into what exactly causes this error in MSSQL: {{[Microsoft][ODBC Driver 13 for SQL Server]TCP Provider: Error code 0x2746 (10054)}} Please make sure there is no lock before and during the migration. This migration script is tested against MySQL, Postgres and SQLite. We recommend using MySQL and Postgres as we can provide more support for issues with these databases. was (Author: allisonwang): Then there must be locks in the database when you run {{airflow initdb}}. I am not familiar with MSSQL but the SQL in posted error message post is {{UPDATE alembic_version SET version_num='cc1e65623dc7' WHERE alembic_version.version_num = '127d2bf2dfa7'}} This is the error of updating alembic_version, not any particular operation related to adding max_tries column. Please make sure there is no lock before and during the migration. This migration script is tested against MySQL, Postgres and SQLite. We recommend using MySQL and Postgres as we can provide more support for issues with these databases. 
> "airflow initdb" stuck forever on upgrade > - > > Key: AIRFLOW-1452 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1452 > Project: Apache Airflow > Issue Type: Bug > Components: db >Reporter: Pavel Martynov > Attachments: docker-compose.yml, Dockerfile, run-initdb.sh > > > I install airflow from the current master branch > (426b6a65f6ec142449893e36fcd677941bdad879 when I write this issue) and run > "airflow initdb" against MS SQL and it stuck forever with that output: > {noformat} > [2017-07-25 07:30:12,458] {db.py:307} INFO - Creating tables > INFO [alembic.runtime.migration] Context impl MSSQLImpl. > INFO [alembic.runtime.migration] Will assume transactional DDL. > INFO [alembic.runtime.migration] Running upgrade -> e3a246e0dc1, current > schema > INFO [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> > 1507a7289a2f, create is_encrypted > INFO [alembic.runtime.migration] Running upgrade 1507a7289a2f -> > 13eb55f81627, maintain history for compatibility with earlier migrations > INFO [alembic.runtime.migration] Running upgrade 13eb55f81627 -> > 338e90f54d61, More logging into task_isntance > INFO [alembic.runtime.migration] Running upgrade 338e90f54d61 -> > 52d714495f0, job_id indices > INFO [alembic.runtime.migration] Running upgrade 52d714495f0 -> > 502898887f84, Adding extra to Log > INFO [alembic.runtime.migration] Running upgrade 502898887f84 -> > 1b38cef5b76e, add dagrun > INFO [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> > 2e541a1dcfed, task_duration > INFO [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> > 40e67319e3a9, dagrun_config > INFO [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> > 561833c1c74b, add password column to user > INFO [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, > dagrun start end > INFO [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, > Add notification_sent column to sla_miss > INFO [alembic.runtime.migration] Running upgrade 
bbc73705a13e -> > bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field > in connection > INFO [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> > 1968acfc09e3, add is_encrypted column to variable table > INFO [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> > 2e82aab8ef20, rename user table > INFO [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> > 211e584da130, add TI state index > INFO [alembic.runtime.migration] Running upgrade 211e584da130 -> > 64de9cddf6c9, add task fails journal table > INFO [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> > f2ca10b85618, add dag_stats table > INFO [alembic.runtime.migration] Running upgrade f2ca10b85618 -> > 4addfa1236f1, Add fractional seconds to mysql tables > INFO [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> > 8504051e801b, xcom dag task indices > INFO [alembic.runtime.migration] Running upgrade 8504051e801b -> > 5e7d17757c7a, add pid field to TaskInstance > INFO [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> > 127d2bf2dfa7, Add dag_id/state index on dag_run table >
[jira] [Commented] (AIRFLOW-1452) "airflow initdb" stuck forever on upgrade
[ https://issues.apache.org/jira/browse/AIRFLOW-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111697#comment-16111697 ] Allison Wang commented on AIRFLOW-1452: --- Then there must be locks in the database when you run {{airflow initdb}}. I am not familiar with MSSQL but the SQL in posted error message post is {{UPDATE alembic_version SET version_num='cc1e65623dc7' WHERE alembic_version.version_num = '127d2bf2dfa7'}} This is the error of updating alembic_version, not any particular operation related to adding max_tries column. Please make sure there is no lock before and during the migration. This migration script is tested against MySQL, Postgres and SQLite. We recommend using MySQL and Postgres as we can provide more support for issues with these databases. > "airflow initdb" stuck forever on upgrade > - > > Key: AIRFLOW-1452 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1452 > Project: Apache Airflow > Issue Type: Bug > Components: db >Reporter: Pavel Martynov > Attachments: docker-compose.yml, Dockerfile, run-initdb.sh > > > I install airflow from the current master branch > (426b6a65f6ec142449893e36fcd677941bdad879 when I write this issue) and run > "airflow initdb" against MS SQL and it stuck forever with that output: > {noformat} > [2017-07-25 07:30:12,458] {db.py:307} INFO - Creating tables > INFO [alembic.runtime.migration] Context impl MSSQLImpl. > INFO [alembic.runtime.migration] Will assume transactional DDL. 
> INFO [alembic.runtime.migration] Running upgrade -> e3a246e0dc1, current > schema > INFO [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> > 1507a7289a2f, create is_encrypted > INFO [alembic.runtime.migration] Running upgrade 1507a7289a2f -> > 13eb55f81627, maintain history for compatibility with earlier migrations > INFO [alembic.runtime.migration] Running upgrade 13eb55f81627 -> > 338e90f54d61, More logging into task_isntance > INFO [alembic.runtime.migration] Running upgrade 338e90f54d61 -> > 52d714495f0, job_id indices > INFO [alembic.runtime.migration] Running upgrade 52d714495f0 -> > 502898887f84, Adding extra to Log > INFO [alembic.runtime.migration] Running upgrade 502898887f84 -> > 1b38cef5b76e, add dagrun > INFO [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> > 2e541a1dcfed, task_duration > INFO [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> > 40e67319e3a9, dagrun_config > INFO [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> > 561833c1c74b, add password column to user > INFO [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, > dagrun start end > INFO [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, > Add notification_sent column to sla_miss > INFO [alembic.runtime.migration] Running upgrade bbc73705a13e -> > bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field > in connection > INFO [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> > 1968acfc09e3, add is_encrypted column to variable table > INFO [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> > 2e82aab8ef20, rename user table > INFO [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> > 211e584da130, add TI state index > INFO [alembic.runtime.migration] Running upgrade 211e584da130 -> > 64de9cddf6c9, add task fails journal table > INFO [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> > f2ca10b85618, add dag_stats table > INFO 
[alembic.runtime.migration] Running upgrade f2ca10b85618 -> > 4addfa1236f1, Add fractional seconds to mysql tables > INFO [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> > 8504051e801b, xcom dag task indices > INFO [alembic.runtime.migration] Running upgrade 8504051e801b -> > 5e7d17757c7a, add pid field to TaskInstance > INFO [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> > 127d2bf2dfa7, Add dag_id/state index on dag_run table > INFO [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> > cc1e65623dc7, add max tries column to task instance > {noformat} > I reproduce this problem with docker-compose, see files in attachment. > Also, I try this on 1.8.2rc2 and it works fine, looks like problem in > cc1e65623dc7_add_max_tries_column_to_task_instance.py migration. > Some locks occurred, I "killed lock" in MS SQL and got exception: > {noformat} > sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('08S01', '[08S01] [Microsoft][ODBC > Driver 13 for SQL Server]TCP Provider: Error code 0x2746 (10054) > (SQLExecDirectW)') [SQL: u"UPDATE alembic_version SET > version_num='cc1e65623dc7' WHERE alembic_version.version_num = > '127d2bf2dfa7'"] > {noformat} > cc1e65623dc7_add_max_tries_column_to_task_instance.py migration was added for > [AIRFLOW-1366]. --
[jira] [Closed] (AIRFLOW-1474) Add dag_id regex for 'airflow clear' CLI command
[ https://issues.apache.org/jira/browse/AIRFLOW-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Riccomini closed AIRFLOW-1474. Resolution: Fixed Fix Version/s: 1.9.0 > Add dag_id regex for 'airflow clear' CLI command > > > Key: AIRFLOW-1474 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1474 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Reporter: Joy Gao >Assignee: Joy Gao >Priority: Minor > Fix For: 1.9.0 > > > The 'airflow clear' CLI command is currently limited to clearing a single DAG > per operation. It would be useful to add the capability to clear multiple > DAGs per operation using regex, similar to how task_id can be filtered via > regex.
[jira] [Commented] (AIRFLOW-1474) Add dag_id regex for 'airflow clear' CLI command
[ https://issues.apache.org/jira/browse/AIRFLOW-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111637#comment-16111637 ] ASF subversion and git services commented on AIRFLOW-1474: -- Commit 18f8498a74cfd7c65d7b4b1c7e868e738ecfa240 in incubator-airflow's branch refs/heads/master from [~joy.gao54] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=18f8498 ] [AIRFLOW-1474] Add dag_id regex feature for `airflow clear` command Closes #2486 from jgao54/airflow-clear > Add dag_id regex for 'airflow clear' CLI command > > > Key: AIRFLOW-1474 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1474 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Reporter: Joy Gao >Assignee: Joy Gao >Priority: Minor > > The 'airflow clear' CLI command is currently limited to clearing a single DAG > per operation. It would be useful to add the capability to clear multiple > DAGs per operation using regex, similar to how task_id can be filtered via > regex.
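The dag_id selection described in this issue can be illustrated with a small standalone sketch. It assumes the regex is applied with re.search, so a pattern matches anywhere in the dag_id unless it is anchored. The helper name match_dag_ids, its use_regex parameter, and the sample dag_ids are hypothetical and are not part of Airflow; this only illustrates the idea and is not the committed implementation.

```python
import re


def match_dag_ids(dag_ids, pattern, use_regex=False):
    """Return the dag_ids selected by `pattern` (hypothetical helper).

    With use_regex=False the pattern must equal a dag_id exactly.
    With use_regex=True the pattern is applied with re.search, so it
    matches anywhere in the dag_id unless anchored with ^ or $.
    """
    if use_regex:
        matched = [d for d in dag_ids if re.search(pattern, d)]
    else:
        matched = [d for d in dag_ids if d == pattern]
    if not matched:
        # Fail loudly instead of silently clearing nothing.
        raise ValueError("no dag_id matched {!r}".format(pattern))
    return matched


# Made-up dag_ids, for illustration only.
dag_ids = ["etl_daily", "etl_hourly", "report_daily"]
print(match_dag_ids(dag_ids, "etl_daily"))
print(match_dag_ids(dag_ids, "^etl_", use_regex=True))
print(match_dag_ids(dag_ids, "daily$", use_regex=True))
```

Because re.search is unanchored, a pattern such as etl would also select a dag_id like my_etl_job; anchoring with ^ or $ keeps a bulk clear from touching more DAGs than intended.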
incubator-airflow git commit: [AIRFLOW-1474] Add dag_id regex feature for `airflow clear` command
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 836f2899c -> 18f8498a7

[AIRFLOW-1474] Add dag_id regex feature for `airflow clear` command

Closes #2486 from jgao54/airflow-clear

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/18f8498a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/18f8498a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/18f8498a

Branch: refs/heads/master
Commit: 18f8498a74cfd7c65d7b4b1c7e868e738ecfa240
Parents: 836f289
Author: Joy Gao
Authored: Wed Aug 2 12:53:35 2017 -0700
Committer: Chris Riccomini
Committed: Wed Aug 2 12:53:35 2017 -0700

--
 airflow/bin/cli.py | 38 ++---
 airflow/models.py | 65 -
 tests/core.py | 10
 tests/models.py| 63 +++
 4 files changed, 161 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/18f8498a/airflow/bin/cli.py
--
diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index a8543d3..dc49bb7 100755
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -40,6 +40,7 @@ import threading
 import traceback
 import time
 import psutil
+import re

 import airflow
 from airflow import api
@@ -49,7 +50,8 @@ from airflow.exceptions import AirflowException
 from airflow.executors import GetDefaultExecutor
 from airflow.models import (DagModel, DagBag, TaskInstance,
                             DagPickle, DagRun, Variable, DagStat,
-                            Connection)
+                            Connection, DAG)
+
 from airflow.ti_deps.dep_context import (DepContext, SCHEDULER_DEPS)
 from airflow.utils import db as db_utils
 from airflow.utils import logging as logging_utils
@@ -127,6 +129,19 @@ def get_dag(args):
     return dagbag.dags[args.dag_id]


+def get_dags(args):
+    if not args.dag_regex:
+        return [get_dag(args)]
+    dagbag = DagBag(process_subdir(args.subdir))
+    matched_dags = [dag for dag in dagbag.dags.values() if re.search(
+        args.dag_id, dag.dag_id)]
+    if not matched_dags:
+        raise AirflowException(
+            'dag_id could not be found with regex: {}. Either the dag did not exist '
+            'or it failed to parse.'.format(args.dag_id))
+    return matched_dags
+
+
 def backfill(args, dag=None):
     logging.basicConfig(
         level=settings.LOGGING_LEVEL,
@@ -599,15 +614,17 @@ def clear(args):
     logging.basicConfig(
         level=settings.LOGGING_LEVEL,
         format=settings.SIMPLE_LOG_FORMAT)
-    dag = get_dag(args)
+    dags = get_dags(args)

     if args.task_regex:
-        dag = dag.sub_dag(
-            task_regex=args.task_regex,
-            include_downstream=args.downstream,
-            include_upstream=args.upstream,
-        )
-    dag.clear(
+        for idx, dag in enumerate(dags):
+            dags[idx] = dag.sub_dag(
+                task_regex=args.task_regex,
+                include_downstream=args.downstream,
+                include_upstream=args.upstream)
+
+    DAG.clear_dags(
+        dags,
         start_date=args.start_date,
         end_date=args.end_date,
         only_failed=args.only_failed,
@@ -1237,6 +1254,9 @@ class CLIFactory(object):
         'exclude_subdags': Arg(
             ("-x", "--exclude_subdags"),
             "Exclude subdags", "store_true"),
+        'dag_regex': Arg(
+            ("-dx", "--dag_regex"),
+            "Search dag_id as regex instead of exact string", "store_true"),
         # trigger_dag
         'run_id': Arg(("-r", "--run_id"), "Helps to identify this run"),
         'conf': Arg(
@@ -1482,7 +1502,7 @@ class CLIFactory(object):
             'args': (
                 'dag_id', 'task_regex', 'start_date', 'end_date', 'subdir',
                 'upstream', 'downstream', 'no_confirm', 'only_failed',
-                'only_running', 'exclude_subdags'),
+                'only_running', 'exclude_subdags', 'dag_regex'),
         }, {
             'func': pause,
             'help': "Pause a DAG",

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/18f8498a/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index cc54f36..6e423dc 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -3249,11 +3249,10 @@ class DAG(BaseDag, LoggingMixin):
         # Crafting the right filter for dag_id and task_ids combo
         conditions = []
         for dag in self.subdags + [self]:
-            if dag.task_ids:
-                conditions.append(
-                    TI.dag_id.like(dag.dag_id) &
[jira] [Created] (AIRFLOW-1482) Error when try to backfill the example_trigger_controller_dag
Timothee N created AIRFLOW-1482:
---
Summary: Error when try to backfill the example_trigger_controller_dag
Key: AIRFLOW-1482
URL: https://issues.apache.org/jira/browse/AIRFLOW-1482
Project: Apache Airflow
Issue Type: Bug
Components: backfill
Affects Versions: 1.8.1, 1.8.2
Environment: Ubuntu: 16.04 Python: 2.7 CeleryExecutor Broker: Redis
Reporter: Timothee N
Priority: Blocker
Attachments: airflow_1.png, airflow_2.png, airflow_3.png

Hello,

Running a backfill command for the {noformat}example_trigger_controller_dag{noformat} example DAG results in a failed {noformat}test_trigger_dagrun{noformat} task. It seems to me that the problem comes from the TriggerDagRunOperator in example_trigger_controller_dag.

Backfill command:
{noformat}airflow backfill -s 2017-07-10 -e 2017-07-13 --pool backfill example_trigger_controller_dag{noformat}

Tested in 1.8.1 and 1.8.2rc1.

Here is the output log from the backfill command:
{noformat}
[2017-08-02 13:53:00,844] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-08-02 13:53:00,888] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-08-02 13:53:00,902] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/var/lib/airflow/local/lib/python2.7/site-packages/airflow/www/app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to "CSRFProtect" and will be removed in 1.0.
  csrf = CsrfProtect()
[2017-08-02 13:53:01,033] {models.py:168} INFO - Filling up the DagBag from /var/lib/airflow/dags
[2017-08-02 13:53:01,332] {models.py:1128} INFO - Dependencies all met for
[2017-08-02 13:53:01,337] {base_executor.py:50} INFO - Adding to queue: airflow run example_trigger_controller_dag test_trigger_dagrun 2017-07-10T00:00:00 --pickle 1 --local --pool backfill
[2017-08-02 13:53:06,267] {celery_executor.py:81} INFO - [celery] queuing (u'example_trigger_controller_dag', u'test_trigger_dagrun', datetime.datetime(2017, 7, 10, 0, 0)) through celery, queue=default
[2017-08-02 13:53:06,330] {models.py:4164} INFO - Updating state for considering 1 task(s)
[2017-08-02 13:53:06,334] {jobs.py:2020} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 0 | kicked_off: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2017-08-02 13:53:11,273] {jobs.py:1743} ERROR - Executor reports task instance finished (failed) although the task says its queued. Was the task killed externally?
[2017-08-02 13:53:11,273] {models.py:1433} ERROR - Executor reports task instance finished (failed) although the task says its queued. Was the task killed externally?
None
[2017-08-02 13:53:11,273] {models.py:1457} INFO - Marking task as FAILED.
[2017-08-02 13:53:11,279] {models.py:1478} ERROR - Executor reports task instance finished (failed) although the task says its queued. Was the task killed externally?
[2017-08-02 13:53:11,281] {jobs.py:1694} ERROR - Task instance failed
[2017-08-02 13:53:11,283] {models.py:4164} INFO - Updating state for considering 1 task(s)
[2017-08-02 13:53:11,285] {models.py:4204} INFO - Marking run failed
[2017-08-02 13:53:11,298] {jobs.py:2020} INFO - [backfill progress] | finished run 1 of 1 | tasks waiting: 0 | succeeded: 0 | kicked_off: 0 | failed: 1 | skipped: 0 | deadlocked: 0 | not ready: 0
Traceback (most recent call last):
  File "/var/lib/airflow/bin/airflow", line 28, in
    args.func(args)
  File "/var/lib/airflow/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 167, in backfill
    pool=args.pool)
  File "/var/lib/airflow/local/lib/python2.7/site-packages/airflow/models.py", line 3373, in run
    job.run()
  File "/var/lib/airflow/local/lib/python2.7/site-packages/airflow/jobs.py", line 201, in run
    self._execute()
  File "/var/lib/airflow/local/lib/python2.7/site-packages/airflow/jobs.py", line 2063, in _execute
    raise AirflowException(err)
airflow.exceptions.AirflowException: ---
Some task instances failed:
set([(u'example_trigger_controller_dag', u'test_trigger_dagrun', datetime.datetime(2017, 7, 10, 0, 0))])
{noformat}
[jira] [Updated] (AIRFLOW-1354) www/views.py pool_link produces wrong link
[ https://issues.apache.org/jira/browse/AIRFLOW-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Cederstrand updated AIRFLOW-1354: -- Description: The pool_link() function in www/views.py builds a link to {code:none}'/admin/taskinstance/?flt1_pool_equals=' + m.pool{code} but should probably return {code:none}'/admin/pool/?flt1_pool_equals=' + m.pool{code} was: The dag_link() function in www/views.py builds a link to {code:none}'/admin/taskinstance/?flt1_pool_equals=' + m.pool{code} but should probably return {code:none}'/admin/pool/?flt1_pool_equals=' + m.pool{code} > www/views.py pool_link produces wrong link > -- > > Key: AIRFLOW-1354 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1354 > Project: Apache Airflow > Issue Type: Bug > Components: webapp >Affects Versions: Airflow 1.8 >Reporter: Erik Cederstrand >Priority: Minor > > The pool_link() function in www/views.py builds a link to > {code:none}'/admin/taskinstance/?flt1_pool_equals=' + m.pool{code} > but should probably return > {code:none}'/admin/pool/?flt1_pool_equals=' + m.pool{code}
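The link change proposed in this issue can be shown as a minimal standalone sketch. The two URL prefixes are the ones quoted in the description; the helper name pool_link_url, its table parameter, and the URL-encoding of the pool name are assumptions added for illustration and are not taken from www/views.py. The sketch targets Python 3.

```python
from urllib.parse import quote


def pool_link_url(pool_name, table='pool'):
    """Build a filtered admin list URL for one pool (hypothetical helper).

    table='pool' gives the target the reporter suggests;
    table='taskinstance' reproduces the link reported as wrong.
    The pool name is percent-encoded, which is an addition of this sketch.
    """
    return '/admin/{}/?flt1_pool_equals={}'.format(
        table, quote(pool_name, safe=''))


print(pool_link_url('backfill'))
print(pool_link_url('backfill', table='taskinstance'))
```

Only the path segment differs between the two links; the flt1_pool_equals filter is unchanged.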