[jira] [Resolved] (AIRFLOW-2604) dag_id, task_id, execution_date in dag_fail should be indexed

2018-06-26 Thread Joy Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joy Gao resolved AIRFLOW-2604.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3539
[https://github.com/apache/incubator-airflow/pull/3539]

> dag_id, task_id, execution_date in dag_fail should be indexed
> -
>
> Key: AIRFLOW-2604
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2604
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.10
>Reporter: Joy Gao
>Assignee: Stefan Seelmann
>Priority: Major
> Fix For: 1.10.0
>
>
> As a follow-up to AIRFLOW-2602, we should index dag_id, task_id and 
> execution_date to make sure the /gantt page (and any other future UIs relying 
> on task_fail) can still be rendered quickly as the table grows in size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2604) dag_id, task_id, execution_date in dag_fail should be indexed

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524402#comment-16524402
 ] 

ASF subversion and git services commented on AIRFLOW-2604:
--

Commit d00762cb914803341c5d019d14385d249346d601 in incubator-airflow's branch 
refs/heads/master from [~seelmann]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=d00762c ]

[AIRFLOW-2604] Add index to task_fail


> dag_id, task_id, execution_date in dag_fail should be indexed
> -
>
> Key: AIRFLOW-2604
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2604
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.10
>Reporter: Joy Gao
>Assignee: Stefan Seelmann
>Priority: Major
>
> As a follow-up to AIRFLOW-2602, we should index dag_id, task_id and 
> execution_date to make sure the /gantt page (and any other future UIs relying 
> on task_fail) can still be rendered quickly as the table grows in size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2604) dag_id, task_id, execution_date in dag_fail should be indexed

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524403#comment-16524403
 ] 

ASF subversion and git services commented on AIRFLOW-2604:
--

Commit 57bf9965921d7bbc2b4d9027d5fc1314fb7ad858 in incubator-airflow's branch 
refs/heads/master from Joy Gao
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=57bf996 ]

Merge pull request #3539 from seelmann/AIRFLOW-2604-index-task-fail


> dag_id, task_id, execution_date in dag_fail should be indexed
> -
>
> Key: AIRFLOW-2604
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2604
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.10
>Reporter: Joy Gao
>Assignee: Stefan Seelmann
>Priority: Major
>
> As a follow-up to AIRFLOW-2602, we should index dag_id, task_id and 
> execution_date to make sure the /gantt page (and any other future UIs relying 
> on task_fail) can still be rendered quickly as the table grows in size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[2/2] incubator-airflow git commit: Merge pull request #3539 from seelmann/AIRFLOW-2604-index-task-fail

2018-06-26 Thread joygao
Merge pull request #3539 from seelmann/AIRFLOW-2604-index-task-fail


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/57bf9965
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/57bf9965
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/57bf9965

Branch: refs/heads/master
Commit: 57bf9965921d7bbc2b4d9027d5fc1314fb7ad858
Parents: 6a668e7 d00762c
Author: Joy Gao 
Authored: Tue Jun 26 17:56:41 2018 -0700
Committer: Joy Gao 
Committed: Tue Jun 26 17:56:41 2018 -0700

--
 .../versions/9635ae0956e7_index_faskfail.py | 44 
 airflow/models.py   |  5 +++
 2 files changed, 49 insertions(+)
--




[1/2] incubator-airflow git commit: [AIRFLOW-2604] Add index to task_fail

2018-06-26 Thread joygao
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 6a668e741 -> 57bf99659


[AIRFLOW-2604] Add index to task_fail


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/d00762cb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/d00762cb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/d00762cb

Branch: refs/heads/master
Commit: d00762cb914803341c5d019d14385d249346d601
Parents: dbcb93c
Author: Stefan Seelmann 
Authored: Sun Jun 17 22:17:11 2018 +0200
Committer: Stefan Seelmann 
Committed: Fri Jun 22 20:33:05 2018 +0200

--
 .../versions/9635ae0956e7_index_faskfail.py | 44 
 airflow/models.py   |  5 +++
 2 files changed, 49 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d00762cb/airflow/migrations/versions/9635ae0956e7_index_faskfail.py
--
diff --git a/airflow/migrations/versions/9635ae0956e7_index_faskfail.py 
b/airflow/migrations/versions/9635ae0956e7_index_faskfail.py
new file mode 100644
index 000..da69846
--- /dev/null
+++ b/airflow/migrations/versions/9635ae0956e7_index_faskfail.py
@@ -0,0 +1,44 @@
+# flake8: noqa
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""index-faskfail
+
+Revision ID: 9635ae0956e7
+Revises: 856955da8476
+Create Date: 2018-06-17 21:40:01.963540
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = '9635ae0956e7'
+down_revision = '856955da8476'
+branch_labels = None
+depends_on = None
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+op.create_index('idx_task_fail_dag_task_date', 'task_fail', ['dag_id', 
'task_id', 'execution_date'], unique=False)
+
+
+def downgrade():
+op.drop_index('idx_task_fail_dag_task_date', table_name='task_fail')
+

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d00762cb/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index 381e9d3..423612c 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -2057,6 +2057,11 @@ class TaskFail(Base):
 end_date = Column(UtcDateTime)
 duration = Column(Integer)
 
+__table_args__ = (
+Index('idx_task_fail_dag_task_date', dag_id, task_id, execution_date,
+  unique=False),
+)
+
 def __init__(self, task, execution_date, start_date, end_date):
 self.dag_id = task.dag_id
 self.task_id = task.task_id



[jira] [Commented] (AIRFLOW-2678) Fix db scheme unit test to remove checking fab models

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524391#comment-16524391
 ] 

ASF subversion and git services commented on AIRFLOW-2678:
--

Commit 0c2206c7d617fe4925ece6478dd5b6caf5b179ba in incubator-airflow's branch 
refs/heads/master from Tao feng
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=0c2206c ]

[AIRFLOW-2678] Fix db schema unit test to remove checking fab models


> Fix db scheme unit test to remove checking fab models
> -
>
> Key: AIRFLOW-2678
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2678
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently airflow doesn't have FAB models as well migration script for the 
> models. We should ignore checking those models in the unit test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2678) Fix db scheme unit test to remove checking fab models

2018-06-26 Thread Joy Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joy Gao resolved AIRFLOW-2678.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3548
[https://github.com/apache/incubator-airflow/pull/3548]

> Fix db scheme unit test to remove checking fab models
> -
>
> Key: AIRFLOW-2678
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2678
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently airflow doesn't have FAB models as well migration script for the 
> models. We should ignore checking those models in the unit test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[2/2] incubator-airflow git commit: Merge pull request #3548 from feng-tao/airflow-2678

2018-06-26 Thread joygao
Merge pull request #3548 from feng-tao/airflow-2678


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/6a668e74
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/6a668e74
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/6a668e74

Branch: refs/heads/master
Commit: 6a668e74136409504c2640ec738fdf978d4f8caa
Parents: ea13caa 0c2206c
Author: Joy Gao 
Authored: Tue Jun 26 17:47:13 2018 -0700
Committer: Joy Gao 
Committed: Tue Jun 26 17:47:13 2018 -0700

--
 tests/utils/test_db.py | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)
--




[1/2] incubator-airflow git commit: [AIRFLOW-2678] Fix db schema unit test to remove checking fab models

2018-06-26 Thread joygao
Repository: incubator-airflow
Updated Branches:
  refs/heads/master ea13caa1e -> 6a668e741


[AIRFLOW-2678] Fix db schema unit test to remove checking fab models


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/0c2206c7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/0c2206c7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/0c2206c7

Branch: refs/heads/master
Commit: 0c2206c7d617fe4925ece6478dd5b6caf5b179ba
Parents: 5646d31
Author: Tao feng 
Authored: Tue Jun 26 00:23:11 2018 -0700
Committer: Tao feng 
Committed: Tue Jun 26 15:22:19 2018 -0700

--
 tests/utils/test_db.py | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/0c2206c7/tests/utils/test_db.py
--
diff --git a/tests/utils/test_db.py b/tests/utils/test_db.py
index cb7b737..8ddd3ef 100644
--- a/tests/utils/test_db.py
+++ b/tests/utils/test_db.py
@@ -20,9 +20,8 @@
 import unittest
 
 from airflow.models import Base as airflow_base
-from flask_appbuilder.models.sqla import Base as fab_base
 
-from airflow.settings import engine, RBAC
+from airflow.settings import engine
 from alembic.autogenerate import compare_metadata
 from alembic.migration import MigrationContext
 from sqlalchemy import MetaData
@@ -31,13 +30,9 @@ from sqlalchemy import MetaData
 class DbTest(unittest.TestCase):
 
 def test_database_schema_and_sqlalchemy_model_are_in_sync(self):
-# combine Airflow and Flask-AppBuilder (if rbac enabled) models
 all_meta_data = MetaData()
 for (table_name, table) in airflow_base.metadata.tables.items():
 all_meta_data._add_table(table_name, table.schema, table)
-if RBAC:
-for (table_name, table) in fab_base.metadata.tables.items():
-all_meta_data._add_table(table_name, table.schema, table)
 
 # create diff between database schema and SQLAlchemy model
 mc = MigrationContext.configure(engine.connect())
@@ -64,6 +59,23 @@ class DbTest(unittest.TestCase):
t[1].name == 'celery_taskmeta'),
 lambda t: (t[0] == 'remove_table' and
t[1].name == 'celery_tasksetmeta'),
+# Ignore all the fab tables
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_permission'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_register_user'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_role'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_permission_view'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_permission_view_role'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_user_role'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_user'),
+lambda t: (t[0] == 'remove_table' and
+   t[1].name == 'ab_view_menu'),
 ]
 for ignore in ignores:
 diff = [d for d in diff if not ignore(d)]



[jira] [Closed] (AIRFLOW-2624) Airflow webserver broken out of the box

2018-06-26 Thread Joy Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joy Gao closed AIRFLOW-2624.

   Resolution: Fixed
Fix Version/s: 1.10.0

> Airflow webserver broken out of the box
> ---
>
> Key: AIRFLOW-2624
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2624
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Kevin Yang
>Assignee: Kevin Yang
>Priority: Blocker
> Fix For: 1.10.0
>
>
> `airflow webserver` and then click on any DAG, I get
> ```
>   File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", 
> line 364, in view_func
> return f(*args, **kwargs)
>   File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", 
> line 251, in wrapper
> user = current_user.user.username
> AttributeError: 'NoneType' object has no attribute 'username'
> ```



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2624) Airflow webserver broken out of the box

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524387#comment-16524387
 ] 

ASF subversion and git services commented on AIRFLOW-2624:
--

Commit 2fd9328b412841429acc288b1441c8351ee15e98 in incubator-airflow's branch 
refs/heads/master from [~kevinyang]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=2fd9328 ]

[AIRFLOW-2624] Fix webserver login as anonymous


> Airflow webserver broken out of the box
> ---
>
> Key: AIRFLOW-2624
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2624
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Kevin Yang
>Assignee: Kevin Yang
>Priority: Blocker
>
> `airflow webserver` and then click on any DAG, I get
> ```
>   File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", 
> line 364, in view_func
> return f(*args, **kwargs)
>   File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", 
> line 251, in wrapper
> user = current_user.user.username
> AttributeError: 'NoneType' object has no attribute 'username'
> ```



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[1/2] incubator-airflow git commit: [AIRFLOW-2624] Fix webserver login as anonymous

2018-06-26 Thread joygao
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 78f3d3338 -> ea13caa1e


[AIRFLOW-2624] Fix webserver login as anonymous


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/2fd9328b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/2fd9328b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/2fd9328b

Branch: refs/heads/master
Commit: 2fd9328b412841429acc288b1441c8351ee15e98
Parents: 0f4d681
Author: Kevin Yang 
Authored: Thu Jun 14 18:20:46 2018 -0700
Committer: Kevin Yang 
Committed: Fri Jun 22 12:05:28 2018 -0700

--
 airflow/www/utils.py|  6 ++--
 tests/www/test_utils.py | 85 ++--
 2 files changed, 86 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/2fd9328b/airflow/www/utils.py
--
diff --git a/airflow/www/utils.py b/airflow/www/utils.py
index 7d9c8a0..44fa5c4 100644
--- a/airflow/www/utils.py
+++ b/airflow/www/utils.py
@@ -246,8 +246,8 @@ def action_logging(f):
 """
 @functools.wraps(f)
 def wrapper(*args, **kwargs):
-# Only AnonymousUserMixin() does not have user attribute
-if current_user and hasattr(current_user, 'user'):
+# AnonymousUserMixin() has user attribute but its value is None.
+if current_user and hasattr(current_user, 'user') and 
current_user.user:
 user = current_user.user.username
 else:
 user = 'anonymous'
@@ -286,7 +286,7 @@ def notify_owner(f):
 dag = dagbag.get_dag(dag_id)
 task = dag.get_task(task_id)
 
-if current_user and hasattr(current_user, 'username'):
+if current_user and hasattr(current_user, 'user') and 
current_user.user:
 user = current_user.username
 else:
 user = 'anonymous'

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/2fd9328b/tests/www/test_utils.py
--
diff --git a/tests/www/test_utils.py b/tests/www/test_utils.py
index d69041a..9d788e8 100644
--- a/tests/www/test_utils.py
+++ b/tests/www/test_utils.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 #   http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -17,9 +17,12 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import mock
 import unittest
 from xml.dom import minidom
 
+from airflow.www import app as application
+
 from airflow.www import utils
 
 
@@ -109,6 +112,84 @@ class UtilsTest(unittest.TestCase):
 self.assertEqual('page=3&search=bash_&showPaused=False',
  utils.get_params(showPaused=False, page=3, 
search='bash_'))
 
+# flask_login is loaded by calling flask_login._get_user.
+@mock.patch("flask_login._get_user")
+@mock.patch("airflow.settings.Session")
+def test_action_logging_with_login_user(self, mocked_session, 
mocked_get_user):
+fake_username = 'someone'
+mocked_current_user = mock.MagicMock()
+mocked_get_user.return_value = mocked_current_user
+mocked_current_user.user.username = fake_username
+mocked_session_instance = mock.MagicMock()
+mocked_session.return_value = mocked_session_instance
+
+app = application.create_app(testing=True)
+# Patching here to avoid errors in applicant.create_app
+with mock.patch("airflow.models.Log") as mocked_log:
+with app.test_request_context():
+@utils.action_logging
+def some_func():
+pass
+
+some_func()
+mocked_log.assert_called_once()
+(args, kwargs) = mocked_log.call_args_list[0]
+self.assertEqual('some_func', kwargs['event'])
+self.assertEqual(fake_username, kwargs['owner'])
+mocked_session_instance.add.assert_called_once()
+
+@mock.patch("flask_login._get_user")
+@mock.patch("airflow.settings.Session")
+def test_action_logging_with_invalid_user(self, mocked_session, 
mocked_get_user):
+anonymous_username = 'anonymous'
+
+# When the user returned by flask login_manager._load_user
+# is invalid.
+mocked_current_user = mock.MagicMock()
+mocked_get_user.return_value = mocked_current_us

[2/2] incubator-airflow git commit: Merge pull request #3508 from yrqls21/kevin_yang_fix

2018-06-26 Thread joygao
Merge pull request #3508 from yrqls21/kevin_yang_fix


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/ea13caa1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/ea13caa1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/ea13caa1

Branch: refs/heads/master
Commit: ea13caa1ef7638f2caeb10b4e3f1bbb3cab0a4af
Parents: 78f3d33 2fd9328
Author: Joy Gao 
Authored: Tue Jun 26 17:43:29 2018 -0700
Committer: Joy Gao 
Committed: Tue Jun 26 17:43:29 2018 -0700

--
 airflow/www/utils.py|  6 ++--
 tests/www/test_utils.py | 85 ++--
 2 files changed, 86 insertions(+), 5 deletions(-)
--




[jira] [Commented] (AIRFLOW-2682) Add how-to guide(s) for how to use basic operators like BashOperator and PythonOperator

2018-06-26 Thread Tim Swast (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524336#comment-16524336
 ] 

Tim Swast commented on AIRFLOW-2682:


Per https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls the 
how-to guide on the BashOperator should call out the fact that it is templated 
and how to avoid problems when passing in a full path to a script.

> Add how-to guide(s) for how to use basic operators like BashOperator and 
> PythonOperator
> ---
>
> Key: AIRFLOW-2682
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2682
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation, examples
>Reporter: Tim Swast
>Assignee: Tim Swast
>Priority: Minor
>
> I notice there are no references to the 
> [example_python_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_python_operator.py]
>  or 
> [example_bash_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_bash_operator.py]
>  DAGs from the documentation. I believe these examples could be useful for 
> explaining how to use these operators, specifically, but also as supplemental 
> help for writing DAGs, generally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2682) Add how-to guide(s) for how to use basic operators like BaseOperator and PythonOperator

2018-06-26 Thread Tim Swast (JIRA)
Tim Swast created AIRFLOW-2682:
--

 Summary: Add how-to guide(s) for how to use basic operators like 
BaseOperator and PythonOperator
 Key: AIRFLOW-2682
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2682
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Documentation, examples
Reporter: Tim Swast
Assignee: Tim Swast


I notice there are no references to the 
[example_python_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_python_operator.py]
 or 
[example_bash_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_bash_operator.py]
 DAGs from the documentation. I believe these examples could be useful for 
explaining how to use these operators, specifically, but also as supplemental 
help for writing DAGs, generally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2682) Add how-to guide(s) for how to use basic operators like BashOperator and PythonOperator

2018-06-26 Thread Tim Swast (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Swast updated AIRFLOW-2682:
---
Summary: Add how-to guide(s) for how to use basic operators like 
BashOperator and PythonOperator  (was: Add how-to guide(s) for how to use basic 
operators like BaseOperator and PythonOperator)

> Add how-to guide(s) for how to use basic operators like BashOperator and 
> PythonOperator
> ---
>
> Key: AIRFLOW-2682
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2682
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation, examples
>Reporter: Tim Swast
>Assignee: Tim Swast
>Priority: Minor
>
> I notice there are no references to the 
> [example_python_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_python_operator.py]
>  or 
> [example_bash_operator|https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_bash_operator.py]
>  DAGs from the documentation. I believe these examples could be useful for 
> explaining how to use these operators, specifically, but also as supplemental 
> help for writing DAGs, generally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-855) Security - Airflow SQLAlchemy PickleType Allows for Code Execution

2018-06-26 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang closed AIRFLOW-855.

Resolution: Won't Fix

> Security - Airflow SQLAlchemy PickleType Allows for Code Execution
> --
>
> Key: AIRFLOW-855
> URL: https://issues.apache.org/jira/browse/AIRFLOW-855
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Attachments: test_dag.txt
>
>
> Impact: Anyone able to modify the application's underlying database, or a 
> computer where certain DAG tasks are executed, may execute arbitrary code on 
> the Airflow host.
> Location: The XCom class in /airflow-internal-master/airflow/models.py
> Description: Airflow uses the SQLAlchemy object-relational mapping (ORM) to 
> allow for a database agnostic, object-oriented manipulation of application 
> data. You express database tables and values using Python (in this 
> application's use) classes, and the ORM transparently manipulates the 
> underlying database, when you programatically access these structures.
> Airflow defines the following class, defining an XCom's11 ORM model:
> {code}
> class XCom(Base): 
>   """
>   Base class for XCom objects. 
>   """
>   __tablename__ = "xcom"
>   id = Column(Integer, primary_key=True) 
>   key = Column(String(512))
>   value = Column(PickleType(pickler=dill)) 
>   timestamp = Column(
> DateTime, default=func.now(), nullable=False) 
>   execution_date = Column(DateTime, nullable=False)
> {code}
> XComs are used for inter-task communication, and their values are either 
> defined in a DAG, or the return value of the python_callable() function or 
> the task's execute() method, executed on an remote host. XCom values are, 
> according to this model, of the PickleType, meaning that objects assigned to 
> the value column are transparently serialized (when being written to) and 
> deserialized (when being read from). The deserialization of user- controlled 
> pickle objects allows for the execution of arbitrary code. This means that 
> "slaves" (where DAG code is executed) can compromise "masters" (where DAGs 
> are defined in code) by returning an object that, when serialized (and 
> subsequently deserialized), causes remote code execution. This can also be 
> triggered by anyone who has write access to this portion of the database.
> Note: NCC Group plans to meet with developers in the coming days to discuss 
> this finding, and it will be updated to reflect any additional insight 
> provided by this meeting.
> Reproduction Steps:
> 1. Configure a local instance of Airflow.
> 2. Insert the attached DAG into your AIRFLOW_HOME/dags directory.
> This example models a slave returning a malicious object to a task's 
> python_callable by creating a portable object (with reduce) containing a 
> reverse shell and pushing it as an XCom's value. This value is serialized 
> upon xcom_push and deserialized upon xcom_pull.
> In an actual exploit scenario, this value would be DAG function's return 
> value, as assigned by code within the function, executing on a malicious 
> remote machine.
> 3. Start a netcat listener on your machine's port 
> 4. Execute this task from the command line with airflow run push 2016-11-17. 
> Note that your netcat listener has received a shell connect-back.
> Remediation: Consider the use of a custom SQLAlchemy data type that performs 
> this transparent serialization and deserialization, but with JSON (a 
> text-based exchange format), rather than pickles (which may contain code).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread Joy Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joy Gao resolved AIRFLOW-2681.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3551
[https://github.com/apache/incubator-airflow/pull/3551]

> Last execution date is not included in UI for externally triggered DAGs
> ---
>
> Key: AIRFLOW-2681
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: David Hatch
>Assignee: David Hatch
>Priority: Major
> Fix For: 1.10.0
>
>
> If a DAG has no schedule and is only externally triggered, the last run's 
> execution date is not included in the UI.
>  
> This is because {{include_externally_triggered}} is not passed to 
> {{get_last_dagrun}} from the {{dags.html}} template.  It used to be before 
> this commit 
> https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524283#comment-16524283
 ] 

ASF subversion and git services commented on AIRFLOW-2681:
--

Commit 78f3d33388c772eafbed8fff81b0e50188297fc6 in incubator-airflow's branch 
refs/heads/master from [~dhatch]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=78f3d33 ]

[AIRFLOW-2681] Include last dag run of externally triggered DAGs in UI.

Closes #3551 from dhatch/AIRFLOW-2681


> Last execution date is not included in UI for externally triggered DAGs
> ---
>
> Key: AIRFLOW-2681
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: David Hatch
>Assignee: David Hatch
>Priority: Major
> Fix For: 1.10.0
>
>
> If a DAG has no schedule and is only externally triggered, the last run's 
> execution date is not included in the UI.
>  
> This is because {{include_externally_triggered}} is not passed to 
> {{get_last_dagrun}} from the {{dags.html}} template.  It used to be before 
> this commit 
> https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524282#comment-16524282
 ] 

ASF subversion and git services commented on AIRFLOW-2681:
--

Commit 78f3d33388c772eafbed8fff81b0e50188297fc6 in incubator-airflow's branch 
refs/heads/master from [~dhatch]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=78f3d33 ]

[AIRFLOW-2681] Include last dag run of externally triggered DAGs in UI.

Closes #3551 from dhatch/AIRFLOW-2681


> Last execution date is not included in UI for externally triggered DAGs
> ---
>
> Key: AIRFLOW-2681
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: David Hatch
>Assignee: David Hatch
>Priority: Major
> Fix For: 1.10.0
>
>
> If a DAG has no schedule and is only externally triggered, the last run's 
> execution date is not included in the UI.
>  
> This is because {{include_externally_triggered}} is not passed to 
> {{get_last_dagrun}} from the {{dags.html}} template.  It used to be before 
> this commit 
> https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2681] Include last dag run of externally triggered DAGs in UI.

2018-06-26 Thread joygao
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 5646d3115 -> 78f3d3338


[AIRFLOW-2681] Include last dag run of externally triggered DAGs in UI.

Closes #3551 from dhatch/AIRFLOW-2681


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/78f3d333
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/78f3d333
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/78f3d333

Branch: refs/heads/master
Commit: 78f3d33388c772eafbed8fff81b0e50188297fc6
Parents: 5646d31
Author: David Hatch 
Authored: Tue Jun 26 15:26:13 2018 -0700
Committer: Joy Gao 
Committed: Tue Jun 26 15:26:20 2018 -0700

--
 airflow/www/templates/airflow/dags.html  | 2 +-
 airflow/www_rbac/templates/airflow/dags.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/78f3d333/airflow/www/templates/airflow/dags.html
--
diff --git a/airflow/www/templates/airflow/dags.html 
b/airflow/www/templates/airflow/dags.html
index 8a86948..0a7a6ec 100644
--- a/airflow/www/templates/airflow/dags.html
+++ b/airflow/www/templates/airflow/dags.html
@@ -119,7 +119,7 @@
 
 
   {% if dag %}
-{% set last_run = dag.get_last_dagrun() %}
+{% set last_run = 
dag.get_last_dagrun(include_externally_triggered=True) %}
 {% if last_run and last_run.execution_date %}
   
 {{ last_run.execution_date.strftime("%Y-%m-%d %H:%M") 
}}

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/78f3d333/airflow/www_rbac/templates/airflow/dags.html
--
diff --git a/airflow/www_rbac/templates/airflow/dags.html 
b/airflow/www_rbac/templates/airflow/dags.html
index ed11a56..c7aa338 100644
--- a/airflow/www_rbac/templates/airflow/dags.html
+++ b/airflow/www_rbac/templates/airflow/dags.html
@@ -120,7 +120,7 @@
 
 
   {% if dag %}
-{% set last_run = dag.get_last_dagrun() %}
+{% set last_run = 
dag.get_last_dagrun(include_externally_triggered=True) %}
 {% if last_run and last_run.execution_date %}
   
 {{ last_run.execution_date.strftime("%Y-%m-%d %H:%M") 
}}



[jira] [Assigned] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread David Hatch (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hatch reassigned AIRFLOW-2681:


Assignee: David Hatch

> Last execution date is not included in UI for externally triggered DAGs
> ---
>
> Key: AIRFLOW-2681
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: David Hatch
>Assignee: David Hatch
>Priority: Major
>
> If a DAG has no schedule and is only externally triggered, the last run's 
> execution date is not included in the UI.
>  
> This is because {{include_externally_triggered}} is not passed to 
> {{get_last_dagrun}} from the {{dags.html}} template.  It used to be before 
> this commit 
> https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread David Hatch (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-2681 started by David Hatch.

> Last execution date is not included in UI for externally triggered DAGs
> ---
>
> Key: AIRFLOW-2681
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: David Hatch
>Assignee: David Hatch
>Priority: Major
>
> If a DAG has no schedule and is only externally triggered, the last run's 
> execution date is not included in the UI.
>  
> This is because {{include_externally_triggered}} is not passed to 
> {{get_last_dagrun}} from the {{dags.html}} template.  It used to be before 
> this commit 
> https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2681) Last execution date is not included in UI for externally triggered DAGs

2018-06-26 Thread David Hatch (JIRA)
David Hatch created AIRFLOW-2681:


 Summary: Last execution date is not included in UI for externally 
triggered DAGs
 Key: AIRFLOW-2681
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2681
 Project: Apache Airflow
  Issue Type: Bug
Reporter: David Hatch


If a DAG has no schedule and is only externally triggered, the last run's 
execution date is not included in the UI.

 

This is because {{include_externally_triggered}} is not passed to 
{{get_last_dagrun}} from the {{dags.html}} template.  It used to be before this 
commit 
https://github.com/apache/incubator-airflow/commit/0bf7adb209ce969243ffaf4fc5213ff3957cbbc9#diff-f38558559ea1b4c30ddf132b7f223cf9L299.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-1154) Always getting Please import from '

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-1154.
--
Resolution: Fixed

> Always getting Please import from ' --
>
> Key: AIRFLOW-1154
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1154
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Zubarev
>Priority: Minor
>
> Un Ubuntu 16 LTS 
> No matter how I import the module I am getting messages like
> /home/.../.local/lib/python2.7/site-packages/airflow/utils/helpers.py:406: 
> DeprecationWarning: Importing PythonOperator directly from  'airflow.operators' from 
> '/home/.../.local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>
>  has been deprecated. Please import from ' '/home/.../.local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>.[operator_module]'
>  instead. Support for direct imports will be dropped entirely in Airflow 2.0.
>   DeprecationWarning).
> The above is happening even when imported as 
> 'from airflow.operators.python_operator import PythonOperator'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1154) Always getting Please import from '

2018-06-26 Thread Tim Swast (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524150#comment-16524150
 ] 

Tim Swast commented on AIRFLOW-1154:


This issue is a debugging question from >1 year ago. It can probably be closed.

> Always getting Please import from ' --
>
> Key: AIRFLOW-1154
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1154
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Zubarev
>Priority: Minor
>
> Un Ubuntu 16 LTS 
> No matter how I import the module I am getting messages like
> /home/.../.local/lib/python2.7/site-packages/airflow/utils/helpers.py:406: 
> DeprecationWarning: Importing PythonOperator directly from  'airflow.operators' from 
> '/home/.../.local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>
>  has been deprecated. Please import from ' '/home/.../.local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>.[operator_module]'
>  instead. Support for direct imports will be dropped entirely in Airflow 2.0.
>   DeprecationWarning).
> The above is happening even when imported as 
> 'from airflow.operators.python_operator import PythonOperator'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1934) Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty

2018-06-26 Thread Tim Swast (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524137#comment-16524137
 ] 

Tim Swast commented on AIRFLOW-1934:


Appears to be a duplicate of https://issues.apache.org/jira/browse/AIRFLOW-1561

> Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty
> ---
>
> Key: AIRFLOW-1934
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1934
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: Airflow 1.8
> Environment: RHEL
>Reporter: raman
>Priority: Major
>
> Steps to reproduce
> 1. Install airflow
> 2. Keep the ~/{airflow_home}/dags folder empty
> 3. airflow initdb
> 4. airflow webserver and scheduler
> 2. Enable a example DAG and trigger it manually from web UI.
> Result: DAG run gets created in the dag_run table. task_instance table also 
> get relevant enteries but scheduler does not pick the DAG.
> Workaround: Create one sample dag in the ~/{airflow_home}/dags folder and 
> scheduler picks it up.
> The following code in jobs.py seems to be doing the trick but this code is 
> only triggered if there is a dag inside ~/{airflow_home}/dags folder
> File: jobs.py
> Function: _find_executable_task_instances
> ti_query = (
>session
>.query(TI)
>.filter(TI.dag_id.in_(simple_dag_bag.dag_ids))
>.outerjoin(DR,
>and_(DR.dag_id == TI.dag_id,
> DR.execution_date == TI.execution_date))
>.filter(or_(DR.run_id == None,
>not_(DR.run_id.like(BackfillJob.ID_PREFIX + '%'
>.outerjoin(DM, DM.dag_id==TI.dag_id)
>.filter(or_(DM.dag_id == None,
>not_(DM.is_paused)))
>)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-820) Standardize GCP related connection id names and default values

2018-06-26 Thread Tim Swast (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524136#comment-16524136
 ] 

Tim Swast commented on AIRFLOW-820:
---

I've documented several of these at 
https://airflow.readthedocs.io/en/latest/howto/manage-connections.html#google-cloud-platform

I agree that it would make sense to standardize.

> Standardize GCP related connection id names and default values  
> 
>
> Key: AIRFLOW-820
> URL: https://issues.apache.org/jira/browse/AIRFLOW-820
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Feng Lu
>Assignee: Feng Lu
>Priority: Major
>
> A number of Google Cloud Platform (GCP) related operators, such as 
> BigQueryCheckOperator or DataFlowJavaOperator, are using different 
> connection_id var names and default values. For example, 
> BigQueryCheckOperator(.., big_query_conn_id='bigquery_default'..)
> DataFlowJavaOperator(..., gcp_conn_id='google_cloud_default'...)
> DataProcClusterCreateOperator(..., 
> google_cloud_conn_id='google_cloud_default',...)
> This makes dag-level default_args problematic, one would have to specify each 
> connection_id explicitly in the default_args even though the same GCP 
> connection is shared throughout the DAG.  We propose to: 
> - standardize all connection id names, e.g., 
>   big_query_conn_id ---> gcp_conn_id
>   google_cloud_conn_id-->gcp_conn_id
> - standardize all default values, e.g., 
>   'bigquery_default' -->  'google_cloud_default' 
> Therefore, if the same GCP connection is used, we only need to specify once 
> in the DAG default_args, e.g., 
> default_args = {
>  ...
>  gcp_conn_id='some_gcp_connection_id'
> ...
> } 
> Better still, if a connection with the default name 'google_cloud_default' 
> has already been created and used by all GCP operators, the gcp_conn_id 
> doesn't even need to be specified in DAG default_args. 
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1101) Docs use the term "Pipelines" instead of DAGs

2018-06-26 Thread Tim Swast (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524098#comment-16524098
 ] 

Tim Swast commented on AIRFLOW-1101:


DAGs appears more often than pipelines, so it probably makes sense to 
standardize on that. There is even a DAGs section in Concepts.

Workflows also appears quite often and interchangably with "pipelines"

> Docs use the term "Pipelines" instead of DAGs
> -
>
> Key: AIRFLOW-1101
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1101
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Andrew Chen
>Priority: Major
>
> It's a bit confusing to use the words "DAGs" and "pipelines" interchangeably 
> in the documentation. Here are a couple examples where it is especially 
> confusing.
> https://airflow.incubator.apache.org/configuration.html#connections
> "The pipeline code you will author will reference the ‘conn_id’ of the 
> Connection objects."
> "Connections in Airflow pipelines can be created using environment variables."
> https://airflow.incubator.apache.org/configuration.html#scaling-out-with-celery
> "If all your boxes have a common mount point, having your pipelines files 
> shared there should work as well"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-26 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524022#comment-16524022
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

[~jghoman] sure I can get started on that. I'll update this ticket once I have 
a PR ready.

> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2668) Missing cryptogrpahy dependency on airflow initdb call

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2668:
---
 Affects Version/s: 1.10.0
External issue URL: https://github.com/apache/incubator-airflow/pull/3550

> Missing cryptogrpahy dependency on airflow initdb call
> --
>
> Key: AIRFLOW-2668
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2668
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Affects Versions: 1.10.0
>Reporter: Nicholas Pezolano
>Priority: Minor
>
> The cryptography packages looks to be required now for `airflow initdb` calls 
> on a fresh install from master as of commit 
> 702a57ec5a96d159105c4f5ca76ddd2229eb2f44.
> $ airflow initdb
>  Traceback (most recent call last):
>  File "/home/n/git/airflow_testing/env/bin/airflow", line 6, in 
>  exec(compile(open(__file__).read(), __file__, 'exec'))
>  File "/home/n/git/incubator-airflow/airflow/bin/airflow", line 21, in 
> 
>  from airflow import configuration
>  File "/home/n/git/incubator-airflow/airflow/__init__.py", line 37, in 
> 
>  from airflow.models import DAG
>  File "/home/n/git/incubator-airflow/airflow/models.py", line 31, in 
>  import cryptography
>  ImportError: No module named cryptography
>  
> Steps to reproduce:
> {code}
> git clone https://github.com/apache/incubator-airflow
> virtualenv env
> . env/bin/activate
> pip install -e .[s3]
> airflow initdb
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1463) Scheduler does not reschedule tasks in QUEUED state

2018-06-26 Thread James Meickle (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523708#comment-16523708
 ] 

James Meickle commented on AIRFLOW-1463:


We ran into this in production last night. Our work instance ran out of memory; 
we suspect that it pulled messages from Celery, but then could not fork new 
worker processes. This resulted in a state where the task didn't exist in 
Celery, but the Scheduler thought it did.

I would have expected this check to result in the 
`SCHEDULED`-but-missing-from-Celery tasks eventually getting reset: 
[https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/jobs.py#L213]

But it looks like this only runs on scheduler startup, and not periodically?

> Scheduler does not reschedule tasks in QUEUED state
> ---
>
> Key: AIRFLOW-1463
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1463
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
> Environment: Ubuntu 14.04
> Airflow 1.8.0
> SQS backed task queue, AWS RDS backed meta storage
> DAG folder is synced by script on code push: archive is downloaded from s3, 
> unpacked, moved, install script is run. airflow executable is replaced with 
> symlink pointing to the latest version of code, no airflow processes are 
> restarted.
>Reporter: Stanislav Pak
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Our pipelines related code is deployed almost simultaneously on all airflow 
> boxes: scheduler+webserver box, workers boxes. Some common python package is 
> deployed on those boxes on every other code push (3-5 deployments per hour). 
> Due to installation specifics, a DAG that imports module from that package 
> might fail. If DAG import fails when worker runs a task, the task is still 
> removed from the queue but task state is not changed, so in this case the 
> task stays in QUEUED state forever.
> Beside the described case, there is scenario when it happens because of DAG 
> update lag in scheduler. A task can be scheduled with old DAG and worker can 
> run the task with new DAG that fails to be imported.
> There might be other scenarios when it happens.
> Proposal:
> Catch errors when importing DAG on task run and clear task instance state if 
> import fails. This should fix transient issues of this kind.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2675) Run commands error after installed

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523666#comment-16523666
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2675:


The k8s error comes from loading the example dags - if you don't want those you 
can disable the built-in example dags by changing the {{load_examples}} setting 
in the config file.

> Run commands error after installed
> --
>
> Key: AIRFLOW-2675
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2675
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Guoqiang Ding
>Assignee: Guoqiang Ding
>Priority: Major
>
> After I installed airflow in my test environment, it raises error on the 
> dependencies.
>  
> {code:java}
> //
> $ airflow
> Traceback (most recent call last):
> File "/opt/incubator-airflow/env/bin/airflow", line 6, in 
> exec(compile(open(__file__).read(), __file__, 'exec'))
> File "/opt/incubator-airflow/airflow/bin/airflow", line 21, in 
> from airflow import configuration
> File "/opt/incubator-airflow/airflow/__init__.py", line 37, in 
> from airflow.models import DAG
> File "/opt/incubator-airflow/airflow/models.py", line 31, in 
> import cryptography
> ImportError: No module named cryptography
> {code}
>  
>  
> And after I run "pip install cryptography" mannually, it raises error on the 
> k8s dependencies when I run "airflow init db".
>  
> {code:java}
> //
> $ airflow initdb
> [2018-06-26 10:52:08,259] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> DB: sqlite:home/openstack/airflow/airflow.db
> [2018-06-26 10:52:08,394] {db.py:338} INFO - Creating tables
> INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
> INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
> INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current 
> schema
> INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 
> 1507a7289a2f, create is_encrypted
> /opt/incubator-airflow/env/local/lib/python2.7/site-packages/alembic/util/messaging.py:69:
>  UserWarning: Skipping unsupported ALTER for creation of implicit constraint
>   warnings.warn(msg)
> INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 
> 13eb55f81627, maintain history for compatibility with earlier migrations
> INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 
> 338e90f54d61, More logging into task_isntance
> INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 
> 52d714495f0, job_id indices
> INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 
> 502898887f84, Adding extra to Log
> INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 
> 1b38cef5b76e, add dagrun
> INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 
> 2e541a1dcfed, task_duration
> INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 
> 40e67319e3a9, dagrun_config
> INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 
> 561833c1c74b, add password column to user
> INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, 
> dagrun start end
> INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, 
> Add notification_sent column to sla_miss
> INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> 
> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field 
> in connection
> INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 
> 1968acfc09e3, add is_encrypted column to variable table
> INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 
> 2e82aab8ef20, rename user table
> INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 
> 211e584da130, add TI state index
> INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 
> 64de9cddf6c9, add task fails journal table
> INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> 
> f2ca10b85618, add dag_stats table
> INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 
> 4addfa1236f1, Add fractional seconds to mysql tables
> INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 
> 8504051e801b, xcom dag task indices
> INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 
> 5e7d17757c7a, add pid field to TaskInstance
> INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 
> 127d2bf2dfa7, Add dag_id/state index on dag_run table
> INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> 
> cc1e65623dc7, add max tries column to task instance
> WARNI [airflow.utils.log.logging_mixin.LoggingMixin] Could not import 
> KubernetesPodOperator: No module named kubernetes
> WARNI [airflow.utils.log.logging_mixin.LoggingMixin] Install kubernetes 
> dependencies with:     

[jira] [Comment Edited] (AIRFLOW-2675) Run commands error after installed

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523661#comment-16523661
 ] 

Ash Berlin-Taylor edited comment on AIRFLOW-2675 at 6/26/18 12:33 PM:
--

You are on master version of Airflow or the released 1.9.0?

Missing cryptography is a bug that's been reported in AIRFLOW-2668. The k8s 
deps are only a warning - the second error is {{AirflowException: Could not 
create Fernet object: Incorrect padding}} which is again related to 
fernet/crypto.

Is this a clean environment? Have you made any changes to airflow.cfg at all?


was (Author: ashb):
You are on master version of Airflow or the released 1.9.0?

Missing cryptography is a bug that's been reported. The k8s deps are only a 
warning - the second error is {{AirflowException: Could not create Fernet 
object: Incorrect padding}} which is again related to fernet/crypto.

Is this a clean environment? Have you made any changes to airflow.cfg at all?

> Run commands error after installed
> --
>
> Key: AIRFLOW-2675
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2675
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Guoqiang Ding
>Assignee: Guoqiang Ding
>Priority: Major
>
> After I installed airflow in my test environment, it raises error on the 
> dependencies.
>  
> {code:java}
> //
> $ airflow
> Traceback (most recent call last):
> File "/opt/incubator-airflow/env/bin/airflow", line 6, in 
> exec(compile(open(__file__).read(), __file__, 'exec'))
> File "/opt/incubator-airflow/airflow/bin/airflow", line 21, in 
> from airflow import configuration
> File "/opt/incubator-airflow/airflow/__init__.py", line 37, in 
> from airflow.models import DAG
> File "/opt/incubator-airflow/airflow/models.py", line 31, in 
> import cryptography
> ImportError: No module named cryptography
> {code}
>  
>  
> And after I run "pip install cryptography" mannually, it raises error on the 
> k8s dependencies when I run "airflow init db".
>  
> {code:java}
> //
> $ airflow initdb
> [2018-06-26 10:52:08,259] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> DB: sqlite:home/openstack/airflow/airflow.db
> [2018-06-26 10:52:08,394] {db.py:338} INFO - Creating tables
> INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
> INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
> INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current 
> schema
> INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 
> 1507a7289a2f, create is_encrypted
> /opt/incubator-airflow/env/local/lib/python2.7/site-packages/alembic/util/messaging.py:69:
>  UserWarning: Skipping unsupported ALTER for creation of implicit constraint
>   warnings.warn(msg)
> INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 
> 13eb55f81627, maintain history for compatibility with earlier migrations
> INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 
> 338e90f54d61, More logging into task_isntance
> INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 
> 52d714495f0, job_id indices
> INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 
> 502898887f84, Adding extra to Log
> INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 
> 1b38cef5b76e, add dagrun
> INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 
> 2e541a1dcfed, task_duration
> INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 
> 40e67319e3a9, dagrun_config
> INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 
> 561833c1c74b, add password column to user
> INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, 
> dagrun start end
> INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, 
> Add notification_sent column to sla_miss
> INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> 
> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field 
> in connection
> INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 
> 1968acfc09e3, add is_encrypted column to variable table
> INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 
> 2e82aab8ef20, rename user table
> INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 
> 211e584da130, add TI state index
> INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 
> 64de9cddf6c9, add task fails journal table
> INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> 
> f2ca10b85618, add dag_stats table
> INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 
> 4addfa1236f1, Add fractional seconds to mysql tables
> INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 

[jira] [Commented] (AIRFLOW-2675) Run commands error after installed

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523661#comment-16523661
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2675:


You are on master version of Airflow or the released 1.9.0?

Missing cryptography is a bug that's been reported. The k8s deps are only a 
warning - the second error is {{AirflowException: Could not create Fernet 
object: Incorrect padding}} which is again related to fernet/crypto.

Is this a clean environment? Have you made any changes to airflow.cfg at all?

> Run commands error after installed
> --
>
> Key: AIRFLOW-2675
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2675
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Guoqiang Ding
>Assignee: Guoqiang Ding
>Priority: Major
>
> After I installed airflow in my test environment, it raises error on the 
> dependencies.
>  
> {code:java}
> //
> $ airflow
> Traceback (most recent call last):
> File "/opt/incubator-airflow/env/bin/airflow", line 6, in 
> exec(compile(open(__file__).read(), __file__, 'exec'))
> File "/opt/incubator-airflow/airflow/bin/airflow", line 21, in 
> from airflow import configuration
> File "/opt/incubator-airflow/airflow/__init__.py", line 37, in 
> from airflow.models import DAG
> File "/opt/incubator-airflow/airflow/models.py", line 31, in 
> import cryptography
> ImportError: No module named cryptography
> {code}
>  
>  
> And after I run "pip install cryptography" mannually, it raises error on the 
> k8s dependencies when I run "airflow init db".
>  
> {code:java}
> //
> $ airflow initdb
> [2018-06-26 10:52:08,259] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> DB: sqlite:home/openstack/airflow/airflow.db
> [2018-06-26 10:52:08,394] {db.py:338} INFO - Creating tables
> INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
> INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
> INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current 
> schema
> INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 
> 1507a7289a2f, create is_encrypted
> /opt/incubator-airflow/env/local/lib/python2.7/site-packages/alembic/util/messaging.py:69:
>  UserWarning: Skipping unsupported ALTER for creation of implicit constraint
>   warnings.warn(msg)
> INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 
> 13eb55f81627, maintain history for compatibility with earlier migrations
> INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 
> 338e90f54d61, More logging into task_isntance
> INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 
> 52d714495f0, job_id indices
> INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 
> 502898887f84, Adding extra to Log
> INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 
> 1b38cef5b76e, add dagrun
> INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 
> 2e541a1dcfed, task_duration
> INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 
> 40e67319e3a9, dagrun_config
> INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 
> 561833c1c74b, add password column to user
> INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, 
> dagrun start end
> INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, 
> Add notification_sent column to sla_miss
> INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> 
> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field 
> in connection
> INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 
> 1968acfc09e3, add is_encrypted column to variable table
> INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 
> 2e82aab8ef20, rename user table
> INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 
> 211e584da130, add TI state index
> INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 
> 64de9cddf6c9, add task fails journal table
> INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> 
> f2ca10b85618, add dag_stats table
> INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 
> 4addfa1236f1, Add fractional seconds to mysql tables
> INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 
> 8504051e801b, xcom dag task indices
> INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 
> 5e7d17757c7a, add pid field to TaskInstance
> INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 
> 127d2bf2dfa7, Add dag_id/state index on dag_run table
> INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> 
> cc1e65623dc7, add max tries column to task instance
> WARNI [airflow.utils.log.logging_mixin.Loggi

[jira] [Commented] (AIRFLOW-2668) Missing cryptogrpahy dependency on airflow initdb call

2018-06-26 Thread Nicholas Pezolano (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523594#comment-16523594
 ] 

Nicholas Pezolano commented on AIRFLOW-2668:


it looks like the cryptography package is only installed if you install the 
kubernetes or crypto subpackages, should crypto be enabled by default ?

> Missing cryptogrpahy dependency on airflow initdb call
> --
>
> Key: AIRFLOW-2668
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2668
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Nicholas Pezolano
>Priority: Minor
>
> The cryptography packages looks to be required now for `airflow initdb` calls 
> on a fresh install from master as of commit 
> 702a57ec5a96d159105c4f5ca76ddd2229eb2f44.
> $ airflow initdb
>  Traceback (most recent call last):
>  File "/home/n/git/airflow_testing/env/bin/airflow", line 6, in 
>  exec(compile(open(__file__).read(), __file__, 'exec'))
>  File "/home/n/git/incubator-airflow/airflow/bin/airflow", line 21, in 
> 
>  from airflow import configuration
>  File "/home/n/git/incubator-airflow/airflow/__init__.py", line 37, in 
> 
>  from airflow.models import DAG
>  File "/home/n/git/incubator-airflow/airflow/models.py", line 31, in 
>  import cryptography
>  ImportError: No module named cryptography
>  
> Steps to reproduce:
> {code}
> git clone https://github.com/apache/incubator-airflow
> virtualenv env
> . env/bin/activate
> pip install -e .[s3]
> airflow initdb
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2680) Don't automatically percolate skipped stae

2018-06-26 Thread Andrei-Alin Popescu (JIRA)
Andrei-Alin Popescu created AIRFLOW-2680:


 Summary: Don't automatically percolate skipped stae
 Key: AIRFLOW-2680
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2680
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Andrei-Alin Popescu


Dear Airflow Maintainers,

 

As part of our workflow, we have cases where all the upstream of a certain task 
A can be skipped. In this case, airflow seems to automatically mark A as 
skipped.

However, this does not quite work for us, since there are changes external to 
the DAG which A needs to process, regardless of whether its upstream ran or 
not. Additionally, we require A to get into an "upstream_failed" state and not 
run if any its upstream tasks failed.

I don't see a trigger rule to cover this, so what would be the best way to 
achieve this? I was thinking we could attach a DummyOperator as an upstream to 
A, which in a way marks the fact that A depends on some external data and needs 
to run anyway, but this can get really ugly for big DAGs.

I was also thinking we could have a new trigger_rule, e.g. "no_failure", which 
would only trigger tasks if no upstream has failed. It differs from 
"all_success" in that it will also trigger if all upstream has been skipped, 
rather than percolating the skipped state on.

I'd really appreciate your feedback on this, and I'd like to know if in fact 
there is already a good way of doing this with airflow that I don't know of.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-1840) Fix Celery config

2018-06-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-1840:
---
Fix Version/s: (was: 1.9.1)
   1.10.0

> Fix Celery config
> -
>
> Key: AIRFLOW-1840
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1840
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
> Fix For: 1.10.0
>
>
> While configuring the Celery executor I keep running into this problem:
> ==> /var/log/airflow/scheduler.log <==
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python2.7/dist-packages/airflow/executors/celery_executor.py",
>  line 83, in sync
> state = async.state
>   File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 394, 
> in state
> return self._get_task_meta()['status']
>   File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 339, 
> in _get_task_meta
> return self._maybe_set_cache(self.backend.get_task_meta(self.id))
>   File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 
> 307, in get_task_meta
> meta = self._get_task_meta_for(task_id)
> AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2679) GoogleCloudStorageToBigQueryOperator to support UPSERT

2018-06-26 Thread jack (JIRA)
jack created AIRFLOW-2679:
-

 Summary: GoogleCloudStorageToBigQueryOperator to support UPSERT
 Key: AIRFLOW-2679
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2679
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: jack
 Fix For: 1.10.0, 1.10


Currently the 
{color:#22}GoogleCloudStorageToBigQueryOp{color}{color:#22}erator 
support incremental load using 
*{color:#404040}max_id_key{color}*{color:#404040} {color}.{color}

 

{color:#22}However many systems actually needs "UPSERT" in terms of - if 
row exists update it, if not insert/copy it.{color}

{color:#22}Currently the operator assumes that we only need to insert new 
data, it can't handle update of data. Most of the time data is not static it 
changes with time. Yesterday order status was NEW today it's Processing, 
tomorrow it's SENT in a month it will be REFUNDED etc... {color}

 

{color:#22} {color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2678) Fix db scheme unit test to remove checking fab models

2018-06-26 Thread Tao Feng (JIRA)
Tao Feng created AIRFLOW-2678:
-

 Summary: Fix db scheme unit test to remove checking fab models
 Key: AIRFLOW-2678
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2678
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Tao Feng
Assignee: Tao Feng


Currently airflow doesn't have FAB models as well migration script for the 
models. We should ignore checking those models in the unit test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)