[jira] [Resolved] (AIRFLOW-3357) Scheduler doesn't work on example DAGs unless there's some dag file to process

2018-11-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3357.

Resolution: Duplicate

Fixed too, not just closing as dupe.

> Scheduler doesn't work on example DAGs unless there's some dag file to process
> --
>
> Key: AIRFLOW-3357
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3357
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Victor Villas Bôas Chaves
>Priority: Major
>
> On a blank Airflow install, if you manually trigger one of the example 
> DAGs, no tasks get queued or executed; they stay in state null.
> Steps to reproduce on a fresh install:
>  # In the UI, turn on example_bash_operator and trigger it manually: 6 tasks 
> go to the None state and nothing gets scheduled.
>  # Restart the scheduler (with debug logging on). Nothing gets scheduled.
>  # Create a mytutorial.py in the dag folder with the code from tutorial.py but 
> the DAG name changed (sketched below): everything starts getting scheduled.
> A debug view of the logs is here: 
> [https://gist.github.com/tomfaulhaber/4e72ed0884c9580c606e02e4b745ddff]
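A minimal sketch of step 3 (a hypothetical mytutorial.py, i.e. the tutorial's
structure with only the DAG name changed):

{code}
# mytutorial.py -- hypothetical; tutorial.py's shape with the DAG name changed
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id='mytutorial',                 # renamed from 'tutorial'
    start_date=datetime(2018, 1, 1),
    schedule_interval='@daily',
)

t1 = BashOperator(task_id='print_date', bash_command='date', dag=dag)
{code}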



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1934) Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty

2018-11-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1934.

Resolution: Duplicate

> Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty
> ---
>
> Key: AIRFLOW-1934
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1934
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: 1.8.0
> Environment: RHEL
>Reporter: raman
>Priority: Major
>
> Steps to reproduce:
> 1. Install airflow
> 2. Keep the ~/{airflow_home}/dags folder empty
> 3. airflow initdb
> 4. airflow webserver and scheduler
> 5. Enable an example DAG and trigger it manually from the web UI.
> Result: a DAG run gets created in the dag_run table and the task_instance table 
> also gets the relevant entries, but the scheduler does not pick up the DAG.
> Workaround: create one sample dag in the ~/{airflow_home}/dags folder and the 
> scheduler picks it up.
> The following code in jobs.py seems to be doing the trick, but it is 
> only triggered if there is a dag inside the ~/{airflow_home}/dags folder:
> File: jobs.py
> Function: _find_executable_task_instances
> ti_query = (
>     session
>     .query(TI)
>     .filter(TI.dag_id.in_(simple_dag_bag.dag_ids))
>     .outerjoin(DR,
>         and_(DR.dag_id == TI.dag_id,
>              DR.execution_date == TI.execution_date))
>     .filter(or_(DR.run_id == None,
>                 not_(DR.run_id.like(BackfillJob.ID_PREFIX + '%'))))
>     .outerjoin(DM, DM.dag_id == TI.dag_id)
>     .filter(or_(DM.dag_id == None,
>                 not_(DM.is_paused)))
> )
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-1561) Scheduler doesn't pick up example dags unless there is at least 1 dag in dags folder

2018-11-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-1561:
---
Fix Version/s: 1.10.2

> Scheduler doesn't pick up example dags unless there is at least 1 dag in dags 
> folder
> ---
>
> Key: AIRFLOW-1561
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1561
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.8.2
>Reporter: Sumit Maheshwari
>Assignee: Shintaro Murakami
>Priority: Major
> Fix For: 1.10.2, 2.0.0
>
> Attachments: airflow_scheduler_log.txt
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3301) Update CI test for [AIRFLOW-3132] (PR #3977)

2018-11-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3301.

Resolution: Fixed

> Update CI test for [AIRFLOW-3132] (PR #3977)
> 
>
> Key: AIRFLOW-3301
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3301
> Project: Apache Airflow
>  Issue Type: Test
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> In PR https://github.com/apache/incubator-airflow/pull/3977, the test was not 
> updated accordingly, resulting in CI failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3400) Remove python-nvd3

2018-11-26 Thread Ash Berlin-Taylor (JIRA)
Ash Berlin-Taylor created AIRFLOW-3400:
--

 Summary: Remove python-nvd3
 Key: AIRFLOW-3400
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3400
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ui
Reporter: Ash Berlin-Taylor
 Fix For: 2.0.0


We are using python-nvd3 to generate charts on the Python side, and it pulls 
in slugify, which by default uses a GPL'd dependency. This dependency chain is 
why we need to pass {{SLUGIFY_USES_TEXT_UNIDECODE=yes}} or similar, and 
that is annoying.

We should remove python-nvd3, while probably continuing to use nvd3.js and 
d3.js.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2813) `pip install apache-airflow` fails on Python 3.7

2018-11-26 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2813.

Resolution: Fixed

Airflow core itself is now installable under Python 3.7 as of 1.10.1 -- some of 
the optional deps might still have problems.

> `pip install apache-airflow` fails on Python 3.7
> 
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0, 1.10.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 1.10.1
>
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)

2018-11-20 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693125#comment-16693125
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3369:


(Sorry I might have asked this in slack too)

Does this behaviour only happen the first time the DAG is un-paused, or every 
time it is un-paused (after a long enough gap that 2+ schedule intervals have 
passed)?

> Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
> 
>
> Key: AIRFLOW-3369
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3369
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Andrew Harmon
>Priority: Major
> Attachments: image.png
>
>
> If you create a DAG with catchup=False, un-pausing it creates 2 dag runs: one 
> for the most recent schedule interval (expected) and one for the interval 
> before that (unexpected).
> *Sample DAG*
> {code:java}
> from airflow import DAG
> from datetime import datetime
> from airflow.operators.dummy_operator import DummyOperator
> dag = DAG(
> dag_id='DummyTest',
> start_date=datetime(2018,1,1),
> catchup=False
> )
> do = DummyOperator(
> task_id='dummy_task',
> dag=dag
> )
> {code}
> *Result:*
> 2 DAG runs are created: 2018-11-18 and 2018-11-17
> *Expected Result:*
> Only 1 DAG run should have been created (2018-11-18)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3354) Scheduler compares offset-naive and offset-aware dates

2018-11-19 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3354.

Resolution: Not A Bug

> Scheduler compares offset-naive and offset-aware dates
> --
>
> Key: AIRFLOW-3354
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3354
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.1
>Reporter: Jakub Powierza
>Priority: Major
>
> New version of Airflow (1.10.1rc1 and 1.10.1rc2) tries to compare 
> offset-naive and offset-aware dates in Scheduler. I've tested a simple case 
> with schedule set to "*/10 * * * *". I've tried to clean my developer 
> instance with `airflow resetdb` and start from scratch but it does not help 
> at all. This issue does not occur on stable version 1.10.0.
> My setup: Python 3.6 on Ubuntu 14.04 with Airflow Scheduler based on Celery 
> with RabbitMQ backend.
> Exception found in Scheduler logs:
> {code:java}
> 2018-11-15 14:41:23,194:ERROR:airflow.processor:[CT=None] Got an exception! 
> Propagating...
> Traceback (most recent call last):
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> Process DagFileProcessor40-Process:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
>  self.run()
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3361) Add the task_id to the Deprecation Warning when passing unsupported keywords to BaseOperator

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3361.

   Resolution: Fixed
Fix Version/s: 1.10.2
   2.0.0

> Add the task_id to the Deprecation Warning when passing unsupported keywords 
> to BaseOperator
> 
>
> Key: AIRFLOW-3361
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3361
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: logging
>Affects Versions: 1.9.0
>Reporter: Martin Black
>Assignee: Martin Black
>Priority: Trivial
> Fix For: 2.0.0, 1.10.2
>
>
> In 2.0, passing invalid keywords to {{BaseOperator}} will be deprecated. Prior 
> to that, a {{PendingDeprecationWarning}} is raised; however, it can be hard to 
> track down which specific task is raising the warning.
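A minimal sketch (a hypothetical helper, not the actual patch) of what
including the task_id in the warning could look like:

{code}
import warnings

def warn_invalid_args(task_id, invalid_args):
    # Naming the task makes the warning traceable to a specific operator.
    warnings.warn(
        'Invalid arguments were passed to BaseOperator (task_id: {}): {}'
        .format(task_id, invalid_args),
        PendingDeprecationWarning,
    )
{code}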



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-801) Outdated docstring on baseclass

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-801.
---
Resolution: Fixed

> Outdated docstring on baseclass
> ---
>
> Key: AIRFLOW-801
> URL: https://issues.apache.org/jira/browse/AIRFLOW-801
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Daniel Seisun
>Assignee: Kengo Seki
>Priority: Trivial
>
> The docstring of the BaseOperator still makes reference to it inheriting from 
> SQL Alchemy's Base class, which it no longer does. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3307) Update insecure node dependencies

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3307.

   Resolution: Fixed
Fix Version/s: 2.0.0

> Update insecure node dependencies
> -
>
> Key: AIRFLOW-3307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
> Fix For: 2.0.0
>
>
> `npm audit` shows some node dependencies that are out of date and potentially 
> insecure. We should update them with `npm audit fix`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3307) Update insecure node dependencies

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3307:
---
Issue Type: Improvement  (was: Bug)

> Update insecure node dependencies
> -
>
> Key: AIRFLOW-3307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
> Fix For: 2.0.0
>
>
> `npm audit` shows some node dependencies that are out of date and potentially 
> insecure. We should update them with `npm audit fix`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3306) Disable unused flask-sqlalchemy modification tracking

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3306.

   Resolution: Fixed
Fix Version/s: 1.10.2
   2.0.0

> Disable unused flask-sqlalchemy modification tracking
> -
>
> Key: AIRFLOW-3306
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3306
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
> Fix For: 2.0.0, 1.10.2
>
>
> By default, flask-sqlalchemy tracks model changes for its event system, which 
> adds some overhead. Since I don't think we're using the flask-sqlalchemy 
> event system, we should be able to turn off modification tracking and improve 
> performance.
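For reference, a minimal sketch of the flask-sqlalchemy setting in question
(hypothetical app wiring, not Airflow's actual setup):

{code}
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite://'
# Disable the event-system bookkeeping that tracks model modifications.
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
db = SQLAlchemy(app)
{code}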



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3361) Add the task_id to the Deprecation Warning when passing unsupported keywords to BaseOperator

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3361:
---
Issue Type: Improvement  (was: Task)

> Add the task_id to the Deprecation Warning when passing unsupported keywords 
> to BaseOperator
> 
>
> Key: AIRFLOW-3361
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3361
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: logging
>Affects Versions: 1.9.0
>Reporter: Martin Black
>Assignee: Martin Black
>Priority: Trivial
> Fix For: 2.0.0, 1.10.2
>
>
> In 2.0, passing invalid keywords to {{BaseOperator}} will be deprecated. Prior 
> to that, a {{PendingDeprecationWarning}} is raised; however, it can be hard to 
> track down which specific task is raising the warning.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3362) Template to support jinja2 native python types

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691041#comment-16691041
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3362:


Oh interesting! I think that is a fairly new feature in Jinja

> Template to support jinja2 native python types
> --
>
> Key: AIRFLOW-3362
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3362
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core, DAG
>Reporter: Duan Shiqiang
>Priority: Major
>
> Airflow latest (1.10.x) can only render templates into strings, which is fine 
> most of the time, but it would be better to support rendering into Python types.
> It can be very useful if the template system can render into native 
> Python types like list, dictionary, etc., especially when using xcom to pass 
> values between operators.
> Jinja2 supports this feature from 2.10; more info can be found here: 
> http://jinja.pocoo.org/docs/2.10/nativetypes/
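For reference, a minimal sketch of the Jinja2 2.10 feature being requested:

{code}
from jinja2.nativetypes import NativeEnvironment

env = NativeEnvironment()
result = env.from_string('{{ [1, 2] + [3] }}').render()
print(result, type(result))  # [1, 2, 3] <class 'list'>
{code}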



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-1252) Experimental API - exception when conf is present in JSON body

2018-11-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-1252:
---
Fix Version/s: 1.10.2

> Experimental API - exception when conf is present in JSON body
> --
>
> Key: AIRFLOW-1252
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1252
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api
>Affects Versions: 1.8.0, 1.8.1, 1.9.0, 1.10.0, 2.0.0
>Reporter: Sergio Herrera
>Assignee: Sergio Herrera
>Priority: Major
>  Labels: api
> Fix For: 2.0.0, 1.10.2
>
>
> When someone calls the endpoint _POST 
> :/api/experimental/dags//dag_runs {}_, Airflow never runs 
> the request if its body contains _conf_.
>  This occurs due to a type mismatch when calling the function 
> _trigger_dag()_, which is also used by the *CLI*. That function performs a 
> _json.loads(conf)_ because from the CLI the type of conf is _string_, while 
> from the *experimental API* the type is _dict_ (because the JSON body is 
> parsed beforehand to get all the data, such as execution_date).
> There are two possibilities:
>  1. Look for every use of the _trigger_dag()_ function and move the JSON 
> parsing outside the function.
>  2. In the *experimental API*, dump the conf to a string (with _json.dumps()_) 
> so that _trigger_dag()_ can transform it into a _dict_ (see the sketch after 
> the examples below).
> I have implemented the second option, so I can open a PR with it if you want.
> Thank you a lot.
> EDIT: Also, there are currently no tests that use conf in the JSON passed 
> through the request.
> Examples:
>  - Before fix (escaped json):
> {noformat}
> POST /api/experimental/dags/test_conf/dag_runs HTTP/1.1
> Content-Type: application/json
> {
>   "conf": "{
> \"k1\": \"v1\",
> \"k2\": \"v2\",
> \"k3\": [\"av1\", \"av2\", \"av3\"],
> \"k4\": {
>   \"sk1\": \"sv1\",
>   \"sk2\": \"sv2\"
> }
>   }"  
> }
> {noformat}
>  - After fix (pure json):
> {noformat}
> POST /api/experimental/dags/test_conf/dag_runs HTTP/1.1
> Content-Type: application/json
> {
>   "conf": {
> "k1": "v1",
> "k2": "v2",
> "k3": ["av1", "av2", "av3"],
> "k4": {
>   "sk1": "sv1",
>   "sk2": "sv2"
> }
>   }
> }
> {noformat}
>  
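A minimal sketch of option 2 (a hypothetical helper, not the actual patch):
dump the parsed dict back to a string so that _trigger_dag()_ can
_json.loads()_ it:

{code}
import json

def normalize_conf(conf):
    """Hypothetical helper: trigger_dag() does json.loads(conf), so a dict
    coming from the experimental API must be dumped back to a string."""
    if isinstance(conf, dict):
        return json.dumps(conf)
    return conf

print(normalize_conf({'k1': 'v1'}))  # '{"k1": "v1"}'
{code}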



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3251) KubernetesPodOperator does not use 'image_pull_secrets' argument

2018-11-16 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3251.

   Resolution: Fixed
Fix Version/s: 1.10.2
   2.0.0

> KubernetesPodOperator does not use 'image_pull_secrets' argument
> 
>
> Key: AIRFLOW-3251
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3251
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Padarn Wilson
>Assignee: Padarn Wilson
>Priority: Minor
> Fix For: 2.0.0, 1.10.2
>
>
> The KubernetesPodOperator accepts the argument `image_pull_secrets`, and the Pod 
> object can use this variable when deploying the pod, but currently the 
> argument to the operator is not added to the Pod before it is launched.
> Relevant code in 
> `incubator-airflow/airflow/contrib/operators/kubernetes_pod_operator.py`. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3266) AWS Athena Operator in Airflow

2018-11-16 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3266.

   Resolution: Fixed
Fix Version/s: 2.0.0

> AWS Athena Operator in Airflow
> --
>
> Key: AIRFLOW-3266
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3266
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: aws
>Affects Versions: 1.10.0
>Reporter: Sai Phanindhra
>Assignee: Sai Phanindhra
>Priority: Minor
> Fix For: 2.0.0
>
>
> There is currently no official Athena operator in Airflow. One either has to 
> use boto3 in a PythonOperator or the AWS CLI in a BashOperator, and neither 
> takes care of the full life cycle of the query. Create an Athena operator and 
> hook to submit a Presto query and update the task based on the state of the 
> submitted query (a sketch of the lifecycle follows below).
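A minimal sketch (hypothetical, using boto3 directly; the bucket name is
assumed) of the submit-then-poll lifecycle such an operator and hook would
manage:

{code}
import time

import boto3

client = boto3.client('athena')
qid = client.start_query_execution(
    QueryString='SELECT 1',
    ResultConfiguration={'OutputLocation': 's3://my-bucket/athena-results/'},
)['QueryExecutionId']

# Poll until the query reaches a terminal state.
while True:
    status = client.get_query_execution(QueryExecutionId=qid)
    state = status['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(5)
{code}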



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3354) Scheduler compares offset-naive and offset-aware dates

2018-11-16 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689397#comment-16689397
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3354:


{{start_date}} controls when a task is valid to be scheduled from - specifying 
it as {{now()}} is a bug and not something an operator should do.

What were you trying to achieve by specifying that in the operator? Bear in 
mind that Airflow parses your dag each and every time it runs tasks (and more 
often too) so this value is constantly changing.
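A minimal sketch (hypothetical DAG) of the pattern being recommended: a fixed,
static start_date, never now():

{code}
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG(
    dag_id='fixed_start_date_example',   # hypothetical dag_id
    start_date=datetime(2018, 1, 1),     # static; not datetime.now()
    schedule_interval='*/10 * * * *',
)

task = DummyOperator(task_id='dummy_task', dag=dag)
{code}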

> Scheduler compares offset-naive and offset-aware dates
> --
>
> Key: AIRFLOW-3354
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3354
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.1
>Reporter: Jakub Powierza
>Priority: Major
>
> New version of Airflow (1.10.1rc1 and 1.10.1rc2) tries to compare 
> offset-naive and offset-aware dates in Scheduler. I've tested a simple case 
> with schedule set to "*/10 * * * *". I've tried to clean my developer 
> instance with `airflow resetdb` and start from scratch but it does not help 
> at all. This issue does not occur on stable version 1.10.0.
> My setup: Python 3.6 on Ubuntu 14.04 with Airflow Scheduler based on Celery 
> with RabbitMQ backend.
> Exception found in Scheduler logs:
> {code:java}
> 2018-11-15 14:41:23,194:ERROR:airflow.processor:[CT=None] Got an exception! 
> Propagating...
> Traceback (most recent call last):
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> Process DagFileProcessor40-Process:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
>  self.run()
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> 

[jira] [Updated] (AIRFLOW-251) Add optional parameter SQL_ALCHEMY_SCHEMA to control schema for metadata repository

2018-11-16 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-251:
--
Affects Version/s: (was: 2.0.0)

> Add optional parameter SQL_ALCHEMY_SCHEMA to control schema for metadata 
> repository
> ---
>
> Key: AIRFLOW-251
> URL: https://issues.apache.org/jira/browse/AIRFLOW-251
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Reporter: Ed Parcell
>Assignee: Iuliia Volkova
>Priority: Minor
> Fix For: 2.0.0
>
>
> Using SQL Server as a database for metadata, it is preferable to group all 
> Airflow tables into a separate schema, rather than using dbo. I propose 
> adding an optional parameter SQL_ALCHEMY_SCHEMA to control this.
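A minimal sketch of what such a parameter would control on the SQLAlchemy side
(the schema name is illustrative):

{code}
from sqlalchemy import Column, Integer, MetaData, Table

# Binding the metadata to a schema creates tables as airflow.<name>
# instead of the SQL Server default dbo.<name>.
metadata = MetaData(schema='airflow')
example = Table('example', metadata, Column('id', Integer, primary_key=True))
{code}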



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-251) Add optional parameter SQL_ALCHEMY_SCHEMA to control schema for metadata repository

2018-11-16 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-251.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Add optional parameter SQL_ALCHEMY_SCHEMA to control schema for metadata 
> repository
> ---
>
> Key: AIRFLOW-251
> URL: https://issues.apache.org/jira/browse/AIRFLOW-251
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Reporter: Ed Parcell
>Assignee: Iuliia Volkova
>Priority: Minor
> Fix For: 2.0.0
>
>
> Using SQL Server as a database for metadata, it is preferable to group all 
> Airflow tables into a separate schema, rather than using dbo. I propose 
> adding an optional parameter SQL_ALCHEMY_SCHEMA to control this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3353) redis-py 3.0.0 dependency breaks celery executor

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3353.

   Resolution: Fixed
Fix Version/s: 2.0.0

> redis-py 3.0.0 dependency breaks celery executor
> 
>
> Key: AIRFLOW-3353
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3353
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Stefan Seelmann
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 2.0.0
>
>
> redis-py 3.0.0 was just released. Airflow 1.10.0 defines redis>=2.10.5 so 
> installs redis-py 3.0.0 now.
> Error in worker below.
> Workaround: Pin redis==2.10.6 (e.g. in constraints.txt)
> {code}
> [2018-11-15 12:06:18,441: CRITICAL/MainProcess] Unrecoverable error: 
> AttributeError("'float' object has no attribute 'items'",)
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 
> 205, in start
> self.blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 369, in start
> return self.obj.start()
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 317, in start
> blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 593, in start
> c.loop(*c.loop_args())
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
> next(loop)
>   File "/usr/local/lib/python3.6/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
> cb(*cbargs)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 1040, in on_readable
> self.cycle.on_readable(fileno)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 337, in on_readable
> chan.handlers[type]()
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 724, in _brpop_read
> self.connection._deliver(loads(bytes_to_str(item)), dest)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 983, in _deliver
> callback(message)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 632, in _callback
> self.qos.append(message, message.delivery_tag)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 149, in append
> pipe.zadd(self.unacked_index_key, time(), delivery_tag) \
>   File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 2263, 
> in zadd
> for pair in iteritems(mapping):
>   File "/usr/local/lib/python3.6/site-packages/redis/_compat.py", line 123, 
> in iteritems
> return iter(x.items())
> AttributeError: 'float' object has no attribute 'items'
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-3353) redis-py 3.0.0 dependency breaks celery executor

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3353 started by Ash Berlin-Taylor.
--
> redis-py 3.0.0 dependency breaks celery executor
> 
>
> Key: AIRFLOW-3353
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3353
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Stefan Seelmann
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 2.0.0
>
>
> redis-py 3.0.0 was just released. Airflow 1.10.0 defines redis>=2.10.5 so 
> installs redis-py 3.0.0 now.
> Error in worker below.
> Workaround: Pin redis==2.10.6 (e.g. in constraints.txt)
> {code}
> [2018-11-15 12:06:18,441: CRITICAL/MainProcess] Unrecoverable error: 
> AttributeError("'float' object has no attribute 'items'",)
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 
> 205, in start
> self.blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 369, in start
> return self.obj.start()
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 317, in start
> blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 593, in start
> c.loop(*c.loop_args())
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
> next(loop)
>   File "/usr/local/lib/python3.6/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
> cb(*cbargs)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 1040, in on_readable
> self.cycle.on_readable(fileno)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 337, in on_readable
> chan.handlers[type]()
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 724, in _brpop_read
> self.connection._deliver(loads(bytes_to_str(item)), dest)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 983, in _deliver
> callback(message)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 632, in _callback
> self.qos.append(message, message.delivery_tag)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 149, in append
> pipe.zadd(self.unacked_index_key, time(), delivery_tag) \
>   File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 2263, 
> in zadd
> for pair in iteritems(mapping):
>   File "/usr/local/lib/python3.6/site-packages/redis/_compat.py", line 123, 
> in iteritems
> return iter(x.items())
> AttributeError: 'float' object has no attribute 'items'
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3354) Scheduler compares offset-naive and offset-aware dates

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688250#comment-16688250
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3354:


It might be enough if you provide all the ways that `start_date` is set in your 
DAG/tasks.

> Scheduler compares offset-naive and offset-aware dates
> --
>
> Key: AIRFLOW-3354
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3354
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.1
>Reporter: Jakub Powierza
>Priority: Major
>
> New version of Airflow (1.10.1rc1 and 1.10.1rc2) tries to compare 
> offset-naive and offset-aware dates in Scheduler. I've tested a simple case 
> with schedule set to "*/10 * * * *". I've tried to clean my developer 
> instance with `airflow resetdb` and start from scratch but it does not help 
> at all. This issue does not occur on stable version 1.10.0.
> My setup: Python 3.6 on Ubuntu 14.04 with Airflow Scheduler based on Celery 
> with RabbitMQ backend.
> Exception found in Scheduler logs:
> {code:java}
> 2018-11-15 14:41:23,194:ERROR:airflow.processor:[CT=None] Got an exception! 
> Propagating...
> Traceback (most recent call last):
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> Process DagFileProcessor40-Process:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
>  self.run()
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3354) Scheduler compares offset-naive and offset-aware dates

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688185#comment-16688185
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3354:


Can you provide the full dag definition?

> Scheduler compares offset-naive and offset-aware dates
> --
>
> Key: AIRFLOW-3354
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3354
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.1
>Reporter: Jakub Powierza
>Priority: Major
>
> New version of Airflow (1.10.1rc1 and 1.10.1rc2) tries to compare 
> offset-naive and offset-aware dates in Scheduler. I've tested a simple case 
> with schedule set to "*/10 * * * *". I've tried to clean my developer 
> instance with `airflow resetdb` and start from scratch but it does not help 
> at all. This issue does not occur on stable version 1.10.0.
> My setup: Python 3.6 on Ubuntu 14.04 with Airflow Scheduler based on Celery 
> with RabbitMQ backend.
> Exception found in Scheduler logs:
> {code:java}
> 2018-11-15 14:41:23,194:ERROR:airflow.processor:[CT=None] Got an exception! 
> Propagating...
> Traceback (most recent call last):
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> Process DagFileProcessor40-Process:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
>  self.run()
>  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 389, in helper
>  pickle_dags)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1846, in process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 1426, in _process_dags
>  dag_run = self.create_dag_run(dag)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/jobs.py", 
> line 909, in create_dag_run
>  external_trigger=False
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 74, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 4270, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/utils/db.py",
>  line 70, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/jpowierz/my_project/venv/lib/python3.6/site-packages/airflow/models.py",
>  line 5215, in verify_integrity
>  if task.start_date > self.execution_date and not self.is_backfill:
> TypeError: can't compare offset-naive and offset-aware datetimes
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3353) redis-py 3.0.0 dependency breaks celery executor

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688102#comment-16688102
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3353:


This is maybe a problem in celery/kombu, and not Airflow itself?
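For context, a minimal sketch of the redis-py API change behind the kombu
traceback quoted below (assuming a local Redis; the key and member names are
illustrative):

{code}
import redis

r = redis.Redis()
# redis-py 2.x: zadd(name, score1, value1, ...) -- the form kombu 4.x calls.
# r.zadd('unacked_index', 1542280000.0, 'tag-1')
# redis-py 3.x: zadd(name, mapping) -- a dict of {member: score}, hence the
# "'float' object has no attribute 'items'" error when a score is passed
# positionally.
r.zadd('unacked_index', {'tag-1': 1542280000.0})
{code}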

> redis-py 3.0.0 dependency breaks celery executor
> 
>
> Key: AIRFLOW-3353
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3353
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Stefan Seelmann
>Priority: Major
>
> redis-py 3.0.0 was just released. Airflow 1.10.0 defines redis>=2.10.5 so 
> installs redis-py 3.0.0 now.
> Error in worker below.
> Workaround: Pin redis==2.10.6 (e.g. in constraints.txt)
> {code}
> [2018-11-15 12:06:18,441: CRITICAL/MainProcess] Unrecoverable error: 
> AttributeError("'float' object has no attribute 'items'",)
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 
> 205, in start
> self.blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 369, in start
> return self.obj.start()
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 317, in start
> blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 593, in start
> c.loop(*c.loop_args())
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
> next(loop)
>   File "/usr/local/lib/python3.6/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
> cb(*cbargs)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 1040, in on_readable
> self.cycle.on_readable(fileno)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 337, in on_readable
> chan.handlers[type]()
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 724, in _brpop_read
> self.connection._deliver(loads(bytes_to_str(item)), dest)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 983, in _deliver
> callback(message)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 632, in _callback
> self.qos.append(message, message.delivery_tag)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 149, in append
> pipe.zadd(self.unacked_index_key, time(), delivery_tag) \
>   File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 2263, 
> in zadd
> for pair in iteritems(mapping):
>   File "/usr/local/lib/python3.6/site-packages/redis/_compat.py", line 123, 
> in iteritems
> return iter(x.items())
> AttributeError: 'float' object has no attribute 'items'
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3353) redis-py 3.0.0 dependency breaks celery executor

2018-11-15 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688097#comment-16688097
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3353:


Could you check if redis==3.0.0post1 also has this problem?

> redis-py 3.0.0 dependency breaks celery executor
> 
>
> Key: AIRFLOW-3353
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3353
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Stefan Seelmann
>Priority: Major
>
> redis-py 3.0.0 was just released. Airflow 1.10.0 defines redis>=2.10.5 so 
> installs redis-py 3.0.0 now.
> Error in worker below.
> Workaround: Pin redis==2.10.6 (e.g. in constraints.txt)
> {code}
> [2018-11-15 12:06:18,441: CRITICAL/MainProcess] Unrecoverable error: 
> AttributeError("'float' object has no attribute 'items'",)
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 
> 205, in start
> self.blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 369, in start
> return self.obj.start()
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 317, in start
> blueprint.start(self)
>   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 
> 119, in start
> step.start(parent)
>   File 
> "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", 
> line 593, in start
> c.loop(*c.loop_args())
>   File "/usr/local/lib/python3.6/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
> next(loop)
>   File "/usr/local/lib/python3.6/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
> cb(*cbargs)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 1040, in on_readable
> self.cycle.on_readable(fileno)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 337, in on_readable
> chan.handlers[type]()
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 724, in _brpop_read
> self.connection._deliver(loads(bytes_to_str(item)), dest)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 983, in _deliver
> callback(message)
>   File 
> "/usr/local/lib/python3.6/site-packages/kombu/transport/virtual/base.py", 
> line 632, in _callback
> self.qos.append(message, message.delivery_tag)
>   File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", 
> line 149, in append
> pipe.zadd(self.unacked_index_key, time(), delivery_tag) \
>   File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 2263, 
> in zadd
> for pair in iteritems(mapping):
>   File "/usr/local/lib/python3.6/site-packages/redis/_compat.py", line 123, 
> in iteritems
> return iter(x.items())
> AttributeError: 'float' object has no attribute 'items'
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3343) DockerOperator always reports containers as failed

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3343.

Resolution: Fixed

> DockerOperator always reports containers as failed
> --
>
> Key: AIRFLOW-3343
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3343
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.1
>Reporter: Ash Berlin-Taylor
>Priority: Major
> Fix For: 1.10.1
>
>
> docker-py 3.0.0 has some API-breaking changes that didn't get picked up 
> because our mocking was still returning the old API. From 
> https://docker-py.readthedocs.io/en/stable/change-log.html#breaking-changes:
> > APIClient.wait and Container.wait now return a dict representing the API’s 
> > response instead of returning the status code directly.
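A minimal sketch of that change (assuming a local Docker daemon; the image and
command are illustrative):

{code}
import docker

client = docker.from_env()
container = client.containers.run('alpine', 'true', detach=True)

result = container.wait()           # docker-py >= 3.0: returns a dict
status_code = result['StatusCode']  # previously wait() returned this int directly
{code}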



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3339) Timezone error when start_date in default_args

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3339:
---
Summary: Timezone error when start_date in default_args  (was: Fix timezone 
error)

> Timezone error when start_date in default_args
> --
>
> Key: AIRFLOW-3339
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3339
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Ikar Pohorsky
>Assignee: Ash Berlin-Taylor
>Priority: Blocker
> Fix For: 1.10.1
>
>
> From slack:
> nicor88 8:34 AM
> {quote}
> Hey all, we are having issue with our scheduler, in the latest release 1.10.1 
> due to this error:
> {noformat}
> Process DagFileProcessor94249-Process:
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in 
> _bootstrap
>  self.run()
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 389, in 
> helper
>  pickle_dags)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1846, in 
> process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1426, in 
> _process_dags
>  dag_run = self.create_dag_run(dag)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 835, in 
> create_dag_run
>  next_start = dag.following_schedule(now)
>  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 3396, 
> in following_schedule
>  tz = pendulum.timezone(self.timezone.name)
> AttributeError: 'NoneType' object has no attribute 'name'
> {noformat}
> We have a timezone setup in the config
> {quote}
> eamon [Today at 9:41 AM]
> {quote}
> @nicor88  I saw that in the logs also, fwiw it seemed to fail at the first 
> dag run but then subsequent attempts seemed to work.
> {quote}
> 
> nicor88 [3 hours ago]
> {quote}
> I tried but the daily jobs are stacked
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> so nothing getting scheduled? This is the behaviour I've observed also. When 
> I saw that particular error, it seemed to correct itself so I discounted that 
> as the root cause but could be it.
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> maybe try a task without a timezone.
> {quote}
> 
> porn [1 hour ago]
> {quote}
> dammit, same problem here ({{1.10.1-rc1}})
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we have timezone setup as utc in the config, and the start_date is setup 
> like: datetime(2018, 7, 23)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we reverted to the stable release 1.10.0
> {quote}
> 
> porn [1 hour ago]
> {quote}
> {{default_timezone = utc}} in config too,
> the DAG parameters:
> {noformat}
> schedule_interval = "0 14 * * *",
> start_date = datetime(2017, 8, 29, hour=1)
> {noformat}
> {quote}
> 
> porn [1 hour ago]
> {quote}
> I need to add that this happened to me only when I tried to manually execute 
> (from the UI) a task that is periodic but hasn't run yet. Can you @nicor88 
> confirm this was your case too?
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> in our case the daily jobs were not scheduled at all
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> that’s why I got suspicious and checked (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we install Airflow from Github directly, pointing to the stable branch. Now 
> we recovered pointing to a specific release
> {quote}
> 
> porn [1 hour ago]
> {quote}
> it is a pre-release actually
> {quote}
> nicor88 [1 hour ago]
> {quote}
> it’s my bad that we use the v1-10-stable branch from Github
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> instead of using the tagged release (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> lesson learned
> {quote}
> 
> porn [14 minutes ago]
> {quote}
> ok, just downgraded to {{v1.10.1b1}} and got the same error
> {quote}
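> A minimal workaround sketch: a timezone-aware start_date avoids the unset DAG
> timezone (pendulum is already an Airflow dependency):
> {code:python}
> import pendulum
> from datetime import datetime
>
> tz = pendulum.timezone("UTC")
> default_args = {
>     # an aware datetime lets the DAG derive its timezone instead of None
>     "start_date": datetime(2018, 7, 23, tzinfo=tz),
> }
> {code}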



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3339) Fix timezone error

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3339.

   Resolution: Fixed
Fix Version/s: 1.10.1

> Fix timezone error
> --
>
> Key: AIRFLOW-3339
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3339
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Ikar Pohorsky
>Assignee: Ash Berlin-Taylor
>Priority: Blocker
> Fix For: 1.10.1
>
>
> From slack:
> nicor88 8:34 AM
> {quote}
> Hey all, we are having an issue with our scheduler in the latest release 1.10.1 
> due to this error:
> {noformat}
> Process DagFileProcessor94249-Process:
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in 
> _bootstrap
>  self.run()
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 389, in 
> helper
>  pickle_dags)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1846, in 
> process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1426, in 
> _process_dags
>  dag_run = self.create_dag_run(dag)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 835, in 
> create_dag_run
>  next_start = dag.following_schedule(now)
>  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 3396, 
> in following_schedule
>  tz = pendulum.timezone(self.timezone.name)
> AttributeError: 'NoneType' object has no attribute 'name'
> {noformat}
> We have a timezone setup in the config
> {quote}
> eamon [Today at 9:41 AM]
> {quote}
> @nicor88  I saw that in the logs also, fwiw it seemed to fail at the first 
> dag run but then subsequent attempts seemed to work.
> {quote}
> 
> nicor88 [3 hours ago]
> {quote}
> I tried but the daily jobs are stacked
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> so nothing getting scheduled? This is the behaviour I've observed also. When 
> I saw that particular error, it seemed to correct itself so I discounted that 
> as the root cause but could be it.
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> maybe try a task without a timezone.
> {quote}
> 
> porn [1 hour ago]
> {quote}
> dammit, same problem here ({{1.10.1-rc1}})
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we have timezone setup as utc in the config, and the start_date is setup 
> like: datetime(2018, 7, 23)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we reverted to the stable release 1.10.0
> {quote}
> 
> porn [1 hour ago]
> {quote}
> {{default_timezone = utc}} in config too,
> the DAG parameters:
> {noformat}
> schedule_interval = "0 14 * * *",
> start_date = datetime(2017, 8, 29, hour=1)
> {noformat}
> {quote}
> 
> porn [1 hour ago]
> {quote}
> I need to add that this happened to me only when I tried to manually execute 
> (from the UI) a task that is periodic but hasn't run yet. Can you @nicor88 
> confirm this was your case too?
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> in our case the daily jobs were not scheduled at all
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> that’s why I got suspicious and checked (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we install Airflow from Github directly, pointing to the stable branch. Now 
> we recovered pointing to a specific release
> {quote}
> 
> porn [1 hour ago]
> {quote}
> it is a pre-release actually
> {quote}
> nicor88 [1 hour ago]
> {quote}
> it’s my bad that we use the v1-10-stable branch from Github
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> instead of using the tagged release (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> lesson learned
> {quote}
> 
> porn [14 minutes ago]
> {quote}
> ok, just downgraded to {{v1.10.1b1}} and got the same error
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1262) Allow configuration of email alert subject and body

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687134#comment-16687134
 ] 

Ash Berlin-Taylor commented on AIRFLOW-1262:


The PR for this was merged, but this Jira shouldn't be closed until there are 
some docs added.

> Allow configuration of email alert subject and body
> ---
>
> Key: AIRFLOW-1262
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1262
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: configuration
>Reporter: Alek Storm
>Assignee: Alek Storm
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3342) Homepage UI broke ("Recent Tasks" column)

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686919#comment-16686919
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3342:


Ah, duplicate of AIRFLOW-3325 then. Already fixed in the RC (but that's not on 
PyPI, I know).

> Homepage UI broke ("Recent Tasks" column)
> -
>
> Key: AIRFLOW-3342
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3342
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.1
>Reporter: Christophe
>Priority: Minor
> Attachments: Capture d’écran 2018-11-14 à 17.24.33.png
>
>
> The column "Recent tasks" is broken with the addition of the Delete Dag 
> button.
> !Capture d’écran 2018-11-14 à 17.24.33.png|width=1176,height=380!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3342) Homepage UI broke ("Recent Tasks" column)

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686902#comment-16686902
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3342:


Oh yes, sorry! What version is this? 1.10.1b1 or 1.10.1rc1?

> Homepage UI broke ("Recent Tasks" column)
> -
>
> Key: AIRFLOW-3342
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3342
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.1
>Reporter: Christophe
>Priority: Minor
> Attachments: Capture d’écran 2018-11-14 à 17.24.33.png
>
>
> The column "Recent tasks" is broken with the addition of the Delete Dag 
> button.
> !Capture d’écran 2018-11-14 à 17.24.33.png|width=1176,height=380!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3342) Homepage UI broke ("Recent Tasks" column)

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686876#comment-16686876
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3342:


Can you point out what is broken, as I'm not seeing it?

> Homepage UI broke ("Recent Tasks" column)
> -
>
> Key: AIRFLOW-3342
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3342
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.1
>Reporter: Christophe
>Priority: Minor
> Attachments: Capture d’écran 2018-11-14 à 17.24.33.png
>
>
> The column "Recent tasks" is broken with the addition of the Delete Dag 
> button.
> !Capture d’écran 2018-11-14 à 17.24.33.png|width=1176,height=380!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3339) Fix timezone error

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor reassigned AIRFLOW-3339:
--

Assignee: Ash Berlin-Taylor

> Fix timezone error
> --
>
> Key: AIRFLOW-3339
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3339
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Ikar Pohorsky
>Assignee: Ash Berlin-Taylor
>Priority: Blocker
>
> From slack:
> nicor88 8:34 AM
> {quote}
> Hey all, we are having an issue with our scheduler in the latest release 1.10.1 
> due to this error:
> {noformat}
> Process DagFileProcessor94249-Process:
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in 
> _bootstrap
>  self.run()
>  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
>  self._target(*self._args, **self._kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 389, in 
> helper
>  pickle_dags)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1846, in 
> process_file
>  self._process_dags(dagbag, dags, ti_keys_to_schedule)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1426, in 
> _process_dags
>  dag_run = self.create_dag_run(dag)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 835, in 
> create_dag_run
>  next_start = dag.following_schedule(now)
>  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 3396, 
> in following_schedule
>  tz = pendulum.timezone(self.timezone.name)
> AttributeError: 'NoneType' object has no attribute 'name'
> {noformat}
> We have a timezone setup in the config
> {quote}
> eamon [Today at 9:41 AM]
> {quote}
> @nicor88  I saw that in the logs also, fwiw it seemed to fail at the first 
> dag run but then subsequent attempts seemed to work.
> {quote}
> 
> nicor88 [3 hours ago]
> {quote}
> I tried but the daily jobs are stacked
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> so nothing getting scheduled? This is the behaviour I've observed also. When 
> I saw that particular error, it seemed to correct itself so I discounted that 
> as the root cause but could be it.
> {quote}
> 
> eamon [2 hours ago]
> {quote}
> maybe try a task without a timezone.
> {quote}
> 
> porn [1 hour ago]
> {quote}
> dammit, same problem here ({{1.10.1-rc1}})
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we have timezone setup as utc in the config, and the start_date is setup 
> like: datetime(2018, 7, 23)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we reverted to the stable release 1.10.0
> {quote}
> 
> porn [1 hour ago]
> {quote}
> {{default_timezone = utc}} in config too,
> the DAG parameters:
> {noformat}
> schedule_interval = "0 14 * * *",
> start_date = datetime(2017, 8, 29, hour=1)
> {noformat}
> {quote}
> 
> porn [1 hour ago]
> {quote}
> I need to add that this happened to me only when I tried to manually execute 
> (from the UI) a task that is periodic but hasn't run yet. Can you @nicor88 
> confirm this was your case too?
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> in our case the daily jobs were not scheduled at all
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> that’s why I got suspicious and checked (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> we install Airflow from Github directly, pointing to the stable branch. Now 
> we recovered pointing to a specific release
> {quote}
> 
> porn [1 hour ago]
> {quote}
> it is a pre-release actually
> {quote}
> nicor88 [1 hour ago]
> {quote}
> it’s my bad that we use the v1-10-stable branch from Github
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> instead of using the tagged release (edited)
> {quote}
> 
> nicor88 [1 hour ago]
> {quote}
> lesson learned
> {quote}
> 
> porn [14 minutes ago]
> {quote}
> ok, just downgraded to {{v1.10.1b1}} and got the same error
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1973) Scheduler log dirs permission issue since airflow 1.9.0

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686392#comment-16686392
 ] 

Ash Berlin-Taylor commented on AIRFLOW-1973:


The fix I would suggest is to not run the webserver as root - having a root 
process listening on a public port sounds like a disaster waiting to happen!

If you want airflow available on port 80 then I'd suggest putting nginx/apache 
httpd in front of it, or using something like the capabilities feature in Linux 
to allow {{airflow}} to bind to port 80 https://superuser.com/a/892391

> Scheduler log dirs permission issue since airflow 1.9.0
> ---
>
> Key: AIRFLOW-1973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1973
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler, webserver
>Affects Versions: 1.9.0
>Reporter: Ikar Pohorsky
>Priority: Major
>
> I'm having a weird (cannot reliably reproduce) issue with scheduler log 
> permissions since version 1.9.0.
> *Problem description:*
> Sometimes (not always) a new scheduler log dir is created with root as the 
> owner (and group), which leaves the scheduler unable to write to the logs, 
> and therefore no task is scheduled and no task is 
> executed.
> Example:
> {code}
> ubuntu@airflow-admin-master:~$ ls -l airflow/logs/scheduler/
> total 24
> drwxr-xr-x 3 ubuntu ubuntu 4096 Jan  3 13:46 2018-01-03
> drwxr-xr-x 3 ubuntu ubuntu 4096 Jan  4 07:16 2018-01-04
> drwxr-xr-x 4 ubuntu ubuntu 4096 Jan  5 12:08 2018-01-05
> drwxr-xr-x 2 root   root   4096 Jan  6 00:00 2018-01-06  # fail
> drwxr-xr-x 4 ubuntu ubuntu 4096 Jan  7 00:00 2018-01-07
> drwxr-xr-x 4 ubuntu ubuntu 4096 Jan  8 00:00 2018-01-08
> lrwxrwxrwx 1 ubuntu ubuntu   46 Jan  8 00:00 latest -> 
> /home/ubuntu/airflow/logs/scheduler/2018-01-08
> {code}
> the other machine:
> {code}
> ubuntu@airflow-client-master:~$ ls -l airflow/logs/scheduler/
> total 24
> drwxr-xr-x 3 ubuntu ubuntu 4096 Jan  3 00:00 2018-01-03
> drwxr-xr-x 3 ubuntu ubuntu 4096 Jan  4 00:00 2018-01-04
> drwxr-xr-x 3 ubuntu ubuntu 4096 Jan  5 12:25 2018-01-05
> drwxr-xr-x 2 ubuntu ubuntu 4096 Jan  6 00:00 2018-01-06
> drwxr-xr-x 2 root   root   4096 Jan  7 00:00 2018-01-07  # fail
> drwxr-xr-x 2 ubuntu ubuntu 4096 Jan  8 00:00 2018-01-08
> lrwxrwxrwx 1 ubuntu ubuntu   46 Jan  8 00:00 latest -> 
> /home/ubuntu/airflow/logs/scheduler/2018-01-08
> {code}
> *My setup:*
> - _scheduler_ runs on EC2 machine under {{ubuntu:ubuntu}} from systemd service
> - _webapp_ runs on the same machine under {{root:root}} in order to be able 
> to serve on port :80
> - _workers_ run on different machines - actually this is irrelevant info;)
> Also there are two different "environments" - one for "client" and one for 
> "admin". The admin one is a fresh installation of airflow 1.9.0. The client 
> one is an upgrade from 1.8.2 to 1.9.0. Both encounter the permissions 
> problems.
> I'm not aware of these problems with 1.8.2 version.
> Workaround:
> let's get dirty: add the following row to root's crontab:
> {code}
> 1 0 * * * chown ubuntu:ubuntu -Rc /home/airflow/logs/scheduler
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3337) "About" page version info is not available

2018-11-14 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3337.

Resolution: Duplicate

> "About" page version info is not available
> --
>
> Key: AIRFLOW-3337
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3337
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Dmytro Kulyk
>Priority: Minor
> Attachments: image-2018-11-14-01-00-58-743.png
>
>
> From the Airflow 1.10.0 UI, click About and the resulting page shows version 
> and git version as "Not available".
> The version was upgraded from 1.9 via 
> {code}
> pip install apache-airflow==1.10.0
> {code}
>   !image-2018-11-14-01-00-58-743.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3323) Support Basic Authentication for Flower

2018-11-13 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3323.

   Resolution: Fixed
Fix Version/s: 2.0.0

> Support Basic Authentication for Flower
> ---
>
> Key: AIRFLOW-3323
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3323
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> The current `airflow flower` doesn't come with any authentication. This may 
> leave essential information exposed in an untrusted environment.
> Currently Flower itself supports
>  * HTTP Basic Authentication
>  * Google OAuth 2.0
>  * GitHub OAuth
> Given Flower is not really the most essential component of Airflow, we don't 
> have to support all its authentication methods, but it may be good to at least 
> support Basic Authentication.
>  
> This ticket adds support for Basic Authentication to `airflow flower`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3334) Eliminate need for "Troubleshooting: Jinja template not found... Add a space after the script name when directly calling a Bash script with the bash_command argument.

2018-11-13 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685065#comment-16685065
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3334:


Oh yes I can see that being annoying.

I wonder if we can "simply" handle this by catching the error and not trying to 
template the script in that case. I wonder if this would break anything.

> Eliminate need for "Troubleshooting: Jinja template not found... Add a space 
> after the script name when directly calling a Bash script with the 
> bash_command argument. 
> ---
>
> Key: AIRFLOW-3334
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3334
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Steven Ramey
>Priority: Trivial
>
> I've been training some colleagues at my work on Airflow. A problematic and 
> silly thing to go over in a tutorial is the known error: "Jinja 
> template not found."
> https://airflow.apache.org/howto/operator.html?highlight=operator#jinja-template-not-found
> I have no idea how much work goes into it, but would it be possible to 
> eliminate this quirk of Airflow where a trailing space after the script name 
> is necessary, so the BashOperator simply reads the string correctly, space or 
> not? Thanks.
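> A minimal sketch of the workaround being discussed, assuming a script on the
> worker's filesystem (the path is illustrative):
> {code:python}
> from airflow.operators.bash_operator import BashOperator
>
> # Without the trailing space, Jinja treats the string as the path of a
> # template file and raises "Jinja template not found".
> run_script = BashOperator(
>     task_id="run_script",
>     bash_command="/home/airflow/scripts/run.sh ",  # note the trailing space
> )
> {code}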



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2908) Kubernetes executor tasks cannot be retried

2018-11-12 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2908:
---
Fix Version/s: 1.10.2

> Kubernetes executor tasks cannot be retried
> ---
>
> Key: AIRFLOW-2908
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2908
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Reporter: Jon Davies
>Assignee: Jon Davies
>Priority: Major
> Fix For: 2.0.0, 1.10.2
>
>
> Hello,
> I'm using Airflow with the Kubernetes executor and pod operator, I cannot 
> retry a failed task in the DAG as Airflow reports "This can only be done with 
> the Celery executor".
> Airflow should be able to spin up the container in question just fine and 
> roll from there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2442) Airflow run command leaves database connections open

2018-11-12 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2442.

   Resolution: Fixed
Fix Version/s: (was: 2.0.0)
   1.10.0

> Airflow run command leaves database connections open
> 
>
> Key: AIRFLOW-2442
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2442
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli
>Affects Versions: 1.8.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: connection_duration_1_hour.png, db_connections.png, 
> fixed_before_and_after.jpg, monthly_db_connections.png, running_tasks.png
>
>
> *Summary*
> The "airflow run" command creates a connection to the database and leaves it 
> open (until killed by SQLAlchemy later). The number of these connections can 
> skyrocket whenever hundreds/thousands of tasks are launched simultaneously, 
> and potentially hit the database connection limit.
> The problem is that in cli.py, the run() method first calls 
> {code:java}
> settings.configure_orm(disable_connection_pool=True){code}
> correctly to use a NullPool, but then parses any custom configs and again calls
> {code:java}
> settings.configure_orm(){code}
> thereby overriding the desired behavior by instead using a QueuePool.
>  The QueuePool uses the default configs for SQL_ALCHEMY_POOL_SIZE and 
> SQL_ALCHEMY_POOL_RECYCLE. This means that while the task is running and the 
> executor is sending heartbeats, the sleeping connection is idle until it is 
> killed by SQLAlchemy.
> This fixes a bug introduced by 
> [https://github.com/apache/incubator-airflow/pull/1934] in 
> [https://github.com/apache/incubator-airflow/pull/1934/commits/b380013634b02bb4c1b9d1cc587ccd12383820b6#diff-1c2404a3a60f829127232842250ff406R344]
>   
> which is present in branches 1-8-stable, 1-9-stable, and 1-10-test
> NOTE: Will create a PR once I've done more testing since I'm on an older 
> branch. For now, attaching a patch file [^AIRFLOW-2442.patch]
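> A minimal sketch of the pooling difference, assuming SQLAlchemy (the URL and
> pool settings are illustrative):
> {code:python}
> from sqlalchemy import create_engine
> from sqlalchemy.pool import NullPool
>
> db_url = "postgresql://user:pass@host/airflow"  # hypothetical
>
> # What disable_connection_pool=True is meant to give the run command:
> # connections are closed as soon as they are returned.
> engine = create_engine(db_url, poolclass=NullPool)
>
> # What the second configure_orm() call silently restores: a QueuePool that
> # keeps idle connections open until pool_recycle expires.
> engine = create_engine(db_url, pool_size=5, pool_recycle=1800)
> {code}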



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1700) Fix airflow cli connections command

2018-11-12 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1700.

Resolution: Duplicate

> Fix airflow cli connections command 
> 
>
> Key: AIRFLOW-1700
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1700
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli
>Affects Versions: 2.0.0
>Reporter: Feng Lu
>Assignee: Feng Lu
>Priority: Major
> Fix For: 1.10.0
>
>
> When creating a new connection via airflow cli, the connection type is 
> inferred from the conn-uri argument (i.e., conn_type = url scheme). However, 
> for connection types like "hive_cli" and "google_cloud_platform", urlparse 
> (by design) was unable to get the connection type as '_' is not a valid 
> scheme character. 
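> A minimal illustration of the urlparse behaviour, assuming Python 3:
> {code:python}
> from urllib.parse import urlparse
>
> # '_' is not a valid scheme character, so the scheme comes back empty:
> print(urlparse("google_cloud_platform://host").scheme)  # ''
> print(urlparse("postgres://user@host/db").scheme)       # 'postgres'
> {code}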
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2723) Update lxml package for python 3 compatibility

2018-11-12 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2723:
---
Fix Version/s: (was: 2.0.0)
   1.10.1

> Update lxml package for python 3 compatibility
> --
>
> Key: AIRFLOW-2723
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2723
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies, docker
>Affects Versions: 1.9.0, 2.0.0
>Reporter: neil90
>Assignee: neil90
>Priority: Minor
> Fix For: 1.10.1
>
>
> Following the 
> [CONTRIBUTING.md|https://github.com/apache/incubator-airflow/blob/master/CONTRIBUTING.md#set-up-a-development-env-using-docker]
>  to set up a testing env for Python 3 in Docker, the pip install step failed 
> with a gcc error while building the lxml package. Link to error - 
> [https://pastebin.com/30AJz5tC].
>  
> I propose changing the setup.py lxml install to version 4.0.0 or greater.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3325) Fix the DAGs page - column "Recent Tasks" display issue

2018-11-12 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3325.

Resolution: Fixed

> Fix the DAGs page - column "Recent Tasks" display issue
> ---
>
> Key: AIRFLOW-3325
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3325
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.1
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Minor
> Fix For: 1.10.1
>
> Attachments: Screen Shot 2018-11-10 at 11.01.43 AM.png
>
>
> In the master branch and pre-release 1.10.1b1:
> On the main UI page, the "Recent Tasks" column is not displayed completely 
> (the last circle, "scheduled", is cut off).
> Reason: a new task state "skipped" was added but the UI was not adjusted 
> accordingly.
>  
> Please check the screenshot attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3193) Pin docker requirement version

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3193:
---
Fix Version/s: (was: 2.0.0)
   1.10.1

> Pin docker requirement version
> --
>
> Key: AIRFLOW-3193
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3193
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Guoqiang Ding
>Assignee: Guoqiang Ding
>Priority: Major
> Fix For: 1.10.1
>
>
> The "create_container" method of docker's APIClient has been incompatible 
> since version 3.0.0.
>  
> Usage in `_airflow.operators.docker_operator_` as follows.
>  
> {code:java}
> self.container = self.cli.create_container(
> command=self.get_command(),
> cpu_shares=cpu_shares,
> environment=self.environment,
> host_config=self.cli.create_host_config(
> binds=self.volumes,
> network_mode=self.network_mode,
> shm_size=self.shm_size,
> dns=self.dns,
> dns_search=self.dns_search),
> image=image,
> mem_limit=self.mem_limit,
> user=self.user,
> working_dir=self.working_dir
> )
> {code}
>  
> Arguments such as "cpu_shares" and "mem_limit" have been removed. In other 
> words, after version 3.0.0 they should be passed to the `create_host_config` 
> method.
>  
> {quote}airflow usage code link:
> https://github.com/apache/incubator-airflow/blob/cdbdcae7c0645ac2987360fced43407202716b99/airflow/operators/docker_operator.py#L207
> {quote}
>  
> {quote}version 3.0.0 code link: 
> https://github.com/docker/docker-py/blob/91bc75cc92f578ae9d659ad7e8ed11a0877b70aa/docker/api/container.py#L206
> {quote}
>  
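> A minimal sketch of the docker-py >= 3.0 call shape (all argument values are
> illustrative):
> {code:python}
> import docker
>
> cli = docker.APIClient()
> # cpu_shares and mem_limit now belong to the HostConfig:
> host_config = cli.create_host_config(
>     binds=["/host/path:/container/path"],
>     cpu_shares=1024,
>     mem_limit="512m",
> )
> container = cli.create_container(
>     image="alpine",
>     command="true",
>     host_config=host_config,
> )
> {code}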



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3271) Airflow RBAC Permissions modification via UI do not persist

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3271:
---
Affects Version/s: 1.10.0
Fix Version/s: 1.10.2

> Airflow RBAC Permissions modification via UI do not persist
> ---
>
> Key: AIRFLOW-3271
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3271
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Smitha Koduri
>Assignee: Smitha Koduri
>Priority: Major
> Fix For: 1.10.2
>
>
> After upgrading Airflow to 1.10, we noticed that when adding a new 
> permission-role mapping (via the UI), it is initially added to the DB 
> successfully, but the entry does not persist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3114) Add feature to create External BigQuery Table for querying Cloud Bigtable data

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3114:
---
Fix Version/s: (was: 1.10.1)
   (was: 2.0.0)
   1.10.2

> Add feature to create External BigQuery Table for querying Cloud Bigtable data
> --
>
> Key: AIRFLOW-3114
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3114
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, hooks
>Affects Versions: 1.10.0, 2.0.0
>Reporter: Akshar Dave
>Priority: Minor
> Fix For: 1.10.2
>
>
> Currently, using the BigQuery operator and the BQ hook, we can create an 
> external BQ table, but there is no way to create a permanent external table 
> over Cloud Bigtable data.
>  * For 
> [externalDataConfiguration.sourceUris|https://cloud.google.com/bigquery/docs/reference/v2/tables#externalDataConfiguration.sourceUris],
>  you can specify exactly one [Cloud Bigtable 
> URI|https://cloud.google.com/bigquery/external-data-bigtable#bigtable-uri], 
> and it must be a fully specified, valid HTTPS URL.
>  * Specify the data format properties by setting the 
> [externalDataConfiguration.sourceFormat|https://cloud.google.com/bigquery/docs/reference/v2/tables#externalDataConfiguration.sourceFormat]
>  property. For Google Cloud Bigtable, specify {{"BIGTABLE"}}.
> Below are the links to Doc, Python Api and the JSON Rest API that can be 
> helpful:
> [https://cloud.google.com/bigquery/external-data-bigtable]
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#externalDataConfiguration.sourceFormat]
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#externalDataConfiguration.sourceUris]
> [https://github.com/GoogleCloudPlatform/google-cloud-python/blob/38080c8920d20b01f06c1eaef79eb527525de552/bigquery/google/cloud/bigquery/external_config.py]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2421) HTTPHook and SimpleHTTPOperator do not verify certificates by default

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2421:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> HTTPHook and SimpleHTTPOperator do not verify certificates by default
> -
>
> Key: AIRFLOW-2421
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2421
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks, security
>Affects Versions: 1.8.0
>Reporter: David Adrian
>Priority: Major
> Fix For: 1.10.2
>
>
> To verify HTTPS certificates when using anything built with an HTTP hook, you 
> have to explicitly pass the undocumented {{extra_options = \{"verify": True} 
> }}. The offending line is at 
> https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py#L103.
> {code}
> response = session.send(
> 
> verify=extra_options.get("verify", False),
> 
> )
> {code}
> Not only is this the opposite of the expected default, the requirements 
> needed to verify certificates (e.g. certifi) are already installed as 
> part of Airflow. I haven't dug through all of the code yet, but I'm concerned 
> that any other connections, operators or hooks built using HTTP hook don't 
> pass this option in.
> Instead, the HTTP hook should default to {{verify=True}}
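> A minimal workaround sketch until the default changes (the connection id and
> endpoint are hypothetical):
> {code:python}
> from airflow.hooks.http_hook import HttpHook
>
> hook = HttpHook(method="GET", http_conn_id="my_https_api")
> # verification currently has to be requested explicitly:
> response = hook.run("status", extra_options={"verify": True})
> {code}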



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3309) Missing Mongo DB connection type

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3309:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> Missing Mongo DB connection type
> 
>
> Key: AIRFLOW-3309
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3309
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.0
>Reporter: John Cheng
>Assignee: John Cheng
>Priority: Minor
> Fix For: 1.10.2
>
>
> Unable to choose Mongo DB on the admin console connection page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3033) `airflow upgradedb` should create FAB user tables always.

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3033:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> `airflow upgradedb` should create FAB user tables always.
> -
>
> Key: AIRFLOW-3033
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3033
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Ash Berlin-Taylor
>Assignee: Roufique hossain
>Priority: Major
> Fix For: 1.10.2
>
>
> Right now the FAB user tables are only created on running {{airflow initdb}}, 
> and only when the rbac option is already set.
> I think we should
> 1) create the tables unconditionally, and
> 2) create the tables as part of {{upgradedb}}, not just initdb. (I don't ever 
> run initdb on my production clusters - I don't want all the example 
> connections created.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2192) Don't authenticate on Google Authentication

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2192:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> Don't authenticate on Google Authentication
> ---
>
> Key: AIRFLOW-2192
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2192
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.8.0
> Environment: OS: Amazon Linux AMI release 2017.09
> RAM: 30.5
> CPU: 4
> Amazon Instance Type: R4.xlarge
> Python: 2.7.13
>Reporter: Fernando Ike
>Assignee: holdenk
>Priority: Critical
> Fix For: 1.10.2
>
> Attachments: airflow.log
>
>
> It's weird: I tried to log in using Google Authentication and Airflow 
> returned "_UnicodeEncodeError: 'latin-1' codec can't encode character 
> u'\u200b' in position 8: ordinal not in range(256)_". 
> So, my google profile was:
> _First Name: Fernando_
> _Last Name: Ike_
> I changed my profile to just "_Ike_" in the "First Name" field and now I can 
> log in. The related log is attached:
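> A minimal reproduction of the failure, assuming the profile name contains
> U+200B (a zero-width space) after "Fernando":
> {code:python}
> # position 8 is the zero-width space:
> "Fernando\u200b".encode("latin-1")
> # UnicodeEncodeError: 'latin-1' codec can't encode character '\u200b'
> # in position 8: ordinal not in range(256)
> {code}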



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3172) AttributeError: 'DagModel' object has no attribute 'execution_date'

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3172:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> AttributeError: 'DagModel' object has no attribute 'execution_date'
> ---
>
> Key: AIRFLOW-3172
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3172
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
> Environment: Docker Environment: python:3.6-stretch
>Reporter: Vinnson Lee
>Priority: Major
> Fix For: 1.10.2
>
>
> 2018-10-09 10:13:28,430] ERROR in app: Exception on /admin/dagmodel/ [GET]
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1982, in 
> wsgi_app
>  response = self.full_dispatch_request()
>  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1614, in 
> full_dispatch_request
>  rv = self.handle_user_exception(e)
>  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1517, in 
> handle_user_exception
>  reraise(exc_type, exc_value, tb)
>  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in 
> reraise
>  raise value
>  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1612, in 
> full_dispatch_request
>  rv = self.dispatch_request()
>  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1598, in 
> dispatch_request
>  return self.view_functions[rule.endpoint](**req.view_args)
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, 
> in inner
>  return self._run_view(f, *args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, 
> in _run_view
>  return fn(self, *args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/model/base.py", 
> line 1900, in index_view
>  return_url=self._get_list_url(view_args),
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 308, 
> in render
>  return render_template(template, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/flask/templating.py", line 134, 
> in render_template
>  context, ctx.app)
>  File "/usr/local/lib/python3.6/site-packages/flask/templating.py", line 116, 
> in _render
>  rv = template.render(context)
>  File "/usr/local/lib/python3.6/site-packages/jinja2/environment.py", line 
> 989, in render
>  return self.environment.handle_exception(exc_info, True)
>  File "/usr/local/lib/python3.6/site-packages/jinja2/environment.py", line 
> 754, in handle_exception
>  reraise(exc_type, exc_value, tb)
>  File "/usr/local/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in 
> reraise
>  raise value.with_traceback(tb)
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/airflow/list_dags.html",
>  line 22, in top-level template code
>  \{% import 'admin/actions.html' as actionlib with context %}
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/admin/master.html",
>  line 18, in top-level template code
>  \{% extends 'admin/base.html' %}
>  File 
> "/usr/local/lib/python3.6/site-packages/flask_admin/templates/bootstrap3/admin/base.html",
>  line 30, in top-level template code
>  \{% block page_body %}
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/admin/master.html",
>  line 107, in block "page_body"
>  \{% block body %}
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/airflow/list_dags.html",
>  line 67, in block "body"
>  \{% block model_list_table %}
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/airflow/list_dags.html",
>  line 115, in block "model_list_table"
>  \{% block list_row scoped %}
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/templates/airflow/list_dags.html",
>  line 143, in block "list_row"
>  \{{ get_value(row, c) }}
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/model/base.py", 
> line 1742, in get_list_value
>  self.column_type_formatters,
>  File "/usr/local/lib/python3.6/site-packages/flask_admin/model/base.py", 
> line 1707, in _get_list_value
>  value = column_fmt(self, context, model, name)
>  File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 
> 124, in dag_link
>  execution_date=m.execution_date)
> AttributeError: 'DagModel' object has no attribute 'execution_date'
>  
>  
> It works fine with SQLite, but not with MySQL.
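> A hedged sketch of a defensive guard in the view helper (the function name
> follows the traceback; the body shown here is illustrative):
> {code:python}
> def dag_link(v, c, m, p):
>     # DagModel rows carry no execution_date, unlike TaskInstance rows, so
>     # look the attribute up defensively instead of assuming it exists.
>     execution_date = getattr(m, "execution_date", None)
>     return (m.dag_id, execution_date)
> {code}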



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3022) Add volume mount to KubernetesExecutorConfig

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3022:
---
Fix Version/s: (was: 1.10.1)
   1.10.2

> Add volume mount to KubernetesExecutorConfig
> 
>
> Key: AIRFLOW-3022
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3022
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor
>Affects Versions: 1.10.0
>Reporter: John Cheng
>Assignee: John Cheng
>Priority: Minor
> Fix For: 1.10.2
>
>
> Allows volume mount to the worker pod with executor_config.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3118) DAGs not successful on new installation

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3118:
---
Fix Version/s: 1.10.2

> DAGs not successful on new installation
> ---
>
> Key: AIRFLOW-3118
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3118
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: 1.10.0
> Environment: Ubuntu 18.04
> Python 3.6
>Reporter: Brylie Christopher Oxley
>Assignee: Huy Nguyen
>Priority: Blocker
> Fix For: 1.10.2
>
> Attachments: Screenshot_20180926_161837.png, 
> image-2018-09-26-12-39-03-094.png
>
>
> When trying out Airflow, on localhost, none of the DAG runs are getting to 
> the 'success' state. They are getting stuck in 'running', or I manually label 
> them as failed:
> !image-2018-09-26-12-39-03-094.png!
> h2. Steps to reproduce
>  # create new conda environment
>  ** conda create -n airflow
>  ** source activate airflow
>  # install airflow
>  ** pip install apache-airflow
>  # initialize Airflow db
>  ** airflow initdb
>  # disable default paused setting in airflow.cfg
>  ** dags_are_paused_at_creation = False
>  # run airflow webserver and airflow scheduler (in separate terminals)
>  ** airflow scheduler
>  ** airflow webserver
>  # unpause example_bash_operator
>  ** airflow unpause example_bash_operator
>  # log in to the Airflow UI
>  # turn on example_bash_operator
>  # click "Trigger DAG" in the `example_bash_operator` row
> h2. Observed result
> The `example_bash_operator` never leaves the "running" state.
> h2. Expected result
> The `example_bash_operator` would quickly enter the "success" state.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3320) Sagemaker operator never ends when having "Stopped" status

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681206#comment-16681206
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3320:


Are you running on master? If so, can you check again with the latest version? 
We've just merged a couple of big PRs that totally rework the SageMaker code, 
and hopefully this is fixed.

> Sagemaker operator never ends when having "Stopped" status
> --
>
> Key: AIRFLOW-3320
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3320
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Reporter: John Cheng
>Assignee: Yang Yu
>Priority: Minor
>
> The SageMaker operator never ends when the job has "Stopped" status.
> {code:java}
> [2018-11-08 20:15:27,864] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:27,864] {sagemaker_hook.py:129} INFO - Job still running for 2690 
> seconds... current status is InProgress
> [2018-11-08 20:15:32,917] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:32,917] {sagemaker_hook.py:129} INFO - Job still running for 2695 
> seconds... current status is Stopping
> [2018-11-08 20:15:37,963] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:37,963] {sagemaker_hook.py:129} INFO - Job still running for 2700 
> seconds... current status is Stopping
> [2018-11-08 20:15:43,012] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:43,012] {sagemaker_hook.py:129} INFO - Job still running for 2705 
> seconds... current status is Stopping
> [2018-11-08 20:15:48,060] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:48,060] {sagemaker_hook.py:129} INFO - Job still running for 2710 
> seconds... current status is Stopping
> [2018-11-08 20:15:53,108] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:53,108] {sagemaker_hook.py:129} INFO - Job still running for 2715 
> seconds... current status is Stopping
> [2018-11-08 20:15:58,153] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:15:58,153] {sagemaker_hook.py:129} INFO - Job still running for 2720 
> seconds... current status is Stopping
> [2018-11-08 20:16:03,206] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:03,206] {sagemaker_hook.py:129} INFO - Job still running for 2725 
> seconds... current status is Stopping
> [2018-11-08 20:16:08,255] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:08,255] {sagemaker_hook.py:129} INFO - Job still running for 2730 
> seconds... current status is Stopping
> [2018-11-08 20:16:13,298] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:13,298] {sagemaker_hook.py:129} INFO - Job still running for 2735 
> seconds... current status is Stopping
> [2018-11-08 20:16:18,349] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:18,349] {sagemaker_hook.py:129} INFO - Job still running for 2740 
> seconds... current status is Stopping
> [2018-11-08 20:16:23,398] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:23,398] {sagemaker_hook.py:129} INFO - Job still running for 2745 
> seconds... current status is Stopping
> [2018-11-08 20:16:28,445] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:28,445] {sagemaker_hook.py:129} INFO - Job still running for 2750 
> seconds... current status is Stopping
> [2018-11-08 20:16:33,497] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:33,497] {sagemaker_hook.py:129} INFO - Job still running for 2755 
> seconds... current status is Stopping
> [2018-11-08 20:16:38,545] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:38,545] {sagemaker_hook.py:129} INFO - Job still running for 2760 
> seconds... current status is Stopping
> [2018-11-08 20:16:43,593] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:43,593] {sagemaker_hook.py:129} INFO - Job still running for 2765 
> seconds... current status is Stopping
> [2018-11-08 20:16:48,639] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:48,639] {sagemaker_hook.py:129} INFO - Job still running for 2770 
> seconds... current status is Stopping
> [2018-11-08 20:16:53,724] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:53,724] {sagemaker_hook.py:129} INFO - Job still running for 2775 
> seconds... current status is Stopping
> [2018-11-08 20:16:58,773] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:16:58,773] {sagemaker_hook.py:129} INFO - Job still running for 2780 
> seconds... current status is Stopping
> [2018-11-08 20:17:03,835] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:17:03,835] {sagemaker_hook.py:129} INFO - Job still running for 2785 
> seconds... current status is Stopping
> [2018-11-08 20:17:08,880] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:17:08,880] {sagemaker_hook.py:129} INFO - Job still running for 2790 
> seconds... current status is Stopping
> [2018-11-08 20:17:13,931] {logging_mixin.py:95} INFO - [2018-11-08 
> 20:17:13,931] {sagemaker_hook.py:129} INFO - Job still running for 2795 
> seconds... current status 

[jira] [Closed] (AIRFLOW-3321) Ability to change schedule interval for a dag from airflow UI

2018-11-09 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-3321.
--
Resolution: Won't Fix

This is counter to one of the core principles of Airflow - that DAGs are 
defined in code - so it is not something we're going to do. Sorry.

You can create custom DAG runs (and I think we allow the time to be specified 
from a form input in Master/2.0.0) via the UI, so this might give you the 
feature you need.

> Ability to change schedule interval for a dag from airflow UI
> -
>
> Key: AIRFLOW-3321
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3321
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: webapp
>Affects Versions: 1.10.0
>Reporter: Anshu Agarwal
>Priority: Minor
>
> There are situations when we need to experiment with dag schedule interval on 
> production without code deployment. Having schedule interval edit option in 
> the airflow webserver UI can help to achieve this without any code deployment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3165) Document use of interpolation by ConfigParser

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3165:
---
Fix Version/s: (was: 2.0.0)
   1.10.1

> Document use of interpolation by ConfigParser
> -
>
> Key: AIRFLOW-3165
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3165
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Bolke de Bruin
>Priority: Major
> Fix For: 1.10.1
>
>
> The config parser interpolates '%' in values. This can lead to issues when 
> specifying passwords. As we can't disable interpolation on a per-variable 
> basis, we need to document that people should not use a % sign in their 
> passwords.
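> A minimal illustration, assuming Python 3's configparser (the connection
> string is illustrative):
> {code:python}
> from configparser import ConfigParser, InterpolationSyntaxError
>
> bad = ConfigParser()
> bad.read_string("[core]\nsql_alchemy_conn = mysql://user:pa%sword@host/db\n")
> try:
>     bad.get("core", "sql_alchemy_conn")  # interpolation happens on get()
> except InterpolationSyntaxError:
>     pass  # a bare '%' is treated as an interpolation marker
>
> # escaping the percent sign as '%%' works:
> good = ConfigParser()
> good.read_string("[core]\nsql_alchemy_conn = mysql://user:pa%%sword@host/db\n")
> assert good.get("core", "sql_alchemy_conn") == "mysql://user:pa%sword@host/db"
> {code}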



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3103) Update flask-login

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3103:
---
Fix Version/s: 1.10.1

> Update flask-login
> --
>
> Key: AIRFLOW-3103
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3103
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: authentication
>Reporter: Josh Carp
>Priority: Minor
> Fix For: 1.10.1
>
>
> Airflow uses a release of flask-login from 2014. Flask-login has fixed some 
> bugs and added some features since then, so we should upgrade. Note: 
> flask-appbuilder also pins to an old version of flask-login, so we'll have to 
> update that library as well; PR submitted at 
> https://github.com/dpgaspar/Flask-AppBuilder/pull/811.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2524.

Resolution: Fixed

> Airflow integration with AWS Sagemaker
> --
>
> Key: AIRFLOW-2524
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib
>Reporter: Rajeev Srinivasan
>Assignee: Yang Yu
>Priority: Major
>  Labels: AWS
> Fix For: 1.10.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Would it be possible to orchestrate an end-to-end AWS SageMaker job using 
> Airflow?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-1762) Use key_file in SSHHook.create_tunnel()

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-1762:
---
Summary: Use key_file in SSHHook.create_tunnel()  (was: Use key_file in 
create_tunnel())

> Use key_file in SSHHook.create_tunnel()
> ---
>
> Key: AIRFLOW-1762
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1762
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.8.0, 1.9.0
>Reporter: Nathan McIntyre
>Assignee: Nathan McIntyre
>Priority: Major
>  Labels: patch
> Fix For: 2.0.0, 1.10.1
>
>
> In contrib/hooks/ssh_hook.py, the ssh command created by the create_tunnel() 
> method does not use the key_file attribute. This prevents the creation of 
> tunnels where a key file is required. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3265) Add support for "unix_socket" in connection extra for Mysql Hook

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3265:
---
Summary: Add support for "unix_socket" in connection extra for Mysql Hook  
(was: Mysql Hook does not support "unix_socket" extra)

> Add support for "unix_socket" in connection extra for Mysql Hook
> 
>
> Key: AIRFLOW-3265
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3265
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: database
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Minor
> Fix For: 1.10.1
>
>
> The MySQL hook does not support the "unix_socket" extra, which would allow 
> specifying a different location for the UNIX socket than the default one. 
> This is a blocker for tools like cloud-sql-proxy that create sockets in an 
> arbitrary place.
> I will provide a fix shortly.
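> A minimal sketch of the intended usage once supported (the socket path is
> illustrative):
> {code:python}
> import json
>
> # stored in the MySQL connection's Extra field, so the hook can hand the
> # socket path through to the client instead of using the default location
> extra = json.dumps({"unix_socket": "/var/run/cloudsql/project:region:instance"})
> {code}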



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2476) Update tabulate dependency version to 0.8.2

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2476:
---
Summary: Update tabulate dependency version to 0.8.2  (was: tabulate 
update: 0.8.2 is tested)

> Update tabulate dependency version to 0.8.2
> ---
>
> Key: AIRFLOW-2476
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2476
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.8.0, 1.9.0, 1.10.0, 2.0.0
>Reporter: Ruslan Dautkhanov
>Assignee: Kaxil Naik
>Priority: Major
> Fix For: 1.10.1
>
>
> As discussed on the dev list, tabulate==0.8.2 is good to go with Airflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-843) Exceptions now available in context during on_failure_callback

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-843:
--
Summary: Exceptions now available in context during on_failure_callback  
(was: Store task exceptions in context)

> Exceptions now available in context during on_failure_callback
> --
>
> Key: AIRFLOW-843
> URL: https://issues.apache.org/jira/browse/AIRFLOW-843
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Scott Kruger
>Priority: Minor
> Fix For: 1.10.1
>
>
> If a task encounters an exception during execution, it should store the 
> exception on the execution context so that other methods (namely 
> `on_failure_callback`) can access it.  This would help with custom error 
> integrations, e.g. Sentry.
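A usage sketch, assuming the exception is exposed under the "exception" key of the callback context as the new summary suggests:

{code:python}
def notify_on_failure(context):
    exc = context.get("exception")  # the error raised by the failed task
    ti = context["task_instance"]
    # a custom integration (e.g. Sentry) could capture exc here
    print("Task %s.%s failed: %r" % (ti.dag_id, ti.task_id, exc))

# e.g. BashOperator(..., on_failure_callback=notify_on_failure)
{code}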



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3312) No log output from BashOperator under test

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3312.

Resolution: Duplicate

> No log output from BashOperator under test
> --
>
> Key: AIRFLOW-3312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3312
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging, operators
>Affects Versions: 1.10.0
>Reporter: Chris Bandy
>Priority: Major
>
> The BashOperator logs some messages as well as the stdout of its command at 
> the info level, but none of these appear when running {{airflow test}} with 
> the default configuration.
> For example, this DAG emits the following in Airflow 1.10.0:
> {code:python}
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime
> dag = DAG('please', start_date=datetime(year=2018, month=11, day=1))
> BashOperator(dag=dag, task_id='mine', bash_command='echo thank you')
> {code}
> {noformat}
> $ airflow test please mine '2018-11-01'
> [2018-11-08 00:06:54,098] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:06:54,246] {models.py:258} INFO - Filling up the DagBag from 
> /usr/local/airflow/dags
> {noformat}
> When executed by the scheduler, logs go to a file:
> {noformat}
> $ airflow scheduler -n 1
> ...
> [2018-11-08 00:41:02,674] {dag_processing.py:582} INFO - Started a process 
> (PID: 9) to generate tasks for /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:03,185] {dag_processing.py:495} INFO - Processor for 
> /usr/local/airflow/dags/please.py finished
> [2018-11-08 00:41:03,525] {jobs.py:1114} INFO - Tasks up for execution:
>   
> [2018-11-08 00:41:03,536] {jobs.py:1147} INFO - Figuring out tasks to run in 
> Pool(name=None) with 128 open slots and 1 task instances in queue
> [2018-11-08 00:41:03,539] {jobs.py:1184} INFO - DAG please has 0/16 running 
> and queued tasks
> [2018-11-08 00:41:03,540] {jobs.py:1216} INFO - Setting the follow tasks to 
> queued state:
>   
> [2018-11-08 00:41:03,573] {jobs.py:1297} INFO - Setting the follow tasks to 
> queued state:
>   
> [2018-11-08 00:41:03,576] {jobs.py:1339} INFO - Sending ('please', 'mine', 
> datetime.datetime(2018, 11, 1, 0, 0, tzinfo=<Timezone [UTC]>)) to executor 
> with priority 1 and queue default
> [2018-11-08 00:41:03,578] {base_executor.py:56} INFO - Adding to queue: 
> airflow run please mine 2018-11-01T00:00:00+00:00 --local -sd 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:03,593] {sequential_executor.py:45} INFO - Executing 
> command: airflow run please mine 2018-11-01T00:00:00+00:00 --local -sd 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:04,262] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:41:04,406] {models.py:258} INFO - Filling up the DagBag from 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:04,458] {cli.py:492} INFO - Running <TaskInstance: please.mine 2018-11-01T00:00:00+00:00 [queued]> on host e2e08cf4dfaa
> [2018-11-08 00:41:09,684] {jobs.py:1443} INFO - Executor reports please.mine 
> execution_date=2018-11-01 00:00:00+00:00 as success
> $ cat logs/please/mine/2018-11-01T00\:00\:00+00\:00/1.log
> [2018-11-08 00:41:04,554] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-11-08 00:41:04,564] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-11-08 00:41:04,565] {models.py:1547} INFO -
> 
> Starting attempt 1 of 1
> 
> [2018-11-08 00:41:04,605] {models.py:1569} INFO - Executing 
> <Task(BashOperator): mine> on 2018-11-01T00:00:00+00:00
> [2018-11-08 00:41:04,605] {base_task_runner.py:124} INFO - Running: ['bash', 
> '-c', 'airflow run please mine 2018-11-01T00:00:00+00:00 --job_id 142 --raw 
> -sd DAGS_FOLDER/please.py --cfg_path /tmp/tmp9prq7knr']
> [2018-11-08 00:41:05,214] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,213] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:41:05,334] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,333] {models.py:258} INFO - Filling up the DagBag 
> from /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:05,368] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,367] {cli.py:492} INFO - Running <TaskInstance: please.mine 2018-11-01T00:00:00+00:00 [running]> on host e2e08cf4dfaa
> [2018-11-08 00:41:05,398] {bash_operator.py:74} INFO - Tmp dir root location:
>  /tmp
> [2018-11-08 00:41:05,398] {bash_operator.py:87} INFO - Temporary script 
> location: /tmp/airflowtmp0is6wwxi/mine8tmew5y4
> [2018-11-08 00:41:05,399] {bash_operator.py:97} INFO - Running command: echo 
> thank you

[jira] [Commented] (AIRFLOW-3312) No log output from BashOperator under test

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680395#comment-16680395
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3312:


This'll be fixed in 1.10.1

> No log output from BashOperator under test
> --
>
> Key: AIRFLOW-3312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3312
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging, operators
>Affects Versions: 1.10.0
>Reporter: Chris Bandy
>Priority: Major
>
> The BashOperator logs some messages as well as the stdout of its command at 
> the info level, but none of these appear when running {{airflow test}} with 
> the default configuration.
> For example, this DAG emits the following in Airflow 1.10.0:
> {code:python}
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime
> dag = DAG('please', start_date=datetime(year=2018, month=11, day=1))
> BashOperator(dag=dag, task_id='mine', bash_command='echo thank you')
> {code}
> {noformat}
> $ airflow test please mine '2018-11-01'
> [2018-11-08 00:06:54,098] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:06:54,246] {models.py:258} INFO - Filling up the DagBag from 
> /usr/local/airflow/dags
> {noformat}
> When executed by the scheduler, logs go to a file:
> {noformat}
> $ airflow scheduler -n 1
> ...
> [2018-11-08 00:41:02,674] {dag_processing.py:582} INFO - Started a process 
> (PID: 9) to generate tasks for /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:03,185] {dag_processing.py:495} INFO - Processor for 
> /usr/local/airflow/dags/please.py finished
> [2018-11-08 00:41:03,525] {jobs.py:1114} INFO - Tasks up for execution:
>   
> [2018-11-08 00:41:03,536] {jobs.py:1147} INFO - Figuring out tasks to run in 
> Pool(name=None) with 128 open slots and 1 task instances in queue
> [2018-11-08 00:41:03,539] {jobs.py:1184} INFO - DAG please has 0/16 running 
> and queued tasks
> [2018-11-08 00:41:03,540] {jobs.py:1216} INFO - Setting the follow tasks to 
> queued state:
>   
> [2018-11-08 00:41:03,573] {jobs.py:1297} INFO - Setting the follow tasks to 
> queued state:
>   
> [2018-11-08 00:41:03,576] {jobs.py:1339} INFO - Sending ('please', 'mine', 
> datetime.datetime(2018, 11, 1, 0, 0, tzinfo=<Timezone [UTC]>)) to executor 
> with priority 1 and queue default
> [2018-11-08 00:41:03,578] {base_executor.py:56} INFO - Adding to queue: 
> airflow run please mine 2018-11-01T00:00:00+00:00 --local -sd 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:03,593] {sequential_executor.py:45} INFO - Executing 
> command: airflow run please mine 2018-11-01T00:00:00+00:00 --local -sd 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:04,262] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:41:04,406] {models.py:258} INFO - Filling up the DagBag from 
> /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:04,458] {cli.py:492} INFO - Running <TaskInstance: please.mine 2018-11-01T00:00:00+00:00 [queued]> on host e2e08cf4dfaa
> [2018-11-08 00:41:09,684] {jobs.py:1443} INFO - Executor reports please.mine 
> execution_date=2018-11-01 00:00:00+00:00 as success
> $ cat logs/please/mine/2018-11-01T00\:00\:00+00\:00/1.log
> [2018-11-08 00:41:04,554] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-11-08 00:41:04,564] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-11-08 00:41:04,565] {models.py:1547} INFO -
> 
> Starting attempt 1 of 1
> 
> [2018-11-08 00:41:04,605] {models.py:1569} INFO - Executing 
> <Task(BashOperator): mine> on 2018-11-01T00:00:00+00:00
> [2018-11-08 00:41:04,605] {base_task_runner.py:124} INFO - Running: ['bash', 
> '-c', 'airflow run please mine 2018-11-01T00:00:00+00:00 --job_id 142 --raw 
> -sd DAGS_FOLDER/please.py --cfg_path /tmp/tmp9prq7knr']
> [2018-11-08 00:41:05,214] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,213] {__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [2018-11-08 00:41:05,334] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,333] {models.py:258} INFO - Filling up the DagBag 
> from /usr/local/airflow/dags/please.py
> [2018-11-08 00:41:05,368] {base_task_runner.py:107} INFO - Job 142: Subtask 
> mine [2018-11-08 00:41:05,367] {cli.py:492} INFO - Running <TaskInstance: please.mine 2018-11-01T00:00:00+00:00 [running]> on host e2e08cf4dfaa
> [2018-11-08 00:41:05,398] {bash_operator.py:74} INFO - Tmp dir root location:
>  /tmp
> [2018-11-08 00:41:05,398] {bash_operator.py:87} INFO - Temporary script 
> location: /tmp/airflowtmp0is6wwxi/mine8tmew5y4
> [2018-11-08 00:41:05,399] 

[jira] [Comment Edited] (AIRFLOW-2780) Adds IMAP Hook to interact with a mail server

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680390#comment-16680390
 ] 

Ash Berlin-Taylor edited comment on AIRFLOW-2780 at 11/8/18 9:16 PM:
-

You can pick what you like, and I (since I'm doing the release of 1.10.1) treat 
it as an indication that people want it in the next release; if it's a small 
change that's easy to fix, there's a reasonable chance of it being pulled into 
the release branch.

Generally the committer who merges (or someone who comes along after) sets the 
fix version to the next release - 2.0.0 at the moment.


was (Author: ashb):
You can pick what you like, and I (since I'm doing the release of 1.10.1) treat 
it as an indication that people want it in the next release; if it's a small 
change that's easy to fix, there's a reasonable chance of it being pulled into 
the release branch.

> Adds IMAP Hook to interact with a mail server
> -
>
> Key: AIRFLOW-2780
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2780
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Major
> Fix For: 1.10.1
>
>
> This Hook connects to a mail server via IMAP to be able to retrieve email 
> attachments by using [Python's IMAP 
> Library.|https://docs.python.org/3.6/library/imaplib.html]
> Features:
> - `has_mail_attachment`: Can be used in a `Sensor` to check if there is an 
> attachment on the mail server with the given name.
> - `retrieve_mail_attachments`: Can be used in an `Operator` to do something with 
> the attachments, returned as a list of tuples.
> - `download_mail_attachments`: Can be used in an `Operator` to download the 
> attachment.
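A usage sketch based on the feature list above; the exact signatures in the merged hook may differ:

{code:python}
from airflow.contrib.hooks.imap_hook import ImapHook

hook = ImapHook(imap_conn_id="imap_default")
if hook.has_mail_attachment("report.csv"):
    # per the description, attachments come back as a list of tuples
    for name, payload in hook.retrieve_mail_attachments("report.csv"):
        print(name, len(payload))
{code}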



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2780) Adds IMAP Hook to interact with a mail server

2018-11-08 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680390#comment-16680390
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2780:


You can pick what you like, and I (since I'm doing the release of 1.10.1) treat 
it as an indication that people want it in the next release; if it's a small 
change that's easy to fix, there's a reasonable chance of it being pulled into 
the release branch.

> Adds IMAP Hook to interact with a mail server
> -
>
> Key: AIRFLOW-2780
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2780
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Major
> Fix For: 1.10.1
>
>
> This Hook connects to a mail server via IMAP to be able to retrieve email 
> attachments by using [Python's IMAP 
> Library.|https://docs.python.org/3.6/library/imaplib.html]
> Features:
> - `has_mail_attachment`: Can be used in a `Sensor` to check if there is an 
> attachment on the mail server with the given name.
> - `retrieve_mail_attachments`: Can be used in an `Operator` to do something with 
> the attachments, returned as a list of tuples.
> - `download_mail_attachments`: Can be used in an `Operator` to download the 
> attachment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2022) Web GUI paged entries do not show Last Run and DAG Runs

2018-11-07 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2022.

Resolution: Fixed

We changed how this column was populated in 1.9.0, so it was likely fixed then.

> Web GUI paged entries do not show Last Run and DAG Runs
> ---
>
> Key: AIRFLOW-2022
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2022
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.8.2
>Reporter: Yee Ting Li
>Priority: Minor
>
> I've gone past the 25 DAGs that fit onto one single page in the web frontend 
> view.
> However, when I page to page 2, my DAGs do not show the Last Run nor the 
> DAG Runs column data.
> If I switch to 'Show 50 entries', I still do not see these columns on the 
> items 26 and above.
> The Recent Tasks column appears to work fine, however.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3297) EmrStepSensor marks cancelled step as successful

2018-11-07 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3297.

   Resolution: Fixed
Fix Version/s: 1.10.1
   2.0.0

> EmrStepSensor marks cancelled step as successful 
> -
>
> Key: AIRFLOW-3297
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3297
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Dana Assa
>Assignee: Dana Assa
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> When a step is manually cancelled or when the cluster is terminated 
> unexpectedly, an EMR step can reach 'CANCEL_PENDING' state or 'INTERRUPTED' 
> state.
> These states are not mentioned as FAILED_STATES nor as NON_TERMINAL_STATES in 
> the "EmrStepSensor" class, thus the sensor returns True and the Airflow stage 
> is marked as success instead of failed.
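A sketch of the state handling the fix implies; the exact class attribute names and state lists should be checked against the merged change:

{code:python}
NON_TERMINAL_STATES = ['PENDING', 'RUNNING', 'CONTINUE', 'CANCEL_PENDING']
FAILED_STATES = ['FAILED', 'CANCELLED', 'INTERRUPTED']

def step_ok(state):
    """Only a terminal, non-failed EMR step state counts as success."""
    if state in NON_TERMINAL_STATES:
        return False  # keep poking
    if state in FAILED_STATES:
        raise RuntimeError('EMR step reached failed state: %s' % state)
    return True  # e.g. COMPLETED
{code}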



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1902) FTPSHook is mapped incorrectly in init file

2018-11-07 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1902.

Resolution: Fixed

> FTPSHook is mapped incorrectly in init file
> ---
>
> Key: AIRFLOW-1902
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1902
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.8.0
>Reporter: Harshvardhan Chauhan
>Assignee: Harshvardhan Chauhan
>Priority: Minor
>
> 'ftps_hook': ['FTPSHook'] points to a module that does not exist: FTPSHook 
> actually lives inside contrib/hooks/ftp_hook.py; there is no ftps_hook.py.
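A paraphrased sketch of the mapping fix in airflow/contrib/hooks/__init__.py (both classes live in ftp_hook.py):

{code:python}
_hooks = {
    'ftp_hook': ['FTPHook', 'FTPSHook'],  # was: 'ftps_hook': ['FTPSHook']
}
{code}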



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2866) Missing CSRF Token Error on Web RBAC UI Create/Update Operations

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2866:
---
Affects Version/s: 2.0.0
Fix Version/s: (was: 1.10.1)
   2.0.0

> Missing CSRF Token Error on Web RBAC UI Create/Update Operations
> 
>
> Key: AIRFLOW-2866
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2866
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.0.0
>Reporter: Jasper Kahn
>Priority: Major
> Fix For: 2.0.0
>
>
> Attempting to modify or delete many resources (such as Connections or Users) 
> results in a 400 from the webserver:
> {quote}{{Bad Request}}
> {{The CSRF session token is missing.}}{quote}
> Logs report:
> {quote}{{[2018-08-07 18:45:15,771] \{csrf.py:251} INFO - The CSRF session 
> token is missing.}}
> {{192.168.9.1 - - [07/Aug/2018:18:45:15 +] "POST 
> /admin/connection/delete/ HTTP/1.1" 400 150 
> "http://localhost:8081/admin/connection/; "Mozilla/5.0 (X11; Linux x86_64) 
> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 
> Safari/537.36"}}{quote}
> Chrome dev tools show the CSRF token is present in the request payload.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2216) Cannot specify a profile for AWS Hook to load with s3 config file

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2216.

Resolution: Fixed

> Cannot specify a profile for AWS Hook to load with s3 config file
> -
>
> Key: AIRFLOW-2216
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2216
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.9.0
> Environment: IDE: PyCharm
> Airflow 1.9
> Python 3.4.3
>Reporter: Lorena Mesa
>Assignee: Lorena Mesa
>Priority: Minor
> Fix For: 1.10.1
>
>
> Currently the source code for AWS Hook doesn't let the user provide a 
> profile when their AWS connection object specifies an s3_config_file in its 
> extra parameters:
> {code:python}
> def _get_credentials(self, region_name):
>     aws_access_key_id = None
>     aws_secret_access_key = None
>     aws_session_token = None
>     endpoint_url = None
>     if self.aws_conn_id:
>         try:
>             # ... cut for brevity ...
>             elif 's3_config_file' in connection_object.extra_dejson:
>                 aws_access_key_id, aws_secret_access_key = \
>                     _parse_s3_config(
>                         connection_object.extra_dejson['s3_config_file'],
>                         connection_object.extra_dejson.get('s3_config_format'),
>                         connection_object.extra_dejson.get('profile')){code}
> The _parse_s3_config method has a profile parameter that defaults to None, so 
> unless it is passed through here you cannot specify which profile's 
> credentials should be loaded.
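For illustration, a connection Extra that exercises this code path once the profile is threaded through (all values here are placeholders):

{code:python}
extra_dejson = {
    "s3_config_file": "/home/airflow/.aws/credentials",
    "s3_config_format": "aws",
    "profile": "analytics",  # previously dropped, now forwarded to _parse_s3_config
}
{code}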



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3307) Update insecure node dependencies

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677279#comment-16677279
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3307:


Sure, we should update them, but the security aspect doesn't concern us: they 
are dev-time-only dependencies, so they don't affect our users.

> Update insecure node dependencies
> -
>
> Key: AIRFLOW-3307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
>
> `npm audit` shows some node dependencies that are out of date and potentially 
> insecure. We should update them with `npm audit fix`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3161) Log Url link does not link to task instance logs in RBAC UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3161.

   Resolution: Fixed
Fix Version/s: 1.10.1
   2.0.0

> Log Url link does not link to task instance logs in RBAC UI
> ---
>
> Key: AIRFLOW-3161
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3161
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Eric Chang
>Assignee: Eric Chang
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
> Attachments: image-2018-10-04-17-33-33-616.png, 
> image-2018-10-04-17-34-12-135.png, image-2018-10-04-17-35-14-224.png
>
>
> In the new RBAC UI, the "Log Url" link (0) for task instances doesn't link to 
> the log for the task instance (1). Instead, it links to the DAG log list 
> (2).
> (0)
> !image-2018-10-04-17-35-14-224.png|width=172,height=172!
> (1)
> !image-2018-10-04-17-34-12-135.png|width=660,height=376!
> (2)
> !image-2018-10-04-17-33-33-616.png|width=478,height=238!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (AIRFLOW-3161) Log Url link does not link to task instance logs in RBAC UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor reopened AIRFLOW-3161:


Reopening to change fix versions

> Log Url link does not link to task instance logs in RBAC UI
> ---
>
> Key: AIRFLOW-3161
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3161
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Eric Chang
>Assignee: Eric Chang
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
> Attachments: image-2018-10-04-17-33-33-616.png, 
> image-2018-10-04-17-34-12-135.png, image-2018-10-04-17-35-14-224.png
>
>
> In the new RBAC UI, the "Log Url" link (0) for task instances doesn't link to 
> the log for the task instance (1). Instead, it links to the DAG log list 
> (2).
> (0)
> !image-2018-10-04-17-35-14-224.png|width=172,height=172!
> (1)
> !image-2018-10-04-17-34-12-135.png|width=660,height=376!
> (2)
> !image-2018-10-04-17-33-33-616.png|width=478,height=238!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3285) lazy marking of upstream_failed task state

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677006#comment-16677006
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3285:


The lazy feature as you have described it isn't something we'd accept, as it's 
quite a behaviour change and a little bit of a workaround. But a combo 
trigger rule, so we could say {{trigger_rule=\{'all_done','one_failed',\}}} 
to mean "trigger on any of these conditions", would be acceptable.

> lazy marking of upstream_failed task state
> --
>
> Key: AIRFLOW-3285
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3285
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kevin McHale
>Priority: Minor
>
> Airflow aggressively applies the {{upstream_failed}} task state: as soon as a 
> task fails, all of its downstream dependencies get marked.  This sometimes 
> creates problems for us at Etsy.
> In particular, we use a pattern for our hadoop Airflow DAGs along these lines:
>  # the DAG creates a hadoop cluster in GCP/Dataproc
>  # the DAG executes its tasks on the cluster
>  # the DAG deletes the cluster once all tasks are done
> There are some cases in which the tasks immediately upstream of the 
> cluster-delete step get marked as {{upstream_failed}}, triggering the 
> cluster-delete step, even while other tasks continue to execute without 
> problems on the cluster.  The cluster-delete step of course kills all of the 
> running tasks, requiring all of them to be re-run once the problem with the 
> failed task is mitigated.
> As an example, a DAG that looks like this can exhibit the problem:
> {code:java}
> Cluster = ClusterCreateOperator(...)
> A = Job1Operator(...)
> Cluster >> A
> B = Job2Operator(...)
> Cluster >> B
> C = Job3Operator(...)
> A >> C
> B >> C
> ClusterDelete = DeleteClusterOperator(trigger_rule="all_done", ...)
> C >> ClusterDelete{code}
> In a DAG like this, suppose task A fails while task B is running.  Task C 
> will immediately be marked as {{upstream_failed}}, which will cause 
> ClusterDelete to run while task B is still running, which will cause task B 
> to also fail.
> Our solution to this problem has been to implement something like [this 
> diff|https://github.com/mchalek/incubator-airflow/commit/585349018656cd9b2e3e3e113db6412345485dde],
>  which lazily applies the {{upstream_failed}} state only to tasks for which 
> all upstream tasks have already completed.
> The consequence in terms of the example above is that task C will not be 
> marked {{upstream_failed}} in response to task A failing until task B 
> completes, ensuring that the cluster is not deleted while any upstream tasks 
> are running.
> We find this not to have any adverse behavior on our airflow instances, so we 
> run all of them with this lazy-marking feature enabled.  However, we 
> recognize that a change in behavior like this may be something that existing 
> users will want to opt-in for, so we included a config flag in the diff that 
> defaults to the original behavior.
> We would appreciate your consideration of incorporating this diff, or 
> something like it, to allow us to configure this behavior in unmodified, 
> upstream airflow.
> Thanks!
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3300) Frequent crash of scheduler while interacting with Airflow Metadata (Mysql)

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676990#comment-16676990
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3300:


Might be fixed by AIRFLOW-2703, but a deadlock is possibly a sign of a bigger 
issue.

> Frequent crash of scheduler while interacting with Airflow Metadata (Mysql)
> ---
>
> Key: AIRFLOW-3300
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3300
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Tanuj Gupta
>Priority: Major
>
> The scheduler very frequently crashes with a deadlock while trying to update 
> the task_instance table. The stack trace is as follows:
>  
> {noformat}
> sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 
> 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 
> u'UPDATE task_instance, dag_run SET task_instance.state=%s WHERE 
> task_instance.dag_id IN (%s, %s, %s, %s, %s) AND task_instance.state IN (%s, 
> %s) AND dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = 
> task_instance.execution_date AND dag_run.state != %s'] [parameters: (None, 
> 'org_test0802_h9zrva', 
> '27bd514b5ab9854b0a494110_45aa7868_1799_4046_ad20_f35e3de1a4ec_p8bkfg', 
> 'org_e2e_trainman1528457521430', 'org_blockerretryissue_6kiafp', 
> 'org_e2e_compute_v21540610294106_svnfmr', u'queued', u'scheduled', 
> u'running')] (Background on this error at: 
> http://sqlalche.me/e/e3q8){noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1632) MySQL to GCS fails for date/datetime before ~1850

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1632.

Resolution: Duplicate

> MySQL to GCS fails for date/datetime before ~1850
> -
>
> Key: AIRFLOW-1632
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1632
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
> Environment: Google Cloud Platform
>Reporter: Michael Ghen
>Assignee: Michael Ghen
>Priority: Minor
>
> For tables in MySQL that use a "date" or "datetime" type, a dag that exports 
> from MySQL to Google Cloud Storage and then loads from GCS to BigQuery will 
> fail when the dates are before 1970.
> When the table is exported as JSON to a GCS bucket, dates and datetimes are 
> converted to timestamps using:
> {code}
> time.mktime(value.timetuple())
> {code} 
> This creates a problem when you try to parse a date that can't be converted to a 
> UNIX timestamp. For example:
> {code}
> >>> value = datetime.date(1850,1,1)
> >>> time.mktime(value.timetuple())
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: year out of range
> {code}
> *Steps to reproduce*
> 0. Set up a MySQL connection and GCP connection in Airflow.
> 1. Create a MySQL table with a "date" field and put some data into the table. 
> {code}
> CREATE TABLE table_with_date (
> date_field date,
> datetime_field datetime
> );
> INSERT INTO table_with_date (date_field, datetime_field) VALUES 
> ('1850-01-01',NOW());
> {code}
> 2. Create a DAG that will export the data from the MySQL to GCS and then load 
> from GCS to BigQuery (use the schema file). For example:
> {code}
> extract = MySqlToGoogleCloudStorageOperator(
>     task_id="extract_table",
>     mysql_conn_id='mysql_connection',
>     google_cloud_storage_conn_id='gcp_connection',
>     sql="SELECT * FROM table_with_date",
>     bucket='gcs-bucket',
>     filename='table_with_date.json',
>     schema_filename='schemas/table_with_date.json',
>     dag=dag)
> load = GoogleCloudStorageToBigQueryOperator(
>     task_id="load_table",
>     bigquery_conn_id='gcp_connection',
>     google_cloud_storage_conn_id='gcp_connection',
>     bucket='gcs-bucket',
>     destination_project_dataset_table="dataset.table_with_date",
>     source_objects=['table_with_date.json'],
>     schema_object='schemas/table_with_date.json',
>     source_format='NEWLINE_DELIMITED_JSON',
>     create_disposition='CREATE_IF_NEEDED',
>     write_disposition='WRITE_TRUNCATE',
>     dag=dag)
> load.set_upstream(extract)
> 3. Run the DAG 
> Expected: The DAG runs successfully.
> Actual: The `extract_table` task fails with error:
> {code}
> ...
>  ERROR - year out of range
>  Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
>  result = task_copy.execute(context=context)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 91, in execute
> files_to_upload = self._write_local_data_files(cursor)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 132, in _write_local_data_files
> row = map(self.convert_types, row)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 196, in convert_types
> return time.mktime(value.timetuple())
> ValueError: year out of range
> ...
> {code}
> *Comments:*
> This is really a problem with Python's time.mktime not being able to handle 
> years before roughly 1850. BigQuery's timestamp seems to be able to take years 
> all the way back to year 0001. From 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp-type,
>  Timestamp range is:
> {quote}
> 0001-01-01 00:00:00 to 9999-12-31 23:59:59.999999 UTC.
> {quote}
> I think the fix is probably to keep date/datetime converting to timestamp but 
> use `calendar.timegm`
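The difference is easy to check, assuming convert_types() keeps emitting epoch seconds (timegm works on any proleptic Gregorian date, mktime does not):

{code:python}
import calendar
import datetime

value = datetime.date(1850, 1, 1)
print(calendar.timegm(value.timetuple()))  # -3786825600, no error
# time.mktime(value.timetuple())           # ValueError: year out of range
{code}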



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2865) Race condition between on_success_callback and LocalTaskJob's cleanup

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2865:
---
Fix Version/s: 1.10.1

> Race condition between on_success_callback and LocalTaskJob's cleanup
> -
>
> Key: AIRFLOW-2865
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2865
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Marcin Mejran
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> The TaskInstance's run_raw_task method first records SUCCESS for the task 
> instance and then runs the on_success_callback function.
> The LocalTaskJob's heartbeat_callback checks for any TI's with a SUCCESS 
> state and terminates their processes.
> As such it's possible for the TI process to be terminated before the 
> on_success_callback function finishes running.
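A minimal sketch of the reordering that closes the window, assuming a TaskInstance-like object; this paraphrases the idea, not the actual Airflow patch:

{code:python}
def finish_task(ti, session):
    if ti.task.on_success_callback:
        # run the callback *before* SUCCESS becomes visible to
        # LocalTaskJob.heartbeat_callback, which kills SUCCESS processes
        ti.task.on_success_callback(ti.get_template_context())
    ti.state = 'success'
    session.merge(ti)
    session.commit()
{code}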



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3299) Logs for currently running sensors not visible in the UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3299:
---
Summary: Logs for currently running sensors not visible in the UI  (was: 
Logs for currently running tasks fail to load)

> Logs for currently running sensors not visible in the UI
> 
>
> Key: AIRFLOW-3299
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3299
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Reporter: Brad Holmes
>Priority: Major
>
> When a task is actively running, the logs are not appearing.  I have tracked 
> this down to the {{next_try_number}} logic of task-instances.
> In [the view at line 
> 836|https://github.com/apache/incubator-airflow/blame/master/airflow/www/views.py#L836],
>  we have
> {code:python}
> logs = [''] * (ti.next_try_number - 1 if ti is not None else 0)
> {code}
> The length of the {{logs}} array informs the frontend on the number of 
> {{attempts}} that exist, and thus how many AJAX calls to make to load the 
> logs.
> Here is the current logic I have observed
> ||Task State||Current length of 'logs'||Needed length of 'logs'||
> |Successfully completed in 1 attempt|1|1|
> |Successfully completed in 2 attempt|2|2|
> |Not yet attempted|0|0|
> |Actively running task, first time|0|1|
> That last case is the bug.  Perhaps task-instance needs a method like 
> {{most_recent_try_number}} ?  I don't see how to make use of {{try_number()}} 
> or {{next_try_number()}} to meet the need here.
> ||Task State||try_number()||next_try_number()||Number of Attempts _Should_ 
> Display||
> |Successfully completed in 1 attempt|2|2|1|
> |Successfully completed in 2 attempt|3|3|2|
> |Not yet attempted|1|1|0|
> |Actively running task, first time|0|1|1|
> [~ashb] : You implemented this portion of task-instance 11 months ago.  Any 
> suggestions?  Or perhaps the problem is elsewhere?
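One possible shape for the missing helper, derived purely from the table above (1.10-era TaskInstance semantics assumed, so treat it as a sketch rather than a fix):

{code:python}
from airflow.utils.state import State

def attempts_to_display(ti):
    if ti is None:
        return 0
    # a running attempt already has a log file, so count it too
    offset = 0 if ti.state == State.RUNNING else 1
    return ti.next_try_number - offset
{code}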



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3279) Documentation for Google Logging unclear

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3279.

Resolution: Information Provided

> Documentation for Google Logging unclear
> 
>
> Key: AIRFLOW-3279
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3279
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, Documentation, gcp, logging
>Reporter: Paul Velthuis
>Priority: Blocker
>
> The documentation on how to set up logging to a Google Cloud bucket is 
> unclear.
> I am now following the tutorial on the airflow page:
> [https://airflow.apache.org/howto/write-logs.html]
> Here I find it unclear what part of the 'logger' I have to adjust in the 
> `{{airflow/config_templates/airflow_local_settings.py}}`.
>  
> The adjustment states:
>  
>  # Update the airflow.task and airflow.task_runner blocks to be 'gcs.task' 
> instead of 'file.task'.
> 'loggers': { 'airflow.task': ... }
>  
> However what I find in the template is:
> 'loggers': {
>     'airflow.processor': {
>         'handlers': ['processor'],
>         'level': LOG_LEVEL,
>         'propagate': False,
>     },
>     'airflow.task': {
>         'handlers': ['task'],
>         'level': LOG_LEVEL,
>         'propagate': False,
>     },
>     'flask_appbuilder': {
>         'handler': ['console'],
>         'level': FAB_LOG_LEVEL,
>         'propagate': True,
>     },
> },
>  
> Since it is very important for me to do this right the first time, I hope 
> some clarity can be provided on what has to be adjusted in the logger. Is it 
> only the 'airflow.task' block, or more?
> Furthermore, at step 6 it is a little unclear what remote_log_conn_id means. 
> I would propose to add a little more information to make this more clear.
>  
> The current error I am facing is:
> Traceback (most recent call last):
>  File "/usr/local/bin/airflow", line 16, in 
>  from airflow import configuration
>  File "/usr/local/lib/python2.7/site-packages/airflow/__init__.py", line 31, 
> in 
>  from airflow import settings
>  File "/usr/local/lib/python2.7/site-packages/airflow/settings.py", line 198, 
> in 
>  configure_logging()
>  File "/usr/local/lib/python2.7/site-packages/airflow/logging_config.py", 
> line 71, in configure_logging
>  dictConfig(logging_config)
>  File "/usr/local/lib/python2.7/logging/config.py", line 794, in dictConfig
>  dictConfigClass(config).configure()
>  File "/usr/local/lib/python2.7/logging/config.py", line 568, in configure
>  handler = self.configure_handler(handlers[name])
>  File "/usr/local/lib/python2.7/logging/config.py", line 733, in 
> configure_handler
>  result = factory(**kwargs)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 30, in __init__
>  super(GCSTaskHandler, self).__init__(base_log_folder, filename_template)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/file_task_handler.py",
>  line 46, in __init__
>  self.filename_jinja_template = Template(self.filename_template)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 926, in __new__
>  return env.from_string(source, template_class=cls)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 862, in from_string
>  return cls.from_code(self, self.compile(source), globals, None)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 565, in compile
>  self.handle_exception(exc_info, source_hint=source_hint)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 754, in handle_exception
>  reraise(exc_type, exc_value, tb)
>  File "", line 1, in template
> jinja2.exceptions.TemplateSyntaxError: expected token ':', got '}'
> Error in atexit._run_exitfuncs:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 73, in close
>  if self.closed:
> AttributeError: 'GCSTaskHandler' object has no attribute 'closed'
> Error in sys.exitfunc:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 73, in close
>  if self.closed:
> AttributeError: 'GCSTaskHandler' object has no attribute 'closed'
>  If I look at the Airflow code I see the following code for the 
> 

[jira] [Commented] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676783#comment-16676783
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3293:


AIRFLOW-2747 and AIRFLOW-850 would help with your second point. And as of the 
current release, even if the sensor behaved how you wanted, it would still take 
up an executor slot, as that is how sensors work.

I am uncertain whether this is a common enough use case to support directly, 
given the two tickets mentioned above.

> Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
> -
>
> Key: AIRFLOW-3293
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Darren Weber
>Priority: Major
>
> The TimeDeltaSensor has baked-in lookups for the schedule and 
> schedule_interval lurking in the class init; it's not a pure time delta.  It 
> would be ideal to have a TimeDelta that is purely relative to the time that 
> an upstream task triggers it.  If there is a way to do this, please note it 
> here or suggest some implementation alternative that could achieve this 
> easily.
> The implementation below using a PythonOperator works, but it consumes a 
> worker for 5min needlessly.  It would be much better to have a TimeDelta that 
> accepts the time when an upstream sensor triggers it and then waits for a 
> timedelta, with options from the base sensor for poke interval (and timeout). 
>  This could be used without consuming a worker as much with the reschedule 
> option.  Something like this can help with adding jitter to downstream tasks 
> that could otherwise hit an HTTP endpoint too hard all at once.
> {code:python}
> def wait5(*args, **kwargs):
> import random
> import time as t
> minutes = random.randint(3,6)
> t.sleep(minutes * 60)
> return True
> wait5_task = PythonOperator(
> task_id="python_op_wait_5min",
> python_callable=wait5,
> dag=a_dag)
> upstream_http_sensor >> wait5_task
> {code}
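For comparison, a sketch of a sensor that measures its delta from its own first poke rather than from the schedule (1.10-era import paths assumed; it keeps the deadline in memory, so it suits poke mode, not mode='reschedule'):

{code:python}
from airflow.sensors.base_sensor_operator import BaseSensorOperator
from airflow.utils import timezone
from airflow.utils.decorators import apply_defaults

class RelativeTimeDeltaSensor(BaseSensorOperator):
    """Illustrative only: succeeds `delta` after its first poke."""

    @apply_defaults
    def __init__(self, delta, *args, **kwargs):
        super(RelativeTimeDeltaSensor, self).__init__(*args, **kwargs)
        self.delta = delta
        self._deadline = None

    def poke(self, context):
        if self._deadline is None:
            self._deadline = timezone.utcnow() + self.delta
        return timezone.utcnow() >= self._deadline
{code}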



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3184) AwsHook with a conn_id that doesn't exist doesn't cause an error

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3184:
---
Labels: easy-fix  (was: )

Looking at the code the fix is probably in _get_credentials inside aws_hook - 
the try block should only re-raise the error if {{self.aws_conn_id != 
'aws_default'}}
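A sketch of that re-raise logic as a standalone function (paraphrased, not the hook's actual code):

{code:python}
from airflow.exceptions import AirflowException

def resolve_connection(hook):
    try:
        return hook.get_connection(hook.aws_conn_id)
    except AirflowException:
        if hook.aws_conn_id != 'aws_default':
            raise  # an explicit but missing conn_id should be a hard error
        # only the implicit default may fall back to boto3's credential chain
        return None
{code}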

> AwsHook with a conn_id that doesn't exist doesn't cause an error
> 
>
> Key: AIRFLOW-3184
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3184
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.9.0
>Reporter: Ash Berlin-Taylor
>Priority: Minor
>  Labels: easy-fix
>
> It is possible to create an S3Hook (which is a subclass of the AwsHook) with 
> an invalid connection ID, and rather than it causing an error of "connection 
> not found" or similar, it falls back to something, and continues 
> execution anyway.
> Simple repro code:
> {code}
> h = S3Hook('i-dontexist')
> h.list_keys(bucket_name="bucket", prefix="folder/")
> {code}
> Ideally the first line here should throw an exception of some form or other 
> (possibly _except_ in the case where the {{conn_id}} is the default value of 
> "aws_default") rather than it's current behaviour, as this made it more 
> difficult to track down the source of our problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2865) Race condition between on_success_callback and LocalTaskJob's cleanup

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2865.

   Resolution: Fixed
Fix Version/s: 2.0.0

> Race condition between on_success_callback and LocalTaskJob's cleanup
> -
>
> Key: AIRFLOW-2865
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2865
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Marcin Mejran
>Priority: Minor
> Fix For: 2.0.0
>
>
> The TaskInstance's run_raw_task method first records SUCCESS for the task 
> instance and then runs the on_success_callback function.
> The LocalTaskJob's heartbeat_callback checks for any TI's with a SUCCESS 
> state and terminates their processes.
> As such it's possible for the TI process to be terminated before the 
> on_success_callback function finishes running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3299) Logs for currently running tasks fail to load

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675446#comment-16675446
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3299:


Hacks upon hacks :(

Not sure the fix.

If we don't have that line there then if the sensor is set with max_tries=1 it 
would never get rescheduled (I think)

> Logs for currently running tasks fail to load
> -
>
> Key: AIRFLOW-3299
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3299
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Reporter: Brad Holmes
>Priority: Major
>
> When a task is actively running, the logs are not appearing.  I have tracked 
> this down to the {{next_try_number}} logic of task-instances.
> In [the view at line 
> 836|https://github.com/apache/incubator-airflow/blame/master/airflow/www/views.py#L836],
>  we have
> {code:python}
> logs = [''] * (ti.next_try_number - 1 if ti is not None else 0)
> {code}
> The length of the {{logs}} array informs the frontend on the number of 
> {{attempts}} that exist, and thus how many AJAX calls to make to load the 
> logs.
> Here is the current logic I have observed
> ||Task State||Current length of 'logs'||Needed length of 'logs'||
> |Successfully completed in 1 attempt|1|1|
> |Successfully completed in 2 attempt|2|2|
> |Not yet attempted|0|0|
> |Actively running task, first time|0|1|
> That last case is the bug.  Perhaps task-instance needs a method like 
> {{most_recent_try_number}} ?  I don't see how to make use of {{try_number()}} 
> or {{next_try_number()}} to meet the need here.
> ||Task State||try_number()||next_try_number()||Number of Attempts _Should_ 
> Display||
> |Successfully completed in 1 attempt|2|2|1|
> |Successfully completed in 2 attempt|3|3|2|
> |Not yet attempted|1|1|0|
> |Actively running task, first time|0|1|1|
> [~ashb] : You implemented this portion of task-instance 11 months ago.  Any 
> suggestions?  Or perhaps the problem is elsewhere?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3299) Logs for currently running tasks fail to load

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675432#comment-16675432
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3299:


My memory is a bit hazy now, but as soon as the task gets picked up by a 
worker (or maybe by the scheduler when it is enqueued?) the try_number should 
be incremented automatically.

(The whole try_number is a bit of a hack, I would rather we created a new 
task_instance row for each attempt, but that is a much bigger refactor)

> Logs for currently running tasks fail to load
> -
>
> Key: AIRFLOW-3299
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3299
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Reporter: Brad Holmes
>Priority: Major
>
> When a task is actively running, the logs are not appearing.  I have tracked 
> this down to the {{next_try_number}} logic of task-instances.
> In [the view at line 
> 836|https://github.com/apache/incubator-airflow/blame/master/airflow/www/views.py#L836],
>  we have
> {code:python}
> logs = [''] * (ti.next_try_number - 1 if ti is not None else 0)
> {code}
> The length of the {{logs}} array informs the frontend on the number of 
> {{attempts}} that exist, and thus how many AJAX calls to make to load the 
> logs.
> Here is the current logic I have observed
> ||Task State||Current length of 'logs'||Needed length of 'logs'||
> |Successfully completed in 1 attempt|1|1|
> |Successfully completed in 2 attempt|2|2|
> |Not yet attempted|0|0|
> |Actively running task, first time|0|1|
> That last case is the bug.  Perhaps task-instance needs a method like 
> {{most_recent_try_number}} ?  I don't see how to make use of {{try_number()}} 
> or {{next_try_number()}} to meet the need here.
> ||Task State||try_number()||next_try_number()||Number of Attempts _Should_ 
> Display||
> |Successfully completed in 1 attempt|2|2|1|
> |Successfully completed in 2 attempt|3|3|2|
> |Not yet attempted|1|1|0|
> |Actively running task, first time|0|1|1|
> [~ashb] : You implemented this portion of task-instance 11 months ago.  Any 
> suggestions?  Or perhaps the problem is elsewhere?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3299) Logs for currently running tasks fail to load

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675430#comment-16675430
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3299:


When you say "not appearing" what do you mean? Could you provide a screen shot 
too?

> Logs for currently running tasks fail to load
> -
>
> Key: AIRFLOW-3299
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3299
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Reporter: Brad Holmes
>Priority: Major
>
> When a task is actively running, the logs are not appearing.  I have tracked 
> this down to the {{next_try_number}} logic of task-instances.
> In [the view at line 
> 836|https://github.com/apache/incubator-airflow/blame/master/airflow/www/views.py#L836],
>  we have
> {code:python}
> logs = [''] * (ti.next_try_number - 1 if ti is not None else 0)
> {code}
> The length of the {{logs}} array informs the frontend on the number of 
> {{attempts}} that exist, and thus how many AJAX calls to make to load the 
> logs.
> Here is the current logic I have observed
> ||Task State||Current length of 'logs'||Needed length of 'logs'||
> |Successfully completed in 1 attempt|1|1|
> |Successfully completed in 2 attempt|2|2|
> |Not yet attempted|0|0|
> |Actively running task, first time|0|1|
> That last case is the bug.  Perhaps task-instance needs a method like 
> {{most_recent_try_number}} ?  I don't see how to make use of {{try_number()}} 
> or {{next_try_number()}} to meet the need here.
> ||Task State||try_number()||next_try_number()||Number of Attempts _Should_ 
> Display||
> |Successfully completed in 1 attempt|2|2|1|
> |Successfully completed in 2 attempt|3|3|2|
> |Not yet attempted|1|1|0|
> |Actively running task, first time|0|1|1|
> [~ashb] : You implemented this portion of task-instance 11 months ago.  Any 
> suggestions?  Or perhaps the problem is elsewhere?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3295) Require encryption in DaskExecutor when certificates are configured.

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3295:
---
Summary: Require encryption in DaskExecutor when certificates are 
configured.  (was: Potential Security Issue in DaskExecutor)

> Require encryption in DaskExecutor when certificates are configured.
> -
>
> Key: AIRFLOW-3295
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3295
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0, 1.10.1
>
>
> When a user decides to use SSL encryption for DaskExecutor communications, a 
> `distributed.Security` object is created.
> However, one argument (`require_encryption`) is never set to `True` (its 
> default value is `False`), which can leave the SSL encryption setup ineffective.
>  
> Current implementation: 
> [https://github.com/apache/incubator-airflow/blob/120f4856cdea5134971c4c4a239ddbfdc80db77e/airflow/executors/dask_executor.py#L48]
> Related Documentation: 
> http://distributed.dask.org/en/latest/tls.html?highlight=require_encryption#parameters
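Sketch of the corrected construction (paths are placeholders; cf. the dask_executor.py line linked above):

{code:python}
from distributed.security import Security

security = Security(
    tls_ca_file='/path/to/ca.pem',
    tls_client_cert='/path/to/client-cert.pem',
    tls_client_key='/path/to/client-key.pem',
    require_encryption=True,  # the argument the current code leaves at False
)
{code}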



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3295) Potential Security Issue in DaskExecutor

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3295.

   Resolution: Fixed
Fix Version/s: 1.10.1
   2.0.0

> Potential Security Issue in DaskExecutor
> 
>
> Key: AIRFLOW-3295
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3295
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0, 1.10.1
>
>
> When a user decides to use SSL encryption for DaskExecutor communications, a 
> `distributed.Security` object is created.
> However, one argument (`require_encryption`) is never set to `True` (its 
> default value is `False`), which can leave the SSL encryption setup ineffective.
>  
> Current implementation: 
> [https://github.com/apache/incubator-airflow/blob/120f4856cdea5134971c4c4a239ddbfdc80db77e/airflow/executors/dask_executor.py#L48]
> Related Documentation: 
> http://distributed.dask.org/en/latest/tls.html?highlight=require_encryption#parameters



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2190) base_url with a subpath generates TypeError

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2190:
---
Affects Version/s: 1.10.1

> base_url with a subpath generates TypeError
> ---
>
> Key: AIRFLOW-2190
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2190
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.9.0, 1.10.1
>Reporter: John Arnold
>Priority: Major
>
> I'm running into what looks like a bug in airflow webserver. Running against 
> master:
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: 
> [2018-03-07 18:20:13 +] [102] [ERROR] Error handling request /
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: Traceback 
> (most recent call last):
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 135, 
> in handle
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: 
> self.handle_request(listener, req, client, addr)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 176, 
> in handle_request
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: respiter = 
> self.wsgi(environ, resp.start_response)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/werkzeug/wsgi.py", line 826, in call
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: return 
> app(environ, start_response)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/airflow/www/app.py", line 166, in 
> root_app
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: resp(b'404 
> Not Found', [(b'Content-Type', b'text/plain')])
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 261, in 
> start_response
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: 
> self.process_headers(headers)
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File 
> "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 268, in 
> process_headers
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: raise 
> TypeError('%r is not a string' % name)
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: TypeError: 
> b'Content-Type' is not a string
>  
> I just started using the base_url to put the webserver behind an nginx proxy 
> under a sub-path, e.g. [http://domain.com/airflow]
> I've tried following the docs for the nginx proxy, i.e.
> [webserver]
> base_url = [http://localhost/airflow|http://airflow-web/airflow]
>  
> I've also tried setting the base_url to the fully-qualified endpoint:
> base_url = [https://example.com/airflow|https://domain.com/airflow]
>  
> Neither works; both give the TypeError exception.
>  
> If I remove the sub-path:
> base_url = [https://example.com|https://domain.com/]
> then the app starts and runs OK and I can access it on the host, but not 
> through the proxy.
>  
>  
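> A note on the mechanics (an illustrative sketch, not necessarily the actual 
> fix): PEP 3333 requires the status string and the header names/values passed 
> to start_response to be native str on Python 3, with only the response body 
> as bytes, so a fallback 404 app built on bytes literals trips exactly this 
> gunicorn check. A compliant version of such an app would look like:
> {code}
> def root_app(env, resp):
>     # Status and headers must be native strings under PEP 3333.
>     resp('404 Not Found', [('Content-Type', 'text/plain')])
>     # Only the body is bytes; the message here is a placeholder.
>     return [b'not found']
> {code}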



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3159) Update Airflow documentation on GCP Logging

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3159:
---
Fix Version/s: 1.10.1

> Update Airflow documentation on GCP Logging
> ---
>
> Key: AIRFLOW-3159
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3159
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, logging
>Reporter: Akshay Iyengar
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> The current documentation on how to get remote logging in GCP to work is 
> thoroughly out of date, not to mention needlessly complicated. I have a PR 
> open with updated documentation, which should come as a relief to anyone who 
> wants to use remote logging but would otherwise be put off by how "tough" it 
> is to set up.
>  
> Link to PR: https://github.com/apache/incubator-airflow/pull/3952



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3279) Documentation for Google Logging unclear

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675427#comment-16675427
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3279:


Those docs are a bit out of date - we've fixed them for what will become the 
next release - you can check them out here: 
https://github.com/apache/incubator-airflow/blob/master/docs/howto/write-logs.rst#writing-logs-to-google-cloud-storage
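In the meantime, the gist of the old instruction (a rough sketch, not the exact 
template) is that only the task loggers - 'airflow.task', plus 
'airflow.task_runner' where the template has one - switch their handlers from 
'task' to 'gcs.task'; 'airflow.processor' and 'flask_appbuilder' stay as they 
are:

{code}
'loggers': {
    # ...other loggers unchanged...
    'airflow.task': {
        'handlers': ['gcs.task'],   # was ['task']
        'level': LOG_LEVEL,
        'propagate': False,
    },
},
{code}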

> Documentation for Google Logging unclear
> 
>
> Key: AIRFLOW-3279
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3279
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, Documentation, gcp, logging
>Reporter: Paul Velthuis
>Priority: Blocker
>
> The documentation of how to install logging to a Google Cloud bucket is 
> unclear.
> I am now following the tutorial on the airflow page:
> [https://airflow.apache.org/howto/write-logs.html]
> Here I find it unclear what part of the 'logger' I have to adjust in 
> {{airflow/config_templates/airflow_local_settings.py}}.
>  
> The adjustment states:
>  
> # Update the airflow.task and airflow.task_runner blocks to be 'gcs.task' 
> instead of 'file.task'.
>  
> However, what I find in the template is:
> {code}
> 'loggers': {
>     'airflow.processor': {
>         'handlers': ['processor'],
>         'level': LOG_LEVEL,
>         'propagate': False,
>     },
>     'airflow.task': {
>         'handlers': ['task'],
>         'level': LOG_LEVEL,
>         'propagate': False,
>     },
>     'flask_appbuilder': {
>         'handler': ['console'],
>         'level': FAB_LOG_LEVEL,
>         'propagate': True,
>     }
> },
> {code}
>  
> Since it is very important for me to get this right the first time, I hope 
> some clarity can be provided on what has to be adjusted in the logger. Is it 
> only 'airflow.task', or more?
> Furthermore, at step 6 it is a little unclear what remote_log_conn_id means. 
> I would propose adding a little more information to make this clearer.
>  
> The current error I am facing is:
> Traceback (most recent call last):
>  File "/usr/local/bin/airflow", line 16, in <module>
>  from airflow import configuration
>  File "/usr/local/lib/python2.7/site-packages/airflow/__init__.py", line 31, 
> in <module>
>  from airflow import settings
>  File "/usr/local/lib/python2.7/site-packages/airflow/settings.py", line 198, 
> in <module>
>  configure_logging()
>  File "/usr/local/lib/python2.7/site-packages/airflow/logging_config.py", 
> line 71, in configure_logging
>  dictConfig(logging_config)
>  File "/usr/local/lib/python2.7/logging/config.py", line 794, in dictConfig
>  dictConfigClass(config).configure()
>  File "/usr/local/lib/python2.7/logging/config.py", line 568, in configure
>  handler = self.configure_handler(handlers[name])
>  File "/usr/local/lib/python2.7/logging/config.py", line 733, in 
> configure_handler
>  result = factory(**kwargs)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 30, in __init__
>  super(GCSTaskHandler, self).__init__(base_log_folder, filename_template)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/file_task_handler.py",
>  line 46, in __init__
>  self.filename_jinja_template = Template(self.filename_template)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 926, in __new__
>  return env.from_string(source, template_class=cls)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 862, in from_string
>  return cls.from_code(self, self.compile(source), globals, None)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 565, in compile
>  self.handle_exception(exc_info, source_hint=source_hint)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 754, in handle_exception
>  reraise(exc_type, exc_value, tb)
>  File "", line 1, in template
> jinja2.exceptions.TemplateSyntaxError: expected token ':', got '}'
> Error in atexit._run_exitfuncs:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 73, in close
>  if self.closed:
> AttributeError: 'GCSTaskHandler' object has no attribute 'closed'
> Error in sys.exitfunc:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> 

[jira] [Reopened] (AIRFLOW-3239) Test discovery partial fails due to incorrect name of the test files

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor reopened AIRFLOW-3239:


[~Fokko] I don't think this ticket is done until tests/core.py and tests/jobs.py 
are updated too :)

> Test discovery partial fails due to incorrect name of the test files
> 
>
> Key: AIRFLOW-3239
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3239
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Major
> Fix For: 2.0.0
>
>
> In PR [https://github.com/apache/incubator-airflow/pull/4049], I have fixed 
> the incorrect names of some test files (which were causing partial failures 
> in test discovery).
> There are other test scripts with the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3277) Invalid timezone transition handling for cron schedules

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3277.

Resolution: Fixed

> Invalid timezone transition handling for cron schedules
> ---
>
> Key: AIRFLOW-3277
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3277
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Bolke de Bruin
>Priority: Blocker
> Fix For: 1.10.1
>
>
> `following_schedule` converts to naive time using the local time zone. Across 
> a DST transition, say 3AM -> 2AM ("summer time to winter time"), we generate 
> datetimes that overlap with earlier schedules. As a result, a DAG that should 
> run every 5 minutes will not do so, because it has already seen those 
> (repeated) schedule times.
> We should not convert to naive local time; we should keep working in UTC.
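> To illustrate (a standalone sketch using croniter and pytz, not the 
> scheduler's actual code):
> {code}
> from datetime import datetime
> import pytz
> from croniter import croniter
> 
> utc = pytz.utc
> local = pytz.timezone("Europe/Amsterdam")
> 
> # 2018-10-28 00:55 UTC is 02:55 CEST; five minutes later local clocks fall
> # back and the 02:00-03:00 wall-clock hour repeats.
> start = utc.localize(datetime(2018, 10, 28, 0, 55))
> 
> # Iterating the cron expression in UTC is monotonic: one tick per real
> # 5 minutes, straight through the transition.
> next_utc = croniter("*/5 * * * *", start).get_next(datetime)
> 
> # Iterating over *naive local* time reproduces wall-clock values the
> # scheduler has already seen, so those runs get skipped.
> naive_local = start.astimezone(local).replace(tzinfo=None)
> next_local = croniter("*/5 * * * *", naive_local).get_next(datetime)
> {code}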



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-05 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674948#comment-16674948
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Asking "please expedite" doesn't help btw - this is a volunteer run project ;)

It is now connecting to the LDAP so that's progress. Did the error get cut off 
at the end, or is there nothing after "invalid attribute type" in the message?

It looks like the problem would be the {{group_member_attr}} config option. You 
have set that to an empty string which I think is the problem. The default 
value is {{memberOf}} but setting it to empty string doesn't disable the group 
checks. Is that what you were trying to do?
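Until that's confirmed, an untested first step (just a guess on my part, not 
something I've verified against DS389) would be to put the option back to its 
default instead of leaving it blank:

{noformat}
[ldap]
group_member_attr = memberOf
{noformat}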

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673370#comment-16673370
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Yup, add those lines just before {{tls_configuration = None}}

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673351#comment-16673351
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


I think the same issue would still apply on 1.10.0 so it doesn't matter much, 
but the line numbers in the stack trace definitely don't match 1.10.0. For 
instance

{code}
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
{code}

https://github.com/apache/incubator-airflow/blob/1.10.0/airflow/contrib/auth/backends/ldap_auth.py#L268
 -- that line is not in a login function.

So you might want to check whether you have two versions installed (just so 
that you edit the right one).

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673311#comment-16673311
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Well, your stack trace lines are for 1.8, so this might not be true any more, 
but could you try making this change in 
airflow/contrib/auth/backends/ldap_auth.py, in the {{get_ldap_connection}} 
function:

{code}
diff --git a/airflow/contrib/auth/backends/ldap_auth.py 
b/airflow/contrib/auth/backends/ldap_auth.py
index 13b49f90..42ad7026 100644
--- a/airflow/contrib/auth/backends/ldap_auth.py
+++ b/airflow/contrib/auth/backends/ldap_auth.py
@@ -51,6 +51,13 @@ class LdapException(Exception):
 
 
 def get_ldap_connection(dn=None, password=None):
+    # When coming from config we can't pass None; the best we can do is an
+    # empty string, so treat "" as None here.
+    if dn == "":
+        dn = None
+    if password == "":
+        password = None
+
     tls_configuration = None
     use_ssl = False
     try:
{code}

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673287#comment-16673287
 ] 

Ash Berlin-Taylor edited comment on AIRFLOW-3270 at 11/2/18 3:44 PM:
-

[~ashb] : the '\{{ = }}' at the end of the line is a copy/paste issue on this 
JIRA. Below is the correct config without that formatting, followed by the full 
exception stack.

 
{code}
[ldap]

uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389

user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im

user_name_attr = uid

group_member_attr =

superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im

data_profiler_filter =

bind_user =

bind_password =

basedn = ou=people,dc=odc,dc=im

cacert = /opt/orchestration/airflow/ldap_ca.crt

search_scope = LEVEL
{code}


{code}
[2018-10-30 04:01:04,520] ERROR in app: Exception on /admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
[2018-10-30 04:01:04,520] [72] \{app.py:1587} ERROR - Exception on 
/admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
215, in parse_filter
raise LDAPInvalidFilterError('malformed filter')
ldap3.core.exceptions.LDAPInvalidFilterError: 

[jira] [Commented] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673289#comment-16673289
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3292:


{{.}} in a dag_id is reserved for subdags, so deleting {{schema}} also deletes 
its subdags. If {{schema.table1}} is _NOT_ a sub-dag then we should probably add 
better validation of the dag id, to only allow dots in dag ids when they are 
used for subdags.

Does that help?
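For illustration, a sketch (assumed helper names, not the actual {{airflow.api}} 
code) of exact-plus-subdag matching, as opposed to a bare prefix match:

{code}
from sqlalchemy import or_

def delete_dag_rows(session, model, dag_id):
    # Delete the DAG itself plus anything the dot convention marks as its
    # subdag ("parent.child"). "schema_replace" never matches; "schema.table1"
    # matches only because the dot presumes it is a subdag of "schema".
    return (
        session.query(model)
        .filter(or_(
            model.dag_id == dag_id,            # exact match
            model.dag_id.like(dag_id + ".%"),  # subdag prefix
        ))
        .delete(synchronize_session=False)
    )
{code}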

> `delete_dag` endpoint and cli commands don't delete on exact dag_id matching
> 
>
> Key: AIRFLOW-3292
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3292
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api, cli
>Affects Versions: 1.10.0
>Reporter: Teresa Martyny
>Priority: Major
>
> If you have the following dag ids: `schema`, `schema.table1`, 
> `schema.table2`, `schema_replace`
> When you hit the delete_dag endpoint with the dag id: `schema`, it will 
> delete `schema`, `schema.table1`, and `schema.table2`, not just `schema`. 
> Underscores are fine so it doesn't delete `schema_replace`, but periods are 
> not.
> If this is expected behavior, clarifying that functionality in the docs would 
> be great, and then I can submit a feature request for the ability to use 
> regex for exact matching with this command and endpoint.
> Thanks!! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

