[jira] [Updated] (AIRFLOW-1565) Incompatibility with Flask-Login-0.4.0

2017-09-05 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-1565:
-
Description: 
Hello

I've noticed that Airflow v1.8.2 isn't playing well with flask_login v0.4.0:
{code}
$ pip install --user flask_login==0.4.0
Collecting flask_login==0.4.0
  Using cached Flask_Login-0.4.0-py2.py3-none-any.whl
Requirement already satisfied: Flask in ./.local/lib/python2.7/site-packages 
(from flask_login==0.4.0)
Requirement already satisfied: itsdangerous>=0.21 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Jinja2>=2.4 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Werkzeug>=0.7 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: click>=2.0 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: MarkupSafe>=0.23 in 
./.local/lib/python2.7/site-packages (from 
Jinja2>=2.4->Flask->flask_login==0.4.0)
Installing collected packages: flask-login
  Found existing installation: Flask-Login 0.3.2
Uninstalling Flask-Login-0.3.2:
  Successfully uninstalled Flask-Login-0.3.2
Successfully installed flask-login-0.4.0

$ airflow webserver
[2017-09-05 14:30:50,130] {dag_processing.py:627} INFO - Started a process 
(PID: 22751) to generate tasks for 
/home/airflow/dags/metadsp/APB_process_table/airflow.py - logging into 
/var/log/airflow/scheduler/2017-09-05
[2017-09-05 14:30:50,133] {dag_processing.py:627} INFO - Started a process 
(PID: 22752) to generate tasks for 
/home/airflow/dags/metadsp/APN_segment_manager/segment_manager_factory.py - 
logging into /var/log/airflow/sc
[2017-09-05 14:30:50,134] {jobs.py:1443} INFO - Heartbeating the executor
[2017-09-05 14:30:50,192] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 14:30:50,252] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 14:30:50,273] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: 
FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to 
"CSRFProtect" and will be removed in 1.0.
csrf = CsrfProtect()
/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: 
ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache 
instead.
from flask.ext.cache import make_template_fragment_key
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  __/_  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
Traceback (most recent call last):
File "/home/airflow/bin/airflow", line 28, in 
args.func(args)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/bin/cli.py", 
line 734, in webserver
app = cached_app(conf)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", 
line 161, in cached_app
app = create_app(config)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", 
line 60, in create_app
from airflow.www import views
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/views.py", 
line 41, in <module>
from flask_login import flash
ImportError: cannot import name flash


$ python -c "from flask_login import flash"
Traceback (most recent call last):
  File "", line 1, in 
ImportError: cannot import name flash
{code}

-Downgrading to flask_login==0.3.2 fixes the issue.-

EDIT: Downgrading to flask_login==0.2.11 fixes the issue.

This is due to:
{quote}
Version 0.3.0
-

Released on September 10th, 2015
[...]
- BREAKING: The `is_authenticated`, `is_active`, and `is_anonymous` members of 
the user class are now properties, not methods. Applications should update 
their user classes accordingly.
{quote}

Airflow isn't working with flask_login >= v0.3.0 either.
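
A quick way to see where the import breaks (a minimal sketch; the plain-Flask fallback is a suggestion, not what Airflow 1.8.2 actually does):
{code}
# Flask-Login 0.2.x happened to re-export Flask helpers such as `flash`,
# and airflow/www/views.py (1.8.2) relies on that; 0.3.0+ cleaned up its
# namespace, so this import only works on flask_login < 0.3.
try:
    from flask_login import flash
except ImportError:
    from flask import flash  # where `flash` actually lives
{code}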

dud

[jira] [Created] (AIRFLOW-1566) 'airflow.www.gunicorn_config' doesn't exist

2017-09-05 Thread dud (JIRA)
dud created AIRFLOW-1566:


 Summary: 'airflow.www.gunicorn_config' doesn't exist
 Key: AIRFLOW-1566
 URL: https://issues.apache.org/jira/browse/AIRFLOW-1566
 Project: Apache Airflow
  Issue Type: Bug
Reporter: dud
 Fix For: 1.8.2


Hello

Starting the Airflow v1.8.2 webserver yields this error:
{code}
$ airflow webserver --pid /run/airflow/webserver.pid -hn localhost
/home/airflow/.local/lib/python2.7/site-packages/airflow/configuration.py:540: 
DeprecationWarning: This method will be removed in future versions.  Use 
'parser.read_file()' instead.
  self.readfp(StringIO.StringIO(string))
[2017-09-05 15:19:05,012] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 15:19:05,063] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 15:19:05,083] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: 
FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to 
"CSRFProtect" and will be removed in 1.0.
  csrf = CsrfProtect()
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  __/_  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: 
ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache 
instead.
  from flask.ext.cache import make_template_fragment_key
[2017-09-05 15:19:05,387] [26152] {models.py:168} INFO - Filling up the DagBag 
from /home/airflow/dags
Running the Gunicorn Server with:
Workers: 4 sync
Host: localhost:8080
Timeout: 120
Logfiles: - -
=

Error: 'airflow.www.gunicorn_config' doesn't exist
{code}


The use of this file was introduced by this
[commit|https://github.com/apache/incubator-airflow/commit/9d254a317dd54f555270ca568aff1cd0500e1e53]
authored by [~xuanji].
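
In the meantime the missing module can be recreated by hand. A sketch of what the referenced commit adds (the exact contents are an assumption, not the released file):
{code}
# airflow/www/gunicorn_config.py -- gunicorn loads this via
# `-c python:airflow.www.gunicorn_config`; it renames each worker process
# once it is ready so the master can tell worker states apart.
import setproctitle

from airflow import settings


def post_worker_init(dummy_worker):
    setproctitle.setproctitle(
        settings.GUNICORN_WORKER_READY_PREFIX + setproctitle.getproctitle())
{code}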

dud





[jira] [Created] (AIRFLOW-1565) Incompatibility with Flask-Login-0.4.0

2017-09-05 Thread dud (JIRA)
dud created AIRFLOW-1565:


 Summary: Incompatibility with Flask-Login-0.4.0
 Key: AIRFLOW-1565
 URL: https://issues.apache.org/jira/browse/AIRFLOW-1565
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: 1.8.2
Reporter: dud
Priority: Minor


Hello

I've noticed that Airflow v1.8.2 isn't playing well with flask_login v0.4.0:
{code}
$ pip install --user flask_login==0.4.0
Collecting flask_login==0.4.0
  Using cached Flask_Login-0.4.0-py2.py3-none-any.whl
Requirement already satisfied: Flask in ./.local/lib/python2.7/site-packages 
(from flask_login==0.4.0)
Requirement already satisfied: itsdangerous>=0.21 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Jinja2>=2.4 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Werkzeug>=0.7 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: click>=2.0 in 
./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: MarkupSafe>=0.23 in 
./.local/lib/python2.7/site-packages (from 
Jinja2>=2.4->Flask->flask_login==0.4.0)
Installing collected packages: flask-login
  Found existing installation: Flask-Login 0.3.2
Uninstalling Flask-Login-0.3.2:
  Successfully uninstalled Flask-Login-0.3.2
Successfully installed flask-login-0.4.0

$ airflow webserver
[2017-09-05 14:30:50,130] {dag_processing.py:627} INFO - Started a process 
(PID: 22751) to generate tasks for 
/home/airflow/dags/metadsp/APB_process_table/airflow.py - logging into 
/var/log/airflow/scheduler/2017-09-05
[2017-09-05 14:30:50,133] {dag_processing.py:627} INFO - Started a process 
(PID: 22752) to generate tasks for 
/home/airflow/dags/metadsp/APN_segment_manager/segment_manager_factory.py - 
logging into /var/log/airflow/sc
[2017-09-05 14:30:50,134] {jobs.py:1443} INFO - Heartbeating the executor
[2017-09-05 14:30:50,192] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 14:30:50,252] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 14:30:50,273] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: 
FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to 
"CSRFProtect" and will be removed in 1.0.
csrf = CsrfProtect()
/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: 
ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache 
instead.
from flask.ext.cache import make_template_fragment_key
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  __/_  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
Traceback (most recent call last):
File "/home/airflow/bin/airflow", line 28, in 
args.func(args)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/bin/cli.py", 
line 734, in webserver
app = cached_app(conf)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", 
line 161, in cached_app
app = create_app(config)
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", 
line 60, in create_app
from airflow.www import views
File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/views.py", 
line 41, in <module>
from flask_login import flash
ImportError: cannot import name flash


$ python -c "from flask_login import flash"
Traceback (most recent call last):
  File "", line 1, in 
ImportError: cannot import name flash
{code}

Downgrading to flask_login==0.3.2 fixes the issue.

dud





[jira] [Commented] (AIRFLOW-121) Documenting dag doc_md feature

2016-11-14 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663602#comment-15663602
 ] 

dud commented on AIRFLOW-121:
-

Hello.
I think you should file a new issue with an example DAG that triggers this
bug. Please also specify which version you are running.

> Documenting dag doc_md feature
> --
>
> Key: AIRFLOW-121
> URL: https://issues.apache.org/jira/browse/AIRFLOW-121
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs
>Reporter: dud
>Assignee: dud
>Priority: Trivial
>
> Dear Airflow Maintainers,
> I added a note about DAG documentation.
> I'd be glad if my PR would be merged : 
> https://github.com/apache/incubator-airflow/pull/1493
> Regards
> dud





[jira] [Updated] (AIRFLOW-469) Add MFG Labs as Airflow user

2016-08-30 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-469:

Description: 
Hello.

We'd like to be added as official Airflow users.

Regards

GH PR: https://github.com/apache/incubator-airflow/pull/1768

  was:
Hello.

We'd like to be added as official Airflow users.

Regards


> Add MFG Labs as Airflow user
> 
>
> Key: AIRFLOW-469
> URL: https://issues.apache.org/jira/browse/AIRFLOW-469
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: dud
>Priority: Trivial
>
> Hello.
> We'd like to be added as official Airflow users.
> Regards
> GH PR  : https://github.com/apache/incubator-airflow/pull/1768





[jira] [Commented] (AIRFLOW-312) Airflow is not able to load hooks

2016-07-06 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364495#comment-15364495
 ] 

dud commented on AIRFLOW-312:
-

Right now I was struggling to get BigQueryToCloudStorageOperator working, and
you just helped me get out of this mess by pointing me to airflow[xxx].
Thanks again.

> Airflow is not able to load hooks
> -
>
> Key: AIRFLOW-312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-312
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>Reporter: dud
>
> Hello
> I installed latest version of Airflow via pip and I noticed that it is now 
> unable to use any hook :
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py 
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in 
> hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script :
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe 
> boto should be added as a dependency ?
> dud





[jira] [Commented] (AIRFLOW-312) Airflow is not able to load hooks

2016-07-06 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364488#comment-15364488
 ] 

dud commented on AIRFLOW-312:
-

OK, my bad. Thanks for your explanation.

> Airflow is not able to load hooks
> -
>
> Key: AIRFLOW-312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-312
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>Reporter: dud
>
> Hello
> I installed latest version of Airflow via pip and I noticed that it is now 
> unable to use any hook :
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py 
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in 
> hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script :
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe 
> boto should be added as a dependency ?
> dud





[jira] [Updated] (AIRFLOW-312) Airflow is not able to load hooks

2016-07-04 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-312:

   Priority: Major  (was: Blocker)
Description: 
Hello

I installed the latest version of Airflow via pip and noticed that it is now
unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow

python /home/airflow/hooks.py 
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in 
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:
{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id="test value")
{code}


Installing boto solves the issue.

I'm not sure whether there is a configuration issue on my system; maybe boto
should be added as a dependency?
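
For what it's worth, the conditional import is easy to probe (a minimal sketch; that airflow[s3] is the extra pulling in boto is my assumption from setup.py):
{code}
# airflow.hooks only exposes S3Hook when boto imports cleanly (e.g. after
# `pip install --user airflow[s3]`); without boto the attribute never
# appears on the module, hence the AttributeError above.
try:
    import boto  # noqa -- optional dependency of S3Hook
    from airflow import hooks
    hook = hooks.S3Hook(s3_conn_id="test value")
except ImportError as exc:
    print("S3 support missing: %s" % exc)
{code}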

dud

  was:
Hello

I installed the latest version of Airflow via pip and noticed that it is now
unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow

python /home/airflow/hooks.py 
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in 
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:
{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id="test value")
{code}

I tried with 1.7.1.2 and it fails as well.
dud

 Issue Type: Improvement  (was: Bug)

> Airflow is not able to load hooks
> -
>
> Key: AIRFLOW-312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-312
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>Reporter: dud
>
> Hello
> I installed latest version of Airflow via pip and I noticed that it is now 
> unable to use any hook :
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py 
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in 
> hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script :
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe 
> boto should be added as a dependency ?
> dud





[jira] [Updated] (AIRFLOW-312) Airflow is not able to load hooks

2016-07-04 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-312:

Description: 
Hello

I installed the latest version of Airflow via pip and noticed that it is now
unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow

python /home/airflow/hooks.py 
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in 
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:
{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id="test value")
{code}

I tried with 1.7.1.2 and it fails as well.
dud

  was:
Hello

I installed the latest version of Airflow via pip and noticed that it is now
unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow

python /home/airflow/hooks.py 
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in 
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:
{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
{code}

I tried with 1.7.1.2 and it fails as well.
dud


> Airflow is not able to load hooks
> -
>
> Key: AIRFLOW-312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-312
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>Reporter: dud
>Priority: Blocker
>
> Hello
> I installed latest version of Airflow via pip and I noticed that it is now 
> unable to use any hook :
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py 
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in 
> hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script :
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> I tried with 1.7.1.2 and it fails as well.
> dud





[jira] [Created] (AIRFLOW-312) Airflow is not able to load hooks

2016-07-04 Thread dud (JIRA)
dud created AIRFLOW-312:
---

 Summary: Airflow is not able to load hooks
 Key: AIRFLOW-312
 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
Reporter: dud
Priority: Blocker


Hello

I installed the latest version of Airflow via pip and noticed that it is now
unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow

python /home/airflow/hooks.py 
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor 
SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from 
/usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in 
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:
{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
{code}

I tried with 1.7.1.2 and it fails as well.
dud





[jira] [Updated] (AIRFLOW-249) Refactor the SLA mechanism

2016-06-21 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-249:

Summary: Refactor the SLA mechanism  (was: Refactor the SLA mecanism)

> Refactor the SLA mechanism
> --
>
> Key: AIRFLOW-249
> URL: https://issues.apache.org/jira/browse/AIRFLOW-249
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: dud
>Assignee: dud
>
> Hello
> I've noticed the SLA feature is currently behaving as follows:
> - it doesn't work on DAGs scheduled @once or None because they have no 
> dag.following_schedule property
> - it keeps endlessly checking for SLA misses without ever worrying about any 
> end_date. Worse I noticed that emails are still being sent for runs that are 
> never happening because of end_date
> - it keeps checking for recent TIs even if an SLA notification has already 
> been sent for them
> - the SLA logic is only being fired after following_schedule + sla has 
> elapsed, in other words one has to wait for the next TI before having a 
> chance of getting any email. Also the email reports dag.following_schedule 
> time (I guess because it is close of TI.start_date), but unfortunately that 
> doesn't match what the task instances shows nor the log filename
> - the SLA logic is based on max(TI.execution_date) for the starting point of 
> its checks, that means that for a DAG whose SLA is longer than its schedule 
> period if half of the TIs are running longer than expected it will go 
> unnoticed. This could be demonstrated with a DAG like this one :
> {code}
> from airflow import DAG
> from airflow.operators import *
> from datetime import datetime, timedelta
> from time import sleep
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime(2016, 6, 16, 12, 20),
> 'email': my_email
> 'sla': timedelta(minutes=2),
> }
> dag = DAG('unnoticed_sla', default_args=default_args, 
> schedule_interval=timedelta(minutes=1))
> def alternating_sleep(**kwargs):
> minute = kwargs['execution_date'].strftime("%M")
> is_odd = int(minute) % 2
> if is_odd:
> sleep(300)
> else:
> sleep(10)
> return True
> PythonOperator(
> task_id='sla_miss',
> python_callable=alternating_sleep,
> provide_context=True,
> dag=dag)
> {code}
> I've tried to rework the SLA triggering mechanism by addressing the above 
> points. Please [have a look at 
> it|https://github.com/dud225/incubator-airflow/commit/972260354075683a8d55a1c960d839c37e629e7d]
> I made some tests with this patch:
> - the fluctuating DAG shown above no longer makes Airflow skip any SLA event:
> {code}
>  task_id  |dag_id |   execution_date| email_sent | 
> timestamp  | description | notification_sent 
> --+---+-+++-+---
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:05:00 | t  | 2016-06-16 
> 15:08:26.058631 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:07:00 | t  | 2016-06-16 
> 15:10:06.093253 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:09:00 | t  | 2016-06-16 
> 15:12:06.241773 | | t
> {code}
> - on a normal DAG, the SLA is triggered more quickly:
> {code}
> // start_date = 2016-06-16 15:55:00
> // end_date = 2016-06-16 16:00:00
> // schedule_interval =  timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |dag_id |   execution_date| email_sent | 
> timestamp  | description | notification_sent 
> --+---+-+++-+---
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:55:00 | t  | 2016-06-16 
> 15:58:11.832299 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:56:00 | t  | 2016-06-16 
> 15:59:09.663778 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:57:00 | t  | 2016-06-16 
> 16:00:13.651422 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:58:00 | t  | 2016-06-16 
> 16:01:08.576399 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:59:00 | t  | 2016-06-16 
> 16:02:08.523486 | | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 16:00:00 | t  | 2016-06-16 
> 16:03:08.538593 | | t
> (6 rows)
> {code}
> than before (current master branch):
> {code}
> // start_date = 2016-06-16 15:40:00
> // end_date = 2016-06-16 15:45:00
> // schedule_interval =  timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |dag_id |   execution_date| email_sent | 
> timestamp  | description | notification_sent 
> 

[jira] [Created] (AIRFLOW-249) Refactor the SLA mecanism

2016-06-16 Thread dud (JIRA)
dud created AIRFLOW-249:
---

 Summary: Refactor the SLA mecanism
 Key: AIRFLOW-249
 URL: https://issues.apache.org/jira/browse/AIRFLOW-249
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: dud


Hello

I've noticed the SLA feature is currently behaving as follows:
- it doesn't work on DAGs scheduled @once or None because they have no
dag.following_schedule property
- it keeps endlessly checking for SLA misses without ever worrying about any
end_date. Worse, I noticed that emails are still being sent for runs that will
never happen because of end_date
- it keeps checking for recent TIs even if an SLA notification has already
been sent for them
- the SLA logic is only fired after following_schedule + sla has elapsed; in
other words, one has to wait for the next TI before having a chance of getting
any email. Also, the email reports the dag.following_schedule time (I guess
because it is close to TI.start_date), but unfortunately that matches neither
what the task instance shows nor the log filename
- the SLA logic uses max(TI.execution_date) as the starting point of its
checks, which means that for a DAG whose SLA is longer than its schedule
period, if half of the TIs run longer than expected it will go unnoticed.
This can be demonstrated with a DAG like this one:
{code}
from airflow import DAG
from airflow.operators import *
from datetime import datetime, timedelta
from time import sleep

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 6, 16, 12, 20),
    'email': my_email,
    'sla': timedelta(minutes=2),
}

dag = DAG('unnoticed_sla', default_args=default_args,
          schedule_interval=timedelta(minutes=1))

def alternating_sleep(**kwargs):
    # Odd minutes sleep for 300s (an SLA miss), even minutes for 10s.
    minute = kwargs['execution_date'].strftime("%M")
    is_odd = int(minute) % 2
    if is_odd:
        sleep(300)
    else:
        sleep(10)
    return True

PythonOperator(
    task_id='sla_miss',
    python_callable=alternating_sleep,
    provide_context=True,
    dag=dag)
{code}
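
To make the timing concrete, the current rule boils down to this check (an illustrative simplification, not the actual scheduler code):
{code}
from datetime import datetime, timedelta

def sla_missed(execution_date, schedule_interval, sla, now):
    # An email can only fire once following_schedule + sla has elapsed,
    # hence the one-interval delay described above.
    following_schedule = execution_date + schedule_interval
    return now > following_schedule + sla

# A TI scheduled at 15:05 with a 1-minute interval and a 2-minute SLA
# only qualifies as a miss after 15:08.
print(sla_missed(datetime(2016, 6, 16, 15, 5), timedelta(minutes=1),
                 timedelta(minutes=2), datetime(2016, 6, 16, 15, 8, 30)))
{code}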


I've tried to rework the SLA triggering mechanism by addressing the above
points. Please [have a look at
it|https://github.com/dud225/incubator-airflow/commit/972260354075683a8d55a1c960d839c37e629e7d]

I made some tests with this patch:
- the fluctuating DAG shown above no longer makes Airflow skip any SLA event:
{code}
 task_id  |dag_id |   execution_date| email_sent | 
timestamp  | description | notification_sent 
--+---+-+++-+---
 sla_miss | dag_sla_miss1 | 2016-06-16 15:05:00 | t  | 2016-06-16 
15:08:26.058631 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:07:00 | t  | 2016-06-16 
15:10:06.093253 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:09:00 | t  | 2016-06-16 
15:12:06.241773 | | t
{code}
- on a normal DAG, the SLA is triggered more quickly:
{code}
// start_date = 2016-06-16 15:55:00
// end_date = 2016-06-16 16:00:00
// schedule_interval =  timedelta(minutes=1)
// sla = timedelta(minutes=2)

 task_id  |dag_id |   execution_date| email_sent | 
timestamp  | description | notification_sent 
--+---+-+++-+---
 sla_miss | dag_sla_miss1 | 2016-06-16 15:55:00 | t  | 2016-06-16 
15:58:11.832299 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:56:00 | t  | 2016-06-16 
15:59:09.663778 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:57:00 | t  | 2016-06-16 
16:00:13.651422 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:58:00 | t  | 2016-06-16 
16:01:08.576399 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:59:00 | t  | 2016-06-16 
16:02:08.523486 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 16:00:00 | t  | 2016-06-16 
16:03:08.538593 | | t
(6 rows)

{code}
than before (current master branch):
{code}
// start_date = 2016-06-16 15:40:00
// end_date = 2016-06-16 15:45:00
// schedule_interval =  timedelta(minutes=1)
// sla = timedelta(minutes=2)

 task_id  |dag_id |   execution_date| email_sent | 
timestamp  | description | notification_sent 
--+---+-+++-+---
 sla_miss | dag_sla_miss1 | 2016-06-16 15:41:00 | t  | 2016-06-16 
15:44:30.305287 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:42:00 | t  | 2016-06-16 
15:45:35.372118 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:43:00 | t  | 2016-06-16 
15:46:30.415744 | | t
 sla_miss | dag_sla_miss1 | 2016-06-16 

[jira] [Resolved] (AIRFLOW-140) DagRun state not updated

2016-06-16 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud resolved AIRFLOW-140.
-
Resolution: Fixed

This issue has been solved by commit 
[b18c9959142f3f1e2cb031c8709225af01192e32|https://github.com/apache/incubator-airflow/commit/b18c9959142f3f1e2cb031c8709225af01192e32].

[~bolke] many thanks for your work :)


> DagRun state not updated
> 
>
> Key: AIRFLOW-140
> URL: https://issues.apache.org/jira/browse/AIRFLOW-140
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
> Environment: Airflow latest Git version
>Reporter: dud
>Priority: Minor
>
> Hello
> I've noticed a strange behaviour : when launching a DAG whose task execution 
> duration is alternatingly slower and longer, DagRun state is only updated if 
> all previous DagRuns have ended.
> Here is DAG that can trigger this behaviour :
> {code}
> from airflow import DAG
> from airflow.operators import *
> from datetime import datetime, timedelta
> from time import sleep
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime(2016, 5, 19, 10, 15),
> 'end_date': datetime(2016, 5, 19, 10, 20),
> }
> dag = DAG('dagrun_not_updated', default_args=default_args, 
> schedule_interval=timedelta(minutes=1))
> def alternating_sleep(**kwargs):
> minute = kwargs['execution_date'].strftime("%M")
> is_odd = int(minute) % 2
> if is_odd:
> sleep(300)
> else:
> sleep(10)
> return True
> PythonOperator(
> task_id='alt_sleep',
> python_callable=alternating_sleep,
> provide_context=True,
> dag=dag)
> {code}
> When this operator is executed, being run at an even minute makes the TI runs 
> faster than an odd one.
> I'm observing the following behaviour :
> - after some time, the second DagRun is still in running state despite having
> ended a while ago:
> {code}
> airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY 
> execution_date ;  SELECT * FROM dag_run WHERE dag_id = :dag_id ;
>   task_id  |   dag_id   |   execution_date| start_date
>  |  end_date  | duration  |  state  | try_number | 
> hostname  | unixname | job_id | pool |  queue  | priority_weight |
> operator| queued_dttm
> --+---+-+++---+-++---+--++--+-+-++-
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:15:00 | 2016-05-19 
> 10:17:19.039565 ||   | running |  
> 1 | localhost | airflow  |   3196 |  | default |   1 | 
> PythonOperator |
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:16:00 | 2016-05-19 
> 10:17:23.698928 | 2016-05-19 10:17:33.823066 | 10.124138 | success |  
> 1 | localhost | airflow  |   3197 |  | default |   1 | 
> PythonOperator |
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:17:00 | 2016-05-19 
> 10:18:03.025546 ||   | running |  
> 1 | localhost | airflow  |   3198 |  | default |   1 | 
> PythonOperator |
> (3 rows)
>   id  |   dag_id   |   execution_date|  state  | 
> run_id | external_trigger | conf | end_date | start_date  
>   
> --+---+-+-++--+--+--+
>  1479 | dagrun_not_updated | 2016-05-19 10:15:00 | running | 
> scheduled__2016-05-19T10:15:00 | f|  |  | 
> 2016-05-19 10:17:06.563842
>  1480 | dagrun_not_updated | 2016-05-19 10:16:00 | running | 
> scheduled__2016-05-19T10:16:00 | f|  |  | 
> 2016-05-19 10:17:12.188781
>  1481 | dagrun_not_updated | 2016-05-19 10:17:00 | running | 
> scheduled__2016-05-19T10:17:00 | f|  |  | 
> 2016-05-19 10:18:01.550625
> (3 rows)
> {code}
> - after some time, all reportedly still-running DagRuns are being marked as
> successful at the same time:
> {code}
> 2016-05-19 10:23:11 UTC [12073-18] airflow@airflow LOG:  duration: 0.168 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1479
> 2016-05-19 10:23:11 UTC [12073-19] airflow@airflow LOG:  duration: 0.106 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1480
> 2016-05-19 10:23:11 UTC [12073-20] airflow@airflow LOG:  duration: 0.083 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1481
> 2016-05-19 10:23:11 UTC [12073-21] airflow@airflow LOG:  duration: 0.081 ms  
> statement: UPDATE dag_run SET state='success' WHERE 

[jira] [Created] (AIRFLOW-223) Make the IP on which Flower binds configurable

2016-06-09 Thread dud (JIRA)
dud created AIRFLOW-223:
---

 Summary: Make the IP on which Flower binds configurable
 Key: AIRFLOW-223
 URL: https://issues.apache.org/jira/browse/AIRFLOW-223
 Project: Apache Airflow
  Issue Type: Improvement
  Components: celery, webserver
Reporter: dud
Priority: Minor


Hello

I'd be glad if Flower could be set to bind to localhost only, as can be done
for Gunicorn.
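
A hypothetical way the CLI could forward it (flower itself already accepts --address; the argument wiring below is an assumption, not existing Airflow code):
{code}
import subprocess

def flower(args):
    # Bind flower to the requested interface, e.g. 127.0.0.1, mirroring
    # what `airflow webserver --hostname` already does for gunicorn.
    subprocess.check_call([
        "flower",
        "--address=%s" % args.hostname,
        "--port=%s" % args.port,
    ])
{code}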
dud





[jira] [Comment Edited] (AIRFLOW-140) DagRun state not updated

2016-05-23 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292963#comment-15292963
 ] 

dud edited comment on AIRFLOW-140 at 5/23/16 9:49 AM:
--

Hello.
I tried with the LocalExecutor as requested and I observed the same behaviour:
{code}
airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY 
execution_date ; SELECT * FROM dag_run WHERE dag_id = :dag_id ; SELECT * FROM 
job ORDER BY start_date DESC LIMIT 5;
 task_id   |   dag_id   |   execution_date| start_date  
   |  end_date  | duration  |  state  | try_number | hostname  
| unixname | job_id | pool |  queue  | priority_weight |operator| 
queued_dttm
--+---+-+++---+-++---+--++--+-+-++-
 alt_sleep | dagrun_not_updated | 2016-05-20 07:45:00 | 2016-05-20 
07:46:54.372843 ||   | running |  1 
| localhost | airflow  |   3203 |  | default |   1 | 
PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:46:00 | 2016-05-20 
07:47:19.317705 | 2016-05-20 07:47:29.453316 | 10.135611 | success |  1 
| localhost | airflow  |   3204 |  | default |   1 | 
PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:47:00 | 2016-05-20 
07:48:01.724885 ||   | running |  1 
| localhost | airflow  |   3205 |  | default |   1 | 
PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:48:00 | 2016-05-20 
07:49:12.031225 | 2016-05-20 07:49:22.083763 | 10.052538 | success |  1 
| localhost | airflow  |   3206 |  | default |   1 | 
PythonOperator |
(4 rows)

  id  |dag_id  |   execution_date|  state  | run_id 
| external_trigger | conf | end_date | start_date
--+---+-+-++--+--+--+
 1485 | dagrun_not_updated | 2016-05-20 07:45:00 | running | 
scheduled__2016-05-20T07:45:00 | f|  |  | 
2016-05-20 07:46:38.30924
 1486 | dagrun_not_updated | 2016-05-20 07:46:00 | running | 
scheduled__2016-05-20T07:46:00 | f|  |  | 
2016-05-20 07:47:01.563541
 1487 | dagrun_not_updated | 2016-05-20 07:47:00 | running | 
scheduled__2016-05-20T07:47:00 | f|  |  | 
2016-05-20 07:48:00.016718
 1488 | dagrun_not_updated | 2016-05-20 07:48:00 | running | 
scheduled__2016-05-20T07:48:00 | f|  |  | 
2016-05-20 07:49:00.203204
(4 rows)

  id  | dag_id |  state  |   job_type   | start_date |  
end_date  |  latest_heartbeat  | executor_class | hostname  | 
unixname 
--++-+--+++++---+--
 3206 || success | LocalTaskJob | 2016-05-20 07:49:08.691714 | 
2016-05-20 07:49:23.706144 | 2016-05-20 07:49:08.691725 | LocalExecutor  | 
localhost | airflow
 3205 || running | LocalTaskJob | 2016-05-20 07:48:01.155988 |  
  | 2016-05-20 07:50:51.312164 | LocalExecutor  | localhost | 
airflow
 3204 || success | LocalTaskJob | 2016-05-20 07:47:16.153078 | 
2016-05-20 07:47:31.168997 | 2016-05-20 07:47:16.153091 | LocalExecutor  | 
localhost | airflow
 3203 || running | LocalTaskJob | 2016-05-20 07:46:48.198379 |  
  | 2016-05-20 07:50:53.42636  | LocalExecutor  | localhost | 
airflow
 3202 || running | SchedulerJob | 2016-05-20 07:45:31.43799  |  
  | 2016-05-20 07:50:55.061958 | LocalExecutor  | localhost | 
airflow
{code}

Extract of database logs:
{code}
2016-05-20 07:47:31 UTC [24003-36] airflow@airflow LOG:  duration: 38.731 ms  
statement: UPDATE job SET state='success', 
end_date='2016-05-20T07:47:31.168997'::timestamp, 
latest_heartbeat='2016-05-20T07:47:16.153091'::timestamp WHERE job.id = 3204
2016-05-20 07:49:23 UTC [24107-36] airflow@airflow LOG:  duration: 0.179 ms  
statement: UPDATE job SET state='success', 
end_date='2016-05-20T07:49:23.706144'::timestamp, 
latest_heartbeat='2016-05-20T07:49:08.691725'::timestamp WHERE job.id = 3206
2016-05-20 07:52:03 UTC [23971-336] airflow@airflow LOG:  duration: 0.291 ms  
statement: UPDATE job SET state='success', 
end_date='2016-05-20T07:52:03.526927'::timestamp, 
latest_heartbeat='2016-05-20T07:46:48.198389'::timestamp WHERE job.id = 3203
2016-05-20 07:53:06 UTC [24047-326] airflow@airflow LOG:  duration: 0.179 

[jira] [Commented] (AIRFLOW-140) DagRun state not updated

2016-05-19 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291115#comment-15291115
 ] 

dud commented on AIRFLOW-140:
-

Latest version from Git (8ae86814b3cf5cd65e3c6b04c6c700139d8cc20b)

> DagRun state not updated
> 
>
> Key: AIRFLOW-140
> URL: https://issues.apache.org/jira/browse/AIRFLOW-140
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: Airflow latest Git version
>Reporter: dud
>Priority: Minor
>
> Hello
> I've noticed a strange behaviour : when launching a DAG whose task execution 
> duration is alternatingly slower and longer, DagRun state is only updated if 
> all previous DagRuns have ended.
> Here is DAG that can trigger this behaviour :
> {code}
> from airflow import DAG
> from airflow.operators import *
> from datetime import datetime, timedelta
> from time import sleep
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime(2016, 5, 19, 10, 15),
> 'end_date': datetime(2016, 5, 19, 10, 20),
> }
> dag = DAG('dagrun_not_updated', default_args=default_args, 
> schedule_interval=timedelta(minutes=1))
> def alternating_sleep(**kwargs):
> minute = kwargs['execution_date'].strftime("%M")
> is_odd = int(minute) % 2
> if is_odd:
> sleep(300)
> else:
> sleep(10)
> return True
> PythonOperator(
> task_id='alt_sleep',
> python_callable=alternating_sleep,
> provide_context=True,
> dag=dag)
> {code}
> When this operator is executed, being run at an even minute makes the TI runs 
> faster than an odd one.
> I'm observing the following behaviour :
> - after some time, the second DagRun is still in running state despite having
> ended a while ago:
> {code}
> airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY 
> execution_date ;  SELECT * FROM dag_run WHERE dag_id = :dag_id ;
>  task_id   |       dag_id       |   execution_date    |         start_date         |          end_date          | duration  |  state  | try_number | hostname  | unixname | job_id | pool |  queue  | priority_weight |    operator    | queued_dttm
> -----------+--------------------+---------------------+----------------------------+----------------------------+-----------+---------+------------+-----------+----------+--------+------+---------+-----------------+----------------+-------------
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:15:00 | 2016-05-19 10:17:19.039565 |                            |           | running |          1 | localhost | airflow  |   3196 |      | default |               1 | PythonOperator |
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:16:00 | 2016-05-19 10:17:23.698928 | 2016-05-19 10:17:33.823066 | 10.124138 | success |          1 | localhost | airflow  |   3197 |      | default |               1 | PythonOperator |
>  alt_sleep | dagrun_not_updated | 2016-05-19 10:17:00 | 2016-05-19 10:18:03.025546 |                            |           | running |          1 | localhost | airflow  |   3198 |      | default |               1 | PythonOperator |
> (3 rows)
>
>   id  |       dag_id       |   execution_date    |  state  |             run_id             | external_trigger | conf | end_date |         start_date
> ------+--------------------+---------------------+---------+--------------------------------+------------------+------+----------+----------------------------
>  1479 | dagrun_not_updated | 2016-05-19 10:15:00 | running | scheduled__2016-05-19T10:15:00 | f                |      |          | 2016-05-19 10:17:06.563842
>  1480 | dagrun_not_updated | 2016-05-19 10:16:00 | running | scheduled__2016-05-19T10:16:00 | f                |      |          | 2016-05-19 10:17:12.188781
>  1481 | dagrun_not_updated | 2016-05-19 10:17:00 | running | scheduled__2016-05-19T10:17:00 | f                |      |          | 2016-05-19 10:18:01.550625
> (3 rows)
> {code}
> - after some time, all reportedly still-running DagRuns are marked as 
> successful at the same time:
> {code}
> 2016-05-19 10:23:11 UTC [12073-18] airflow@airflow LOG:  duration: 0.168 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1479
> 2016-05-19 10:23:11 UTC [12073-19] airflow@airflow LOG:  duration: 0.106 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1480
> 2016-05-19 10:23:11 UTC [12073-20] airflow@airflow LOG:  duration: 0.083 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1481
> 2016-05-19 10:23:11 UTC [12073-21] airflow@airflow LOG:  duration: 0.081 ms  
> statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1482
> {code}
> So it waited until the 4th DagRun ended before updating the dag_run table.
> I've looked at the code and I'm not sure whether the issue lies in 

[jira] [Created] (AIRFLOW-140) DagRun state not updated

2016-05-19 Thread dud (JIRA)
dud created AIRFLOW-140:
---

 Summary: DagRun state not updated
 Key: AIRFLOW-140
 URL: https://issues.apache.org/jira/browse/AIRFLOW-140
 Project: Apache Airflow
  Issue Type: Bug
 Environment: Airflow latest Git version
Reporter: dud
Priority: Minor


Hello

I've noticed a strange behaviour: when launching a DAG whose task execution 
duration alternates between fast and slow runs, the DagRun state is only 
updated once all previous DagRuns have ended.

Here is a DAG that can trigger this behaviour:
{code}
from airflow import DAG
from airflow.operators import *
from datetime import datetime, timedelta
from time import sleep

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 19, 10, 15),
    'end_date': datetime(2016, 5, 19, 10, 20),
}

dag = DAG('dagrun_not_updated', default_args=default_args,
          schedule_interval=timedelta(minutes=1))

def alternating_sleep(**kwargs):
    # runs at odd minutes sleep for 5 minutes, runs at even minutes for 10 s
    minute = kwargs['execution_date'].strftime("%M")
    is_odd = int(minute) % 2
    if is_odd:
        sleep(300)
    else:
        sleep(10)
    return True

PythonOperator(
    task_id='alt_sleep',
    python_callable=alternating_sleep,
    provide_context=True,
    dag=dag)
{code}

When this operator is executed, a run at an even minute makes the TI run 
faster than one at an odd minute.

I'm observing the following behaviour:
- after some time, the second DagRun is still in running state despite 
having ended a while ago:
{code}
airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY 
execution_date ;  SELECT * FROM dag_run WHERE dag_id = :dag_id ;
 task_id   |       dag_id       |   execution_date    |         start_date         |          end_date          | duration  |  state  | try_number | hostname  | unixname | job_id | pool |  queue  | priority_weight |    operator    | queued_dttm
-----------+--------------------+---------------------+----------------------------+----------------------------+-----------+---------+------------+-----------+----------+--------+------+---------+-----------------+----------------+-------------
 alt_sleep | dagrun_not_updated | 2016-05-19 10:15:00 | 2016-05-19 10:17:19.039565 |                            |           | running |          1 | localhost | airflow  |   3196 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-19 10:16:00 | 2016-05-19 10:17:23.698928 | 2016-05-19 10:17:33.823066 | 10.124138 | success |          1 | localhost | airflow  |   3197 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-19 10:17:00 | 2016-05-19 10:18:03.025546 |                            |           | running |          1 | localhost | airflow  |   3198 |      | default |               1 | PythonOperator |
(3 rows)

  id  |       dag_id       |   execution_date    |  state  |             run_id             | external_trigger | conf | end_date |         start_date
------+--------------------+---------------------+---------+--------------------------------+------------------+------+----------+----------------------------
 1479 | dagrun_not_updated | 2016-05-19 10:15:00 | running | scheduled__2016-05-19T10:15:00 | f                |      |          | 2016-05-19 10:17:06.563842
 1480 | dagrun_not_updated | 2016-05-19 10:16:00 | running | scheduled__2016-05-19T10:16:00 | f                |      |          | 2016-05-19 10:17:12.188781
 1481 | dagrun_not_updated | 2016-05-19 10:17:00 | running | scheduled__2016-05-19T10:17:00 | f                |      |          | 2016-05-19 10:18:01.550625
(3 rows)
{code}

- after some time, all reportedly still-running DagRuns are marked as 
successful at the same time:
{code}
2016-05-19 10:23:11 UTC [12073-18] airflow@airflow LOG:  duration: 0.168 ms  
statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1479
2016-05-19 10:23:11 UTC [12073-19] airflow@airflow LOG:  duration: 0.106 ms  
statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1480
2016-05-19 10:23:11 UTC [12073-20] airflow@airflow LOG:  duration: 0.083 ms  
statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1481
2016-05-19 10:23:11 UTC [12073-21] airflow@airflow LOG:  duration: 0.081 ms  
statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1482
{code}

So it waited until the 4th DagRun ended before updating the dag_run table.
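
To make the lag easier to observe while the DAG is running, here is a minimal 
polling sketch against the metadata database. This is an illustration only: it 
assumes SQLAlchemy is installed and a Postgres metadata DB, and the connection 
URI is a placeholder.
{code}
# Sketch: print each DagRun next to its task instance state, so a finished
# TI paired with a still 'running' DagRun stands out immediately.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://airflow:airflow@localhost/airflow")

QUERY = text("""
    SELECT dr.execution_date,
           dr.state AS dagrun_state,
           ti.state AS ti_state,
           ti.end_date
      FROM dag_run dr
      LEFT JOIN task_instance ti
        ON ti.dag_id = dr.dag_id
       AND ti.execution_date = dr.execution_date
     WHERE dr.dag_id = :dag_id
     ORDER BY dr.execution_date
""")

with engine.connect() as conn:
    for row in conn.execute(QUERY, {"dag_id": "dagrun_not_updated"}):
        print(row)
{code}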

I've looked at the code and I'm not sure whether the issue lies in Airflow, as 
the scheduler properly runs the code that updates the state to success:
{code}
May 19 10:17:36 airflow-ec2 airflow-scheduler[11543]: [2016-05-19 10:17:36,542] 
{models.py:2725} INFO - Marking run  
successful
May 19 10:17:41 airflow-ec2 airflow-scheduler[11543]: [2016-05-19 10:17:41,666] 
{models.py:2725} INFO - Marking run  
successful
May 19 10:17:51 airflow-ec2 airflow-scheduler[11543]: [2016-05-19 

[jira] [Updated] (AIRFLOW-127) Makes filter_by_owner aware of multi-owner DAG

2016-05-18 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-127:

Description: 
Hello

When enabling the _webserver.filter_by_owner_ setting, Airflow issues a query 
like this one:
{code}
SELECT * FROM dag WHERE NOT dag.is_subdag AND dag.is_active AND dag.owners = 
username
{code}
Unfortunately that doesn't account for DAGs authored by multiple users.

Please see my PR: https://github.com/apache/incubator-airflow/pull/1508
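
For illustration only, here is a hedged sketch of what a multi-owner-aware 
filter could look like, assuming the owners column stores a comma-separated 
string such as "alice, bob"; the actual PR may take a different approach.
{code}
# Sketch: replace the equality test with a substring match so multi-owner
# DAGs are found. Caveat: this also matches usernames that happen to be
# substrings of one another.
from airflow.models import DagModel

def dags_visible_to(session, username):
    return (
        session.query(DagModel)
        .filter(~DagModel.is_subdag, DagModel.is_active)
        .filter(DagModel.owners.like('%' + username + '%'))
        .all()
    )
{code}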

dud

  was:
Hello

When enabling the _webserver.filter_by_owner_ setting, Airflow issues a query 
like this one:
{code}
SELECT * FROM dag WHERE NOT dag.is_subdag AND dag.is_active AND dag.owners = 
username
{code}
Unfortunately that doesn't account for DAGs authored by multiple users.

dud


> Makes filter_by_owner aware of multi-owner DAG
> --
>
> Key: AIRFLOW-127
> URL: https://issues.apache.org/jira/browse/AIRFLOW-127
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: dud
>Priority: Minor
>
> Hello
> When enabling the _webserver.filter_by_owner_ setting, Airflow issues a 
> query like this one:
> {code}
> SELECT * FROM dag WHERE NOT dag.is_subdag AND dag.is_active AND dag.owners = 
> username
> {code}
> Unfortunately that doesn't account for DAGs authored by multiple users.
> Please see my PR: https://github.com/apache/incubator-airflow/pull/1508
> dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-121) Documenting dag doc_md feature

2016-05-16 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-121:

Description: 
Dear Airflow Maintainers,

I added a note about DAG documentation.

I'd be glad if my PR were merged: 
https://github.com/apache/incubator-airflow/pull/1493

Regards
dud


  was:
Dear Airflow Maintainers,

I added a note about DAG documentation.

I'd be glad if my PR were merged.

Regards
dud



> Documenting dag doc_md feature
> --
>
> Key: AIRFLOW-121
> URL: https://issues.apache.org/jira/browse/AIRFLOW-121
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: dud
>Priority: Trivial
>
> Dear Airflow Maintainers,
> I added a note about DAG documentation.
> I'd be glad if my PR were merged: 
> https://github.com/apache/incubator-airflow/pull/1493
> Regards
> dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-106) email_on_failure isn't being triggered

2016-05-11 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-106:

Description: 
Hello.

I created the following workflow:

{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email': ['airf...@airflow.com'],  # address masked by the mail archive
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'

dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')  # test_var was never declared, so this raises

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}

I forgot to declare test_var, so when this DAG launched it failed quickly. 
However, no failure email was ever sent. Clearing the failed task to make it 
rerun doesn't trigger any email either.

Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - 

Starting attempt 1 of 2


[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing 
<Task(PythonOperator): sleep1> on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not 
exist
Traceback (most recent call last):
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py",
 line 1265, in run
result = task_copy.execute(context=context)
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/operators/python_operator.py",
 line 66, in execute
return_value = self.python_callable(*self.op_args, **self.op_kwargs)
  File "/var/lib/airflow/airflow/dags/test.py", line 31, in sleep1_function
return Variable.get('test_var')
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/utils/db.py",
 line 53, in wrapper
result = func(*args, **kwargs)
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py",
 line 3145, in get
raise ValueError('Variable {} does not exist'.format(key))
ValueError: Variable test_var does not exist
[2016-05-11 15:55:03,581] {models.py:1318} INFO - Marking task as UP_FOR_RETRY
[2016-05-11 15:55:03,759] {models.py:1347} ERROR - Variable test_var does not 
exist
{code}

In the DAG Runs page, the workflow is marked as failed. In the task instances 
page, it is set as up_for_retry, but no new run is ever scheduled.

I tried incrementing the retries parameter, but nothing changes: Airflow 
never retries after the first run.
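
As a stop-gap while this is investigated, an explicit on_failure_callback can 
send the alert itself. A hedged workaround sketch, not a fix: the send_email 
helper's module path has moved between Airflow releases, and the address is a 
placeholder.
{code}
# Fire our own email from a failure callback instead of relying on
# email_on_failure. Assumes send_email is importable from
# airflow.utils.email (its location varies across Airflow versions).
from airflow.utils.email import send_email

def notify_failure(context):
    ti = context['task_instance']
    send_email(
        to=['alerts@example.com'],  # placeholder address
        subject='Airflow task failed: %s.%s' % (ti.dag_id, ti.task_id),
        html_content='Execution date: %s' % context['execution_date'],
    )

# add to default_args before tasks are created so every task inherits it
default_args['on_failure_callback'] = notify_failure
{code}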

dud

  was:
Hello.

I created the following workflow:

{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email': ['airf...@airflow.com'],  # address masked by the mail archive
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'

dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}

I forgot to declare test_var, so when this DAG launched it failed quickly. 
However, no failure email was ever sent. Clearing the failed task to make it 
rerun doesn't trigger any email either.

Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - 

Starting attempt 1 of 2


[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing 
<Task(PythonOperator): sleep1> on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not 
exist
Traceback (most recent call last):
  File 

[jira] [Created] (AIRFLOW-106) email_on_failure isn't being triggered

2016-05-11 Thread dud (JIRA)
dud created AIRFLOW-106:
---

 Summary: email_on_failure isn't being triggered
 Key: AIRFLOW-106
 URL: https://issues.apache.org/jira/browse/AIRFLOW-106
 Project: Apache Airflow
  Issue Type: Bug
 Environment: Latest version from Git
Reporter: dud


Hello.

I created the following workflow:

{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email': ['airf...@airflow.com'],  # address masked by the mail archive
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'

dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}

I forgot to declare test_var, so when this DAG launched it failed quickly. 
However, no failure email was ever sent. Clearing the failed task to make it 
rerun doesn't trigger any email either.

Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from 
/var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - 

Starting attempt 1 of 2


[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing 
<Task(PythonOperator): sleep1> on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not 
exist
Traceback (most recent call last):
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py",
 line 1265, in run
result = task_copy.execute(context=context)
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/operators/python_operator.py",
 line 66, in execute
return_value = self.python_callable(*self.op_args, **self.op_kwargs)
  File "/var/lib/airflow/airflow/dags/test.py", line 31, in sleep1_function
return Variable.get('test_var')
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/utils/db.py",
 line 53, in wrapper
result = func(*args, **kwargs)
  File 
"/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py",
 line 3145, in get
raise ValueError('Variable {} does not exist'.format(key))
ValueError: Variable test_var does not exist
[2016-05-11 15:55:03,581] {models.py:1318} INFO - Marking task as UP_FOR_RETRY
[2016-05-11 15:55:03,759] {models.py:1347} ERROR - Variable test_var does not 
exist
{code}

In the DAG Runs page, the workflow is marked as failed. In the task instances 
page, it is set as up_for_retry, but no new run is ever scheduled.

I tried incrementing the retries parameter, but nothing changes: Airflow 
never retries after the first run.

dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-39) DagRun scheduled after end_date

2016-05-06 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273774#comment-15273774
 ] 

dud commented on AIRFLOW-39:


Great news. Thanks.

> DagRun scheduled after end_date
> ---
>
> Key: AIRFLOW-39
> URL: https://issues.apache.org/jira/browse/AIRFLOW-39
> Project: Apache Airflow
>  Issue Type: Wish
>Affects Versions: Airflow 1.7.0
> Environment: -  Airflow version: v1.7.0
> -  Airflow components: webserver and scheduler with a postgres database and 
> LocalExecutor
> -  Python Version: 2.7.9
>Reporter: dud
>Assignee: Siddharth Anand
>Priority: Minor
> Attachments: 1_7_0_dag_runs_past_date_running.png, 
> Head_master_dag_runs_past_date_failed.png, 
> Jobs_view_only_shows_runs_within_start_end_range.png, 
> Task_instance_view_only_shows_TIs_within_start_end_range.png, 
> Tree_view_only_shows_runs_within_start_end_range.png, tutorial4.png, 
> tutorial4_grpah.png
>
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as 
> intended. However the end_date parameter is not taken into account when 
> creating a DagRun, so DagRuns scheduled after end_date are displayed in the 
> web interface and stay in the running state forever, as the scheduler 
> doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when 
> creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-39) DagRun scheduled after end_date

2016-05-04 Thread dud (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dud updated AIRFLOW-39:
---
Attachment: tutorial4_grpah.png
tutorial4.png

> DagRun scheduled after end_date
> ---
>
> Key: AIRFLOW-39
> URL: https://issues.apache.org/jira/browse/AIRFLOW-39
> Project: Apache Airflow
>  Issue Type: Wish
> Environment: -  Airflow version: v1.7.0
> -  Airflow components: webserver and scheduler with a postgres database and 
> LocalExecutor
> -  Python Version: 2.7.9
>Reporter: dud
>Assignee: Siddharth Anand
>Priority: Minor
> Attachments: tutorial4.png, tutorial4_grpah.png
>
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as 
> intended. However the end_date parameter is not taken into account when 
> creating a DagRun, so DagRuns scheduled after end_date are displayed in the 
> web interface and stay in the running state forever, as the scheduler 
> doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when 
> creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-39) DagRun scheduled after end_date

2016-05-04 Thread dud (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15271009#comment-15271009
 ] 

dud commented on AIRFLOW-39:


{code}
from airflow import DAG
from airflow.operators import BashOperator
from datetime import datetime, timedelta


default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 4, 16, 40),
    'end_date': datetime(2016, 5, 4, 16, 50),
    'email': ['airf...@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
}

dag = DAG('tutorial4', default_args=default_args,
          schedule_interval=timedelta(minutes=5))

# t1, t2 and t3 are examples of tasks created by instantiating operators
t1 = BashOperator(
    task_id='print_date',
    bash_command='date',
    dag=dag)

t2 = BashOperator(
    task_id='sleep',
    bash_command='sleep 5',
    retries=3,
    dag=dag)

templated_command = """
{% for i in range(5) %}
    echo "{{ ds }}"
    echo "{{ macros.ds_add(ds, 7)}}"
    echo "{{ params.my_param }}"
{% endfor %}
"""

t3 = BashOperator(
    task_id='templated',
    bash_command=templated_command,
    params={'my_param': 'Parameter I passed in'},
    dag=dag)

t2.set_upstream(t1)
t3.set_upstream(t1)
{code}

Tree view just shows runs between 16:40 and 16:50.

Please note that I tried with the latest version from GitHub.
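
For clarity, this is the kind of guard being requested: a minimal sketch of 
the intent under stated assumptions, not the actual scheduler code.
{code}
# Conceptual sketch only: skip creating a DagRun whose execution_date falls
# after the DAG's end_date, mirroring what the UI views already do.
def should_create_dagrun(dag, execution_date):
    if dag.end_date is not None and execution_date > dag.end_date:
        return False  # past end_date: no further runs should be scheduled
    return True
{code}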

> DagRun scheduled after end_date
> ---
>
> Key: AIRFLOW-39
> URL: https://issues.apache.org/jira/browse/AIRFLOW-39
> Project: Apache Airflow
>  Issue Type: Wish
> Environment: -  Airflow version: v1.7.0
> -  Airflow components: webserver and scheduler with a postgres database and 
> LocalExecutor
> -  Python Version: 2.7.9
>Reporter: dud
>Assignee: Siddharth Anand
>Priority: Minor
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as 
> intended. However the end_date parameter is not taken into account when 
> creating a DagRun, so DagRuns scheduled after end_date are displayed in the 
> web interface and stay in the running state forever, as the scheduler 
> doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when 
> creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)