[jira] [Updated] (AIRFLOW-1565) Incompatibility with Flask-Login-0.4.0
[ https://issues.apache.org/jira/browse/AIRFLOW-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-1565:
-------------------------
Description:

Hello

I've noticed that Airflow v1.8.2 isn't playing well with flask_login v0.4.0:

{code}
$ pip install --user flask_login==0.4.0
Collecting flask_login==0.4.0
  Using cached Flask_Login-0.4.0-py2.py3-none-any.whl
Requirement already satisfied: Flask in ./.local/lib/python2.7/site-packages (from flask_login==0.4.0)
Requirement already satisfied: itsdangerous>=0.21 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Jinja2>=2.4 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Werkzeug>=0.7 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: click>=2.0 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: MarkupSafe>=0.23 in ./.local/lib/python2.7/site-packages (from Jinja2>=2.4->Flask->flask_login==0.4.0)
Installing collected packages: flask-login
  Found existing installation: Flask-Login 0.3.2
    Uninstalling Flask-Login-0.3.2:
      Successfully uninstalled Flask-Login-0.3.2
Successfully installed flask-login-0.4.0

$ airflow webserver
[2017-09-05 14:30:50,130] {dag_processing.py:627} INFO - Started a process (PID: 22751) to generate tasks for /home/airflow/dags/metadsp/APB_process_table/airflow.py - logging into /var/log/airflow/scheduler/2017-09-05
[2017-09-05 14:30:50,133] {dag_processing.py:627} INFO - Started a process (PID: 22752) to generate tasks for /home/airflow/dags/metadsp/APN_segment_manager/segment_manager_factory.py - logging into /var/log/airflow/sc
[2017-09-05 14:30:50,134] {jobs.py:1443} INFO - Heartbeating the executor
[2017-09-05 14:30:50,192] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 14:30:50,252] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 14:30:50,273] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to "CSRFProtect" and will be removed in 1.0.
  csrf = CsrfProtect()
/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  from flask.ext.cache import make_template_fragment_key
(Airflow ASCII-art banner)
Traceback (most recent call last):
  File "/home/airflow/bin/airflow", line 28, in <module>
    args.func(args)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/bin/cli.py", line 734, in webserver
    app = cached_app(conf)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", line 161, in cached_app
    app = create_app(config)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", line 60, in create_app
    from airflow.www import views
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/views.py", line 41, in <module>
    from flask_login import flash
ImportError: cannot import name flash

$ python -c "from flask_login import flash"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name flash
{code}

-Downgrading to flask_login==0.3.2 fixes the issue.-

EDIT: Downgrading to flask_login==0.2.11 fixes the issue. Due to:

{quote}
Version 0.3.0 - Released on September 10th, 2015
[...]
- BREAKING: The `is_authenticated`, `is_active`, and `is_anonymous` members of the user class are now properties, not methods. Applications should update their user classes accordingly.
{quote}

So for now Airflow isn't working with flask_login >= v0.3.0 either.
dud
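The quoted Flask-Login 0.3.0 changelog entry is the core of the incompatibility: code written against the old API calls `user.is_authenticated()`, while 0.3.0+ exposes it as a property. A minimal sketch (both class names are made up for illustration) of the difference:

```python
# Sketch of the Flask-Login 0.3.0 breaking change described above.
# Pre-0.3.0 user classes exposed is_authenticated as a method; from
# 0.3.0 on, Flask-Login expects a property, so reading the attribute
# on an old-style class yields a bound method where a bool is expected.

class OldStyleUser:
    # method-style API, as written for Flask-Login < 0.3.0
    def is_authenticated(self):
        return True

class NewStyleUser:
    # property-style API, required by Flask-Login >= 0.3.0
    @property
    def is_authenticated(self):
        return True

old = OldStyleUser()
new = NewStyleUser()
# With the new API, a plain attribute read must already be a bool:
print(isinstance(new.is_authenticated, bool))  # True
print(isinstance(old.is_authenticated, bool))  # False: it's a bound method
```

This is why a user class written for one API range silently misbehaves (or fails) under the other, independently of the `flash` import error shown in the traceback.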
[jira] [Created] (AIRFLOW-1566) 'airflow.www.gunicorn_config' doesn't exist
dud created AIRFLOW-1566:
-----------------------------

             Summary: 'airflow.www.gunicorn_config' doesn't exist
                 Key: AIRFLOW-1566
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1566
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: dud
             Fix For: 1.8.2

Hello

Starting the Airflow v1.8.2 webserver yields this error:

{code}
$ airflow webserver --pid /run/airflow/webserver.pid -hn localhost
/home/airflow/.local/lib/python2.7/site-packages/airflow/configuration.py:540: DeprecationWarning: This method will be removed in future versions.  Use 'parser.read_file()' instead.
  self.readfp(StringIO.StringIO(string))
[2017-09-05 15:19:05,012] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 15:19:05,063] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 15:19:05,083] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to "CSRFProtect" and will be removed in 1.0.
  csrf = CsrfProtect()
(Airflow ASCII-art banner)
/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  from flask.ext.cache import make_template_fragment_key
[2017-09-05 15:19:05,387] [26152] {models.py:168} INFO - Filling up the DagBag from /home/airflow/dags
Running the Gunicorn Server with:
Workers: 4 sync
Host: localhost:8080
Timeout: 120
Logfiles: - -
Error: 'airflow.www.gunicorn_config' doesn't exist
{code}

The use of this file was introduced by this [commit|https://github.com/apache/incubator-airflow/commit/9d254a317dd54f555270ca568aff1cd0500e1e53] authored by [~xuanji]

dud

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (AIRFLOW-1565) Incompatibility with Flask-Login-0.4.0
dud created AIRFLOW-1565:
-----------------------------

             Summary: Incompatibility with Flask-Login-0.4.0
                 Key: AIRFLOW-1565
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1565
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: 1.8.2
            Reporter: dud
            Priority: Minor

Hello

I've noticed that Airflow v1.8.2 isn't playing well with flask_login v0.4.0:

{code}
$ pip install --user flask_login==0.4.0
Collecting flask_login==0.4.0
  Using cached Flask_Login-0.4.0-py2.py3-none-any.whl
Requirement already satisfied: Flask in ./.local/lib/python2.7/site-packages (from flask_login==0.4.0)
Requirement already satisfied: itsdangerous>=0.21 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Jinja2>=2.4 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: Werkzeug>=0.7 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: click>=2.0 in ./.local/lib/python2.7/site-packages (from Flask->flask_login==0.4.0)
Requirement already satisfied: MarkupSafe>=0.23 in ./.local/lib/python2.7/site-packages (from Jinja2>=2.4->Flask->flask_login==0.4.0)
Installing collected packages: flask-login
  Found existing installation: Flask-Login 0.3.2
    Uninstalling Flask-Login-0.3.2:
      Successfully uninstalled Flask-Login-0.3.2
Successfully installed flask-login-0.4.0

$ airflow webserver
[2017-09-05 14:30:50,130] {dag_processing.py:627} INFO - Started a process (PID: 22751) to generate tasks for /home/airflow/dags/metadsp/APB_process_table/airflow.py - logging into /var/log/airflow/scheduler/2017-09-05
[2017-09-05 14:30:50,133] {dag_processing.py:627} INFO - Started a process (PID: 22752) to generate tasks for /home/airflow/dags/metadsp/APN_segment_manager/segment_manager_factory.py - logging into /var/log/airflow/sc
[2017-09-05 14:30:50,134] {jobs.py:1443} INFO - Heartbeating the executor
[2017-09-05 14:30:50,192] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-05 14:30:50,252] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-09-05 14:30:50,273] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to "CSRFProtect" and will be removed in 1.0.
  csrf = CsrfProtect()
/home/airflow/.local/lib/python2.7/site-packages/flask_cache/jinja2ext.py:33: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
  from flask.ext.cache import make_template_fragment_key
(Airflow ASCII-art banner)
Traceback (most recent call last):
  File "/home/airflow/bin/airflow", line 28, in <module>
    args.func(args)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/bin/cli.py", line 734, in webserver
    app = cached_app(conf)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", line 161, in cached_app
    app = create_app(config)
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/app.py", line 60, in create_app
    from airflow.www import views
  File "/home/airflow/.local/lib/python2.7/site-packages/airflow/www/views.py", line 41, in <module>
    from flask_login import flash
ImportError: cannot import name flash

$ python -c "from flask_login import flash"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name flash
{code}

Downgrading to flask_login==0.3.2 fixes the issue.

dud

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (AIRFLOW-121) Documenting dag doc_md feature
[ https://issues.apache.org/jira/browse/AIRFLOW-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663602#comment-15663602 ]

dud commented on AIRFLOW-121:
-----------------------------

Hello.

I think you should file a new bug issue with an example DAG that triggers this bug. Please also specify which version you are running.

> Documenting dag doc_md feature
> ------------------------------
>
>                 Key: AIRFLOW-121
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-121
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: docs
>            Reporter: dud
>            Assignee: dud
>            Priority: Trivial
>
> Dear Airflow Maintainers,
> I added a note about DAG documentation.
> I'd be glad if my PR were merged:
> https://github.com/apache/incubator-airflow/pull/1493
> Regards
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AIRFLOW-469) Add MFG Labs as Airflow user
[ https://issues.apache.org/jira/browse/AIRFLOW-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-469:
------------------------
Description:

Hello.

We'd like to be added as official Airflow users.

Regards

GH PR: https://github.com/apache/incubator-airflow/pull/1768

was: Hello. We'd like to be added as official Airflow users. Regards

> Add MFG Labs as Airflow user
> ----------------------------
>
>                 Key: AIRFLOW-469
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-469
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: dud
>            Priority: Trivial
>
> Hello.
> We'd like to be added as official Airflow users.
> Regards
> GH PR: https://github.com/apache/incubator-airflow/pull/1768

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AIRFLOW-312) Airflow is not able to load hooks
[ https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364495#comment-15364495 ]

dud commented on AIRFLOW-312:
-----------------------------

Right now I was just fighting to get BigQueryToCloudStorageOperator working, and you just helped me out of this mess by pointing me to airflow[xxx]. Thanks again.

> Airflow is not able to load hooks
> ---------------------------------
>
>                 Key: AIRFLOW-312
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
>             Project: Apache Airflow
>          Issue Type: Improvement
>    Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>            Reporter: dud
>
> Hello
> I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in <module>
>     hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script:
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe boto should be added as a dependency?
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AIRFLOW-312) Airflow is not able to load hooks
[ https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364488#comment-15364488 ]

dud commented on AIRFLOW-312:
-----------------------------

OK, my bad. Thanks for your explanation.

> Airflow is not able to load hooks
> ---------------------------------
>
>                 Key: AIRFLOW-312
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
>             Project: Apache Airflow
>          Issue Type: Improvement
>    Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>            Reporter: dud
>
> Hello
> I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in <module>
>     hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script:
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe boto should be added as a dependency?
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AIRFLOW-312) Airflow is not able to load hooks
[ https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-312:
------------------------
    Priority: Major  (was: Blocker)
Description:

Hello

I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow
python /home/airflow/hooks.py
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in <module>
    hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:

{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id="test value")
{code}

Installing boto solves the issue. Not sure if there isn't a configuration issue on my system, however maybe boto should be added as a dependency?

dud

was: Hello I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook (same log and script as above). I tried with 1.7.1.2 and it fails as well. dud

  Issue Type: Improvement  (was: Bug)

> Airflow is not able to load hooks
> ---------------------------------
>
>                 Key: AIRFLOW-312
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
>             Project: Apache Airflow
>          Issue Type: Improvement
>    Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>            Reporter: dud
>
> Hello
> I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in <module>
>     hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script:
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> Installing boto solves the issue.
> Not sure if there isn't a configuration issue on my system, however maybe boto should be added as a dependency?
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AIRFLOW-312) Airflow is not able to load hooks
[ https://issues.apache.org/jira/browse/AIRFLOW-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-312:
------------------------
Description:

Hello

I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow
python /home/airflow/hooks.py
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in <module>
    hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:

{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id="test value")
{code}

I tried with 1.7.1.2 and it fails as well.

dud

was: Hello I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook (same log as above, with the script reading hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)). I tried with 1.7.1.2 and it fails as well. dud

> Airflow is not able to load hooks
> ---------------------------------
>
>                 Key: AIRFLOW-312
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>            Reporter: dud
>            Priority: Blocker
>
> Hello
> I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:
> {code}
> pip install --user -U pip && pip install --user -U airflow
> python /home/airflow/hooks.py
> [2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
> [2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> Traceback (most recent call last):
>   File "/home/airflow/hooks.py", line 2, in <module>
>     hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
> AttributeError: 'module' object has no attribute 'S3Hook'
> {code}
> Here is the content of this script:
> {code}
> from airflow import hooks
> hook = hooks.S3Hook(s3_conn_id="test value")
> {code}
> I tried with 1.7.1.2 and it fails as well.
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AIRFLOW-312) Airflow is not able to load hooks
dud created AIRFLOW-312:
---------------------------

             Summary: Airflow is not able to load hooks
                 Key: AIRFLOW-312
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-312
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
            Reporter: dud
            Priority: Blocker

Hello

I installed the latest version of Airflow via pip and I noticed that it is now unable to use any hook:

{code}
pip install --user -U pip && pip install --user -U airflow
python /home/airflow/hooks.py
[2016-07-04 16:30:31,973] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-07-04 16:30:32,044] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-07-04 16:30:32,064] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "/home/airflow/hooks.py", line 2, in <module>
    hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
AttributeError: 'module' object has no attribute 'S3Hook'
{code}

Here is the content of this script:

{code}
from airflow import hooks
hook = hooks.S3Hook(s3_conn_id=self.s3_conn_id)
{code}

I tried with 1.7.1.2 and it fails as well.

dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AIRFLOW-249) Refactor the SLA mechanism
[ https://issues.apache.org/jira/browse/AIRFLOW-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-249:
------------------------
    Summary: Refactor the SLA mechanism  (was: Refactor the SLA mecanism)

> Refactor the SLA mechanism
> --------------------------
>
>                 Key: AIRFLOW-249
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-249
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: dud
>            Assignee: dud
>
> Hello
> I've noticed the SLA feature currently behaves as follows:
> - it doesn't work on DAGs scheduled @once or None because they have no dag.following_schedule property
> - it keeps endlessly checking for SLA misses without ever worrying about any end_date. Worse, I noticed that emails are still being sent for runs that will never happen because of end_date
> - it keeps checking recent TIs even if an SLA notification has already been sent for them
> - the SLA logic only fires after following_schedule + sla has elapsed; in other words, one has to wait for the next TI before having a chance of getting any email. Also, the email reports the dag.following_schedule time (I guess because it is close to TI.start_date), but unfortunately that matches neither what the task instance shows nor the log filename
> - the SLA logic uses max(TI.execution_date) as the starting point of its checks, which means that for a DAG whose SLA is longer than its schedule period, if half of the TIs run longer than expected it will go unnoticed.
> This can be demonstrated with a DAG like this one:
> {code}
> from airflow import DAG
> from airflow.operators import *
> from datetime import datetime, timedelta
> from time import sleep
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 6, 16, 12, 20),
>     'email': my_email,
>     'sla': timedelta(minutes=2),
> }
>
> dag = DAG('unnoticed_sla', default_args=default_args, schedule_interval=timedelta(minutes=1))
>
> def alternating_sleep(**kwargs):
>     minute = kwargs['execution_date'].strftime("%M")
>     is_odd = int(minute) % 2
>     if is_odd:
>         sleep(300)
>     else:
>         sleep(10)
>     return True
>
> PythonOperator(
>     task_id='sla_miss',
>     python_callable=alternating_sleep,
>     provide_context=True,
>     dag=dag)
> {code}
> I've tried to rework the SLA triggering mechanism by addressing the above points; please [have a look at it|https://github.com/dud225/incubator-airflow/commit/972260354075683a8d55a1c960d839c37e629e7d]
> I made some tests with this patch:
> - the fluctuating DAG shown above no longer makes Airflow skip any SLA event:
> {code}
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
> ----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:05:00 | t          | 2016-06-16 15:08:26.058631 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:07:00 | t          | 2016-06-16 15:10:06.093253 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:09:00 | t          | 2016-06-16 15:12:06.241773 |             | t
> {code}
> - on a normal DAG, the SLA is triggered more quickly:
> {code}
> // start_date = 2016-06-16 15:55:00
> // end_date = 2016-06-16 16:00:00
> // schedule_interval = timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
> ----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:55:00 | t          | 2016-06-16 15:58:11.832299 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:56:00 | t          | 2016-06-16 15:59:09.663778 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:57:00 | t          | 2016-06-16 16:00:13.651422 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:58:00 | t          | 2016-06-16 16:01:08.576399 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:59:00 | t          | 2016-06-16 16:02:08.523486 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 16:00:00 | t          | 2016-06-16 16:03:08.538593 |             | t
> (6 rows)
> {code}
> than before (current master branch):
> {code}
> // start_date = 2016-06-16 15:40:00
> // end_date = 2016-06-16 15:45:00
> // schedule_interval = timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
[jira] [Created] (AIRFLOW-249) Refactor the SLA mechanism
dud created AIRFLOW-249: ---
Summary: Refactor the SLA mechanism
Key: AIRFLOW-249
URL: https://issues.apache.org/jira/browse/AIRFLOW-249
Project: Apache Airflow
Issue Type: Improvement
Reporter: dud

Hello

I've noticed that the SLA feature currently behaves as follows :
- it doesn't work on DAGs scheduled @once or None, because they have no dag.following_schedule property
- it keeps endlessly checking for SLA misses without ever worrying about any end_date. Worse, I noticed that emails are still being sent for runs that are never happening because of end_date
- it keeps checking recent TIs even if an SLA notification has already been sent for them
- the SLA logic is only fired after following_schedule + sla has elapsed; in other words one has to wait for the next TI before having any chance of getting an email. Also the email reports the dag.following_schedule time (I guess because it is close to TI.start_date), but unfortunately that matches neither what the task instance shows nor the log filename
- the SLA logic uses max(TI.execution_date) as the starting point of its checks, which means that for a DAG whose SLA is longer than its schedule period, if half of the TIs run longer than expected it will go unnoticed.
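The last bullet can be made concrete with a toy model (plain Python, no Airflow required; the variable names and the windowing rule are mine and deliberately simplify what the scheduler actually does):

```python
from datetime import datetime, timedelta

# One run per minute; odd minutes take 300 s, even minutes 10 s,
# mirroring the alternating DAG below. The SLA is 2 minutes.
schedule = [datetime(2016, 6, 16, 15, m) for m in range(10)]
sla = timedelta(minutes=2)

def duration(ed):
    return timedelta(seconds=300 if ed.minute % 2 else 10)

# Runs that actually blew the SLA: every odd minute.
actual_misses = [ed for ed in schedule if duration(ed) > sla]

# A check anchored on max(execution_date) of finished runs only looks
# forward from the latest completed run, so older slow runs that are
# still in flight are never inspected.
now = datetime(2016, 6, 16, 15, 10)
finished = [ed for ed in schedule if ed + duration(ed) <= now]
checked_from = max(finished)
seen_by_check = [ed for ed in actual_misses if ed >= checked_from]

print(len(actual_misses), len(seen_by_check))  # 5 actual misses, 1 seen
```

Under this simplified model, four of the five slow runs would never be reported, which is the kind of skipped SLA event the patch below addresses.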
This could be demonstrated with a DAG like this one :
{code}
from airflow import DAG
from airflow.operators import *
from datetime import datetime, timedelta
from time import sleep

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 6, 16, 12, 20),
    'email': my_email,
    'sla': timedelta(minutes=2),
}

dag = DAG('unnoticed_sla', default_args=default_args,
          schedule_interval=timedelta(minutes=1))

def alternating_sleep(**kwargs):
    minute = kwargs['execution_date'].strftime("%M")
    is_odd = int(minute) % 2
    if is_odd:
        sleep(300)
    else:
        sleep(10)
    return True

PythonOperator(
    task_id='sla_miss',
    python_callable=alternating_sleep,
    provide_context=True,
    dag=dag)
{code}
I've tried to rework the SLA triggering mechanism by addressing the above points; please [have a look at it|https://github.com/dud225/incubator-airflow/commit/972260354075683a8d55a1c960d839c37e629e7d]
I made some tests with this patch :
- the fluctuating DAG shown above no longer makes Airflow skip any SLA event :
{code}
 task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
 sla_miss | dag_sla_miss1 | 2016-06-16 15:05:00 | t          | 2016-06-16 15:08:26.058631 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:07:00 | t          | 2016-06-16 15:10:06.093253 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:09:00 | t          | 2016-06-16 15:12:06.241773 |             | t
{code}
- on a normal DAG, the SLA is being triggered more quickly :
{code}
// start_date = 2016-06-16 15:55:00
// end_date = 2016-06-16 16:00:00
// schedule_interval = timedelta(minutes=1)
// sla = timedelta(minutes=2)
 task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
 sla_miss | dag_sla_miss1 | 2016-06-16 15:55:00 | t          | 2016-06-16 15:58:11.832299 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:56:00 | t          | 2016-06-16 15:59:09.663778 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:57:00 | t          | 2016-06-16 16:00:13.651422 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:58:00 | t          | 2016-06-16 16:01:08.576399 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:59:00 | t          | 2016-06-16 16:02:08.523486 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 16:00:00 | t          | 2016-06-16 16:03:08.538593 |             | t
(6 rows)
{code}
than before (current master branch) :
{code}
// start_date = 2016-06-16 15:40:00
// end_date = 2016-06-16 15:45:00
// schedule_interval = timedelta(minutes=1)
// sla = timedelta(minutes=2)
 task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
 sla_miss | dag_sla_miss1 | 2016-06-16 15:41:00 | t          | 2016-06-16 15:44:30.305287 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:42:00 | t          | 2016-06-16 15:45:35.372118 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16 15:43:00 | t          | 2016-06-16 15:46:30.415744 |             | t
 sla_miss | dag_sla_miss1 | 2016-06-16
[jira] [Resolved] (AIRFLOW-140) DagRun state not updated
[ https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dud resolved AIRFLOW-140. - Resolution: Fixed
This issue has been solved by commit [b18c9959142f3f1e2cb031c8709225af01192e32|https://github.com/apache/incubator-airflow/commit/b18c9959142f3f1e2cb031c8709225af01192e32]. [~bolke] many thanks for your work :)
[jira] [Created] (AIRFLOW-223) Make parametrable the IP on which Flower binds to
dud created AIRFLOW-223: ---
Summary: Make parametrable the IP on which Flower binds to
Key: AIRFLOW-223
URL: https://issues.apache.org/jira/browse/AIRFLOW-223
Project: Apache Airflow
Issue Type: Improvement
Components: celery, webserver
Reporter: dud
Priority: Minor

Hello
I'd be glad if Flower could be set to bind to localhost only, as can already be done for Gunicorn.
dud
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (AIRFLOW-140) DagRun state not updated
[ https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292963#comment-15292963 ] dud edited comment on AIRFLOW-140 at 5/23/16 9:49 AM:
Hello. I tried with the LocalExecutor as requested and I observed the same behaviour :
{code}
airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY execution_date ; SELECT * FROM dag_run WHERE dag_id = :dag_id ; SELECT * FROM job ORDER BY start_date DESC LIMIT 5;
 task_id   | dag_id             | execution_date      | start_date                 | end_date                   | duration  | state   | try_number | hostname  | unixname | job_id | pool | queue   | priority_weight | operator       | queued_dttm
-----------+--------------------+---------------------+----------------------------+----------------------------+-----------+---------+------------+-----------+----------+--------+------+---------+-----------------+----------------+-------------
 alt_sleep | dagrun_not_updated | 2016-05-20 07:45:00 | 2016-05-20 07:46:54.372843 |                            |           | running |          1 | localhost | airflow  |   3203 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:46:00 | 2016-05-20 07:47:19.317705 | 2016-05-20 07:47:29.453316 | 10.135611 | success |          1 | localhost | airflow  |   3204 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:47:00 | 2016-05-20 07:48:01.724885 |                            |           | running |          1 | localhost | airflow  |   3205 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-20 07:48:00 | 2016-05-20 07:49:12.031225 | 2016-05-20 07:49:22.083763 | 10.052538 | success |          1 | localhost | airflow  |   3206 |      | default |               1 | PythonOperator |
(4 rows)

  id  | dag_id             | execution_date      | state   | run_id                         | external_trigger | conf | end_date | start_date
------+--------------------+---------------------+---------+--------------------------------+------------------+------+----------+----------------------------
 1485 | dagrun_not_updated | 2016-05-20 07:45:00 | running | scheduled__2016-05-20T07:45:00 | f                |      |          | 2016-05-20 07:46:38.30924
 1486 | dagrun_not_updated | 2016-05-20 07:46:00 | running | scheduled__2016-05-20T07:46:00 | f                |      |          | 2016-05-20 07:47:01.563541
 1487 | dagrun_not_updated | 2016-05-20 07:47:00 | running | scheduled__2016-05-20T07:47:00 | f                |      |          | 2016-05-20 07:48:00.016718
 1488 | dagrun_not_updated | 2016-05-20 07:48:00 | running | scheduled__2016-05-20T07:48:00 | f                |      |          | 2016-05-20 07:49:00.203204
(4 rows)

  id  | dag_id | state   | job_type     | start_date                 | end_date                   | latest_heartbeat           | executor_class | hostname  | unixname
------+--------+---------+--------------+----------------------------+----------------------------+----------------------------+----------------+-----------+----------
 3206 |        | success | LocalTaskJob | 2016-05-20 07:49:08.691714 | 2016-05-20 07:49:23.706144 | 2016-05-20 07:49:08.691725 | LocalExecutor  | localhost | airflow
 3205 |        | running | LocalTaskJob | 2016-05-20 07:48:01.155988 |                            | 2016-05-20 07:50:51.312164 | LocalExecutor  | localhost | airflow
 3204 |        | success | LocalTaskJob | 2016-05-20 07:47:16.153078 | 2016-05-20 07:47:31.168997 | 2016-05-20 07:47:16.153091 | LocalExecutor  | localhost | airflow
 3203 |        | running | LocalTaskJob | 2016-05-20 07:46:48.198379 |                            | 2016-05-20 07:50:53.42636  | LocalExecutor  | localhost | airflow
 3202 |        | running | SchedulerJob | 2016-05-20 07:45:31.43799  |                            | 2016-05-20 07:50:55.061958 | LocalExecutor  | localhost | airflow
{code}
Extract of database logs :
{code}
2016-05-20 07:47:31 UTC [24003-36] airflow@airflow LOG: duration: 38.731 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:47:31.168997'::timestamp, latest_heartbeat='2016-05-20T07:47:16.153091'::timestamp WHERE job.id = 3204
2016-05-20 07:49:23 UTC [24107-36] airflow@airflow LOG: duration: 0.179 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:49:23.706144'::timestamp, latest_heartbeat='2016-05-20T07:49:08.691725'::timestamp WHERE job.id = 3206
2016-05-20 07:52:03 UTC [23971-336] airflow@airflow LOG: duration: 0.291 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:52:03.526927'::timestamp, latest_heartbeat='2016-05-20T07:46:48.198389'::timestamp WHERE job.id = 3203
2016-05-20 07:53:06 UTC [24047-326] airflow@airflow LOG: duration: 0.179
[jira] [Commented] (AIRFLOW-140) DagRun state not updated
[ https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291115#comment-15291115 ] dud commented on AIRFLOW-140: - Latest version from Git (8ae86814b3cf5cd65e3c6b04c6c700139d8cc20b)
[jira] [Created] (AIRFLOW-140) DagRun state not updated
dud created AIRFLOW-140: ---
Summary: DagRun state not updated
Key: AIRFLOW-140
URL: https://issues.apache.org/jira/browse/AIRFLOW-140
Project: Apache Airflow
Issue Type: Bug
Environment: Airflow latest Git version
Reporter: dud
Priority: Minor

Hello

I've noticed a strange behaviour : when launching a DAG whose task execution duration is alternately faster and slower, the DagRun state is only updated once all previous DagRuns have ended.

Here is a DAG that can trigger this behaviour :
{code}
from airflow import DAG
from airflow.operators import *
from datetime import datetime, timedelta
from time import sleep

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 19, 10, 15),
    'end_date': datetime(2016, 5, 19, 10, 20),
}

dag = DAG('dagrun_not_updated', default_args=default_args,
          schedule_interval=timedelta(minutes=1))

def alternating_sleep(**kwargs):
    minute = kwargs['execution_date'].strftime("%M")
    is_odd = int(minute) % 2
    if is_odd:
        sleep(300)
    else:
        sleep(10)
    return True

PythonOperator(
    task_id='alt_sleep',
    python_callable=alternating_sleep,
    provide_context=True,
    dag=dag)
{code}
When this operator is executed, being run at an even minute makes the TI run faster than at an odd one.
I'm observing the following behaviour :
- after some time, the second DagRun is still in running state despite having ended a while ago :
{code}
airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY execution_date ; SELECT * FROM dag_run WHERE dag_id = :dag_id ;
 task_id   | dag_id             | execution_date      | start_date                 | end_date                   | duration  | state   | try_number | hostname  | unixname | job_id | pool | queue   | priority_weight | operator       | queued_dttm
-----------+--------------------+---------------------+----------------------------+----------------------------+-----------+---------+------------+-----------+----------+--------+------+---------+-----------------+----------------+-------------
 alt_sleep | dagrun_not_updated | 2016-05-19 10:15:00 | 2016-05-19 10:17:19.039565 |                            |           | running |          1 | localhost | airflow  |   3196 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-19 10:16:00 | 2016-05-19 10:17:23.698928 | 2016-05-19 10:17:33.823066 | 10.124138 | success |          1 | localhost | airflow  |   3197 |      | default |               1 | PythonOperator |
 alt_sleep | dagrun_not_updated | 2016-05-19 10:17:00 | 2016-05-19 10:18:03.025546 |                            |           | running |          1 | localhost | airflow  |   3198 |      | default |               1 | PythonOperator |
(3 rows)

  id  | dag_id             | execution_date      | state   | run_id                         | external_trigger | conf | end_date | start_date
------+--------------------+---------------------+---------+--------------------------------+------------------+------+----------+----------------------------
 1479 | dagrun_not_updated | 2016-05-19 10:15:00 | running | scheduled__2016-05-19T10:15:00 | f                |      |          | 2016-05-19 10:17:06.563842
 1480 | dagrun_not_updated | 2016-05-19 10:16:00 | running | scheduled__2016-05-19T10:16:00 | f                |      |          | 2016-05-19 10:17:12.188781
 1481 | dagrun_not_updated | 2016-05-19 10:17:00 | running | scheduled__2016-05-19T10:17:00 | f                |      |          | 2016-05-19 10:18:01.550625
(3 rows)
{code}
- after some time, all reportedly still running DagRuns are marked as successful at the same time :
{code}
2016-05-19 10:23:11 UTC [12073-18] airflow@airflow LOG: duration: 0.168 ms statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1479
2016-05-19 10:23:11 UTC [12073-19] airflow@airflow LOG: duration: 0.106 ms statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1480
2016-05-19 10:23:11 UTC [12073-20] airflow@airflow LOG: duration: 0.083 ms statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1481
2016-05-19 10:23:11 UTC [12073-21] airflow@airflow LOG: duration: 0.081 ms statement: UPDATE dag_run SET state='success' WHERE dag_run.id = 1482
{code}
So it waited till the 4th DagRun ended to update the dag_run table.
I've looked at the code and I'm not sure whether the issue lies in Airflow, as the scheduler properly runs the code that updates the state to success :
{code}
May 19 10:17:36 airflow-ec2 airflow-scheduler[11543]: [2016-05-19 10:17:36,542] {models.py:2725} INFO - Marking run successful
May 19 10:17:41 airflow-ec2 airflow-scheduler[11543]: [2016-05-19 10:17:41,666] {models.py:2725} INFO - Marking run successful
May 19 10:17:51 airflow-ec2 airflow-scheduler[11543]: [2016-05-19
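The parity trick in alternating_sleep is easy to verify in isolation: odd execution minutes sleep 300 s (five schedule intervals), even minutes only 10 s, which is what leaves the odd-minute DagRuns still running while a later even-minute run finishes. A standalone check (plain Python, no Airflow needed):

```python
from datetime import datetime

def sleep_seconds(execution_date):
    # Same parity test as the DAG's alternating_sleep callable.
    minute = execution_date.strftime("%M")
    return 300 if int(minute) % 2 else 10

# The scheduled runs from the report above.
runs = [datetime(2016, 5, 19, 10, m) for m in (15, 16, 17, 18)]
print([sleep_seconds(r) for r in runs])  # [300, 10, 300, 10]
```

With a 1-minute schedule_interval, the 10:15 and 10:17 runs are still sleeping when the 10:16 run completes, matching the task_instance table above, where only the 10:16 TI has an end_date.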
[jira] [Updated] (AIRFLOW-127) Makes filter_by_owner aware of multi-owner DAG
[ https://issues.apache.org/jira/browse/AIRFLOW-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dud updated AIRFLOW-127:
Description:
Hello
When enabling the _webserver.filter_by_owner_ setting, Airflow issues a request like this one :
{code}
SELECT * FROM dag WHERE NOT dag.is_subdag AND dag.is_active AND dag.owners = username
{code}
Unfortunately that doesn't account for DAGs authored by multiple users. Please see my PR : https://github.com/apache/incubator-airflow/pull/1508
dud
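The failure mode is that the owners column holds a comma-separated list of names, so an equality test only matches single-owner DAGs. A minimal sketch of the difference (plain Python standing in for the SQL predicate; the function names are illustrative, not Airflow's):

```python
def matches_equality(owners_field, username):
    # What the current query does: exact string comparison.
    return owners_field == username

def matches_multi_owner(owners_field, username):
    # Multi-owner aware: split the comma-separated owners list.
    return username in (o.strip() for o in owners_field.split(","))

print(matches_equality("alice, bob", "bob"))     # False: the list never equals one name
print(matches_multi_owner("alice, bob", "bob"))  # True
```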
[jira] [Updated] (AIRFLOW-121) Documenting dag doc_md feature
[ https://issues.apache.org/jira/browse/AIRFLOW-121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dud updated AIRFLOW-121:
Description:
Dear Airflow Maintainers,
I added a note about the DAG doc_md documentation feature. I'd be glad if my PR were merged : https://github.com/apache/incubator-airflow/pull/1493
Regards
dud
[jira] [Updated] (AIRFLOW-106) email_on_failure not being triggered
[ https://issues.apache.org/jira/browse/AIRFLOW-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dud updated AIRFLOW-106:
Description:
Hello.
I created the following workflow :
{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email':
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'
dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}
I forgot to declare test_var, so when this DAG launched it failed quickly. However no failure email was ever sent. Clearing the failed task to make it rerun doesn't trigger any email.
Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - Starting attempt 1 of 2
[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not exist
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py", line 1265, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/operators/python_operator.py", line 66, in execute
    return_value = self.python_callable(*self.op_args, **self.op_kwargs)
  File "/var/lib/airflow/airflow/dags/test.py", line 31, in sleep1_function
    return Variable.get('test_var')
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py", line 3145, in get
    raise ValueError('Variable {} does not exist'.format(key))
ValueError: Variable test_var does not exist
[2016-05-11 15:55:03,581] {models.py:1318} INFO - Marking task as UP_FOR_RETRY
[2016-05-11 15:55:03,759] {models.py:1347} ERROR - Variable test_var does not exist
{code}
In the DAG Runs page, the workflow is set as failed. In the task instance page, it is set as up_for_retry, but no new run is ever scheduled. I tried incrementing the retries parameter, but nothing different happens; Airflow never retries after the first run.
dud

  was:
Hello.
I created the following workflow:
{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email': [...],  # address elided by the list archive
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'
dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}
I forgot to declare test_var, so when this DAG launched it failed quickly. However, no failure email was ever sent. Clearing the failed task to make it rerun doesn't trigger any email either.
Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - Starting attempt 1 of 2
[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not exist
Traceback (most recent call last):
  File
{code}
[jira] [Created] (AIRFLOW-106) email_on_failure isn't being triggered
dud created AIRFLOW-106:
----------------------------
             Summary: email_on_failure isn't being triggered
                 Key: AIRFLOW-106
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-106
             Project: Apache Airflow
          Issue Type: Bug
         Environment: Latest version from Git
            Reporter: dud

Hello.
I created the following workflow:
{code}
from airflow import DAG
from airflow.operators import PythonOperator
from datetime import datetime, timedelta
from airflow.models import Variable
from time import sleep

default_args = {
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 11, 15, 20),
    'email': [...],  # address elided by the list archive
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=2),
    'end_date': datetime(2016, 5, 11, 16, 00),
}

PARENT_DAG_NAME = 'test'
dag = DAG(PARENT_DAG_NAME, default_args=default_args,
          schedule_interval=timedelta(minutes=10))

def sleep1_function(**kwargs):
    sleep(90)
    return Variable.get('test_var')

sleep1 = PythonOperator(
    task_id='sleep1',
    python_callable=sleep1_function,
    dag=dag)
{code}
I forgot to declare test_var, so when this DAG launched it failed quickly. However, no failure email was ever sent. Clearing the failed task to make it rerun doesn't trigger any email either.
Here are the logs:
{code}
[2016-05-11 15:53:31,784] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,272] {models.py:157} INFO - Filling up the DagBag from /var/lib/airflow/airflow/dags/test.py
[2016-05-11 15:53:32,313] {models.py:1216} INFO - Starting attempt 1 of 2
[2016-05-11 15:53:32,333] {models.py:1239} INFO - Executing on 2016-05-11 15:20:00
[2016-05-11 15:55:03,450] {models.py:1306} ERROR - Variable test_var does not exist
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py", line 1265, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/operators/python_operator.py", line 66, in execute
    return_value = self.python_callable(*self.op_args, **self.op_kwargs)
  File "/var/lib/airflow/airflow/dags/test.py", line 31, in sleep1_function
    return Variable.get('test_var')
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.7.0-py2.7.egg/airflow/models.py", line 3145, in get
    raise ValueError('Variable {} does not exist'.format(key))
ValueError: Variable test_var does not exist
[2016-05-11 15:55:03,581] {models.py:1318} INFO - Marking task as UP_FOR_RETRY
[2016-05-11 15:55:03,759] {models.py:1347} ERROR - Variable test_var does not exist
{code}
In the DAG Runs page, the workflow is set as failed. In the task instance page, it is set as up_for_retry, but no new run is ever scheduled. I tried incrementing the retries parameter, but nothing different happens; Airflow never retries after the first run.
dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
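[Editor's note] The report above describes a task stuck in up_for_retry with no retry and no email. For discussion, the intended behaviour can be sketched as a small pure-Python decision function; this is a simplified model built from the flags in default_args (retries, email_on_retry, email_on_failure), not Airflow's actual implementation:

```python
from datetime import datetime, timedelta

def on_task_failure(try_number, max_retries, retry_delay,
                    email_on_retry, email_on_failure, now=None):
    """Simplified model of intended failure handling: while attempts
    remain, schedule a retry after retry_delay and email only if
    email_on_retry; once retries are exhausted, fail and email only if
    email_on_failure."""
    now = now or datetime.utcnow()
    if try_number <= max_retries:
        # Attempts remain: the task should become up_for_retry and be
        # rescheduled, which is the step the reporter never observes.
        return {
            'state': 'up_for_retry',
            'next_try_at': now + retry_delay,
            'send_email': email_on_retry,
        }
    # Retries exhausted: final failure; this is when email_on_failure
    # should trigger the notification the reporter never receives.
    return {'state': 'failed', 'send_email': email_on_failure}

# Flags mirror the DAG above: retries=1, email_on_retry=False,
# email_on_failure=True.
first = on_task_failure(1, 1, timedelta(minutes=2), False, True)
second = on_task_failure(2, 1, timedelta(minutes=2), False, True)
```

Under this model, the first failure yields up_for_retry with no email, and the second yields failed with a failure email; the bug is that neither the reschedule nor the final email happens.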
[jira] [Commented] (AIRFLOW-39) DagRun scheduled after end_date
[ https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273774#comment-15273774 ]

dud commented on AIRFLOW-39:
----------------------------
Great news. Thanks.

> DagRun scheduled after end_date
> -------------------------------
>
>                 Key: AIRFLOW-39
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-39
>             Project: Apache Airflow
>          Issue Type: Wish
>    Affects Versions: Airflow 1.7.0
>         Environment: - Airflow version: v1.7.0
> - Airflow components: webserver and scheduler with a postgres database and LocalExecutor
> - Python Version: 2.7.9
>            Reporter: dud
>            Assignee: Siddharth Anand
>            Priority: Minor
>         Attachments: 1_7_0_dag_runs_past_date_running.png, Head_master_dag_runs_past_date_failed.png, Jobs_view_only_shows_runs_within_start_end_range.png, Task_instance_view_only_shows_TIs_within_start_end_range.png, Tree_view_only_shows_runs_within_start_end_range.png, tutorial4.png, tutorial4_grpah.png
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as intended. However the end_date parameter is not taken into account when creating a DagRun, so DagRuns scheduled after end_date are displayed in the web interface and stay in the running state forever, as the scheduler doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AIRFLOW-39) DagRun scheduled after end_date
[ https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dud updated AIRFLOW-39:
-----------------------
    Attachment: tutorial4_grpah.png
                tutorial4.png

> DagRun scheduled after end_date
> -------------------------------
>
>                 Key: AIRFLOW-39
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-39
>             Project: Apache Airflow
>          Issue Type: Wish
>         Environment: - Airflow version: v1.7.0
> - Airflow components: webserver and scheduler with a postgres database and LocalExecutor
> - Python Version: 2.7.9
>            Reporter: dud
>            Assignee: Siddharth Anand
>            Priority: Minor
>         Attachments: tutorial4.png, tutorial4_grpah.png
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as intended. However the end_date parameter is not taken into account when creating a DagRun, so DagRuns scheduled after end_date are displayed in the web interface and stay in the running state forever, as the scheduler doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AIRFLOW-39) DagRun scheduled after end_date
[ https://issues.apache.org/jira/browse/AIRFLOW-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271009#comment-15271009 ]

dud commented on AIRFLOW-39:
----------------------------
{code}
from airflow import DAG
from airflow.operators import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 5, 4, 16, 40),
    'end_date': datetime(2016, 5, 4, 16, 50),
    'email': ['airf...@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
}

dag = DAG('tutorial4', default_args=default_args,
          schedule_interval=timedelta(minutes=5))

# t1, t2 and t3 are examples of tasks created by instantiating operators
t1 = BashOperator(
    task_id='print_date',
    bash_command='date',
    dag=dag)

t2 = BashOperator(
    task_id='sleep',
    bash_command='sleep 5',
    retries=3,
    dag=dag)

templated_command = """
{% for i in range(5) %}
    echo "{{ ds }}"
    echo "{{ macros.ds_add(ds, 7)}}"
    echo "{{ params.my_param }}"
{% endfor %}
"""

t3 = BashOperator(
    task_id='templated',
    bash_command=templated_command,
    params={'my_param': 'Parameter I passed in'},
    dag=dag)

t2.set_upstream(t1)
t3.set_upstream(t1)
{code}
The Tree view just shows runs between 16:40 and 16:50. Please note that I tried with the latest version from GitHub.

> DagRun scheduled after end_date
> -------------------------------
>
>                 Key: AIRFLOW-39
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-39
>             Project: Apache Airflow
>          Issue Type: Wish
>         Environment: - Airflow version: v1.7.0
> - Airflow components: webserver and scheduler with a postgres database and LocalExecutor
> - Python Version: 2.7.9
>            Reporter: dud
>            Assignee: Siddharth Anand
>            Priority: Minor
>
> Hello
> When I write a new DAG, it is automatically picked up by the scheduler as intended. However the end_date parameter is not taken into account when creating a DagRun, so DagRuns scheduled after end_date are displayed in the web interface and stay in the running state forever, as the scheduler doesn't launch them.
> From my reading of the sources, the end_date parameter is not used when creating a DagRun.
> Is there any reason for this behaviour?
> Regards
> dud

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
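[Editor's note] The requested fix amounts to stopping DagRun creation once end_date is passed. A minimal sketch of that schedule filtering, assuming a fixed timedelta interval as in tutorial4 above (a simplified model for discussion, not the scheduler's actual code):

```python
from datetime import datetime, timedelta

def scheduled_run_dates(start_date, end_date, interval):
    """Yield execution dates at a fixed interval from start_date,
    stopping at end_date so no run is created past the DAG's end_date."""
    current = start_date
    while end_date is None or current <= end_date:
        yield current
        current += interval

# Dates taken from the tutorial4 DAG in the comment above.
runs = list(scheduled_run_dates(
    datetime(2016, 5, 4, 16, 40),   # start_date
    datetime(2016, 5, 4, 16, 50),   # end_date
    timedelta(minutes=5)))          # schedule_interval
# Yields 16:40, 16:45 and 16:50, matching the Tree view the reporter
# describes; without the end_date check, runs would keep accumulating.
```

The bug report is equivalent to saying the scheduler behaved as if `end_date is None` in the loop condition above.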