[jira] [Created] (AIRFLOW-578) BaseJob does not check return code of a process

2016-10-17 Thread Ze Wang (JIRA)
Ze Wang created AIRFLOW-578:
---

 Summary: BaseJob does not check return code of a process
 Key: AIRFLOW-578
 URL: https://issues.apache.org/jira/browse/AIRFLOW-578
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Reporter: Ze Wang


BaseJob ignores the return code of the spawned process, so even when that
process is killed or exits abnormally, the job is still treated as having
finished successfully.
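
A minimal illustrative sketch (not the actual BaseJob code; the helper name and
command are hypothetical) of the kind of check being asked for here - the runner
should inspect the child's exit status and fail the job instead of assuming
success:

import subprocess

def run_job_process(cmd):
    # Spawn the job's child process and wait for it to finish.
    proc = subprocess.Popen(cmd)
    returncode = proc.wait()
    # A non-zero code means the process failed; a negative code means it was
    # killed by a signal (e.g. -9 for SIGKILL). Either way, surface the
    # failure instead of reporting success.
    if returncode != 0:
        raise RuntimeError('Job process exited with return code %s' % returncode)

With a check like this in place, a job whose subprocess is killed raises an
error instead of being marked successful.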




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-17 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-575.

Resolution: Fixed

> Improve tutorial information about default_args
> ---
>
> Key: AIRFLOW-575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Laura Lorenz
>Assignee: Laura Lorenz
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582719#comment-15582719
 ] 

ASF subversion and git services commented on AIRFLOW-575:
-

Commit 80d3c8d461f1c95d173aa72a055737d8ad379ae1 in incubator-airflow's branch 
refs/heads/master from [~lauralorenz]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=80d3c8d ]

[AIRFLOW-575] Clarify tutorial and FAQ about `schedule_interval` always 
inheriting from DAG object

- Update the tutorial with a comment helping to explain the use of
default_args and include all the possible parameters in line
- Clarify in the FAQ the possibility of an unexpected default
`schedule_interval` in case airflow users mistakenly try to overwrite the
default `schedule_interval` in a DAG's `default_args` parameter
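
A short sketch of the behaviour this change documents (tutorial-style imports;
the DAG ids are made up for illustration): a schedule_interval placed in
default_args is ignored, because default_args only feeds task parameters and
tasks never override their parent DAG's schedule_interval, so the interval has
to be passed to the DAG constructor directly.

from datetime import datetime, timedelta
from airflow import DAG

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2016, 1, 1),
    # Has no effect on the DAG: default_args is only handed to tasks.
    'schedule_interval': timedelta(hours=1),
}

# Falls back to the default schedule_interval of one day (timedelta(1)).
daily_dag = DAG('example_default_interval', default_args=default_args)

# The interval must be given to the DAG object itself to take effect.
hourly_dag = DAG('example_hourly', default_args=default_args,
                 schedule_interval=timedelta(hours=1))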


> Improve tutorial information about default_args
> ---
>
> Key: AIRFLOW-575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Laura Lorenz
>Assignee: Laura Lorenz
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-airflow git commit: [AIRFLOW-575] Clarify tutorial and FAQ about `schedule_interval` always inheriting from DAG object

2016-10-17 Thread arthur
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 0235d59d0 -> 916f1eb2f


[AIRFLOW-575] Clarify tutorial and FAQ about `schedule_interval` always 
inheriting from DAG object

- Update the tutorial with a comment helping to explain the use of
default_args and include all the possible parameters in line
- Clarify in the FAQ the possibility of an unexpected default
`schedule_interval` in case airflow users mistakenly try to overwrite the
default `schedule_interval` in a DAG's `default_args` parameter


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/80d3c8d4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/80d3c8d4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/80d3c8d4

Branch: refs/heads/master
Commit: 80d3c8d461f1c95d173aa72a055737d8ad379ae1
Parents: 11ad53a
Author: lauralorenz 
Authored: Tue Apr 19 17:03:46 2016 -0400
Committer: lauralorenz 
Committed: Mon Oct 17 12:36:38 2016 -0400

--
 airflow/example_dags/tutorial.py | 14 --
 docs/faq.rst |  5 +
 2 files changed, 17 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/80d3c8d4/airflow/example_dags/tutorial.py
--
diff --git a/airflow/example_dags/tutorial.py b/airflow/example_dags/tutorial.py
index 9462463..e929389 100644
--- a/airflow/example_dags/tutorial.py
+++ b/airflow/example_dags/tutorial.py
@@ -10,6 +10,8 @@ from datetime import datetime, timedelta
 seven_days_ago = datetime.combine(datetime.today() - timedelta(7),
   datetime.min.time())
 
+# these args will get passed on to each operator
+# you can override them on a per-task basis during operator initialization
 default_args = {
 'owner': 'airflow',
 'depends_on_past': False,
@@ -22,11 +24,19 @@ default_args = {
 # 'queue': 'bash_queue',
 # 'pool': 'backfill',
 # 'priority_weight': 10,
-# 'schedule_interval': timedelta(1),
 # 'end_date': datetime(2016, 1, 1),
+# 'wait_for_downstream': False,
+# 'dag': dag,
+# 'adhoc':False,
+# 'sla': timedelta(hours=2),
+# 'execution_timeout': timedelta(seconds=300),
+# 'on_failure_callback': some_function,
+# 'on_success_callback': some_other_function,
+# 'on_retry_callback': another_function,
+# 'trigger_rule': u'all_success'
 }
 
-dag = DAG('tutorial', default_args=default_args)
+dag = DAG('tutorial', default_args=default_args, schedule_interval=timedelta(days=1))
 
 # t1, t2 and t3 are examples of tasks created by instantiating operators
 t1 = BashOperator(

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/80d3c8d4/docs/faq.rst
--
diff --git a/docs/faq.rst b/docs/faq.rst
index 6418dcb..b5b28af 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -17,6 +17,11 @@ Here are some of the common causes:
 - Is your ``start_date`` set properly? The Airflow scheduler triggers the
   task soon after the ``start_date + scheduler_interval`` is passed.
 
+- Is your ``schedule_interval`` set properly? The default ``schedule_interval``
+  is one day (``datetime.timedelta(1)``). You must specify a different ``schedule_interval``
+  directly to the DAG object you instantiate, not as a ``default_param``, as task instances
+  do not override their parent DAG's ``schedule_interval``.
+
 - Is your ``start_date`` beyond where you can see it in the UI? If you
   set your ``start_date`` to some time say 3 months ago, you won't be able to see
   it in the main view in the UI, but you should be able to see it in the



[2/2] incubator-airflow git commit: Merge pull request #1402 from lauralorenz/schedule_interval_default_args_docs

2016-10-17 Thread arthur
Merge pull request #1402 from lauralorenz/schedule_interval_default_args_docs


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/916f1eb2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/916f1eb2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/916f1eb2

Branch: refs/heads/master
Commit: 916f1eb2feedae4f4d827466cfe91821ef30f885
Parents: 0235d59 80d3c8d
Author: Arthur Wiedmer 
Authored: Mon Oct 17 09:46:57 2016 -0700
Committer: Arthur Wiedmer 
Committed: Mon Oct 17 09:46:57 2016 -0700

--
 airflow/example_dags/tutorial.py | 14 --
 docs/faq.rst |  5 +
 2 files changed, 17 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/916f1eb2/airflow/example_dags/tutorial.py
--

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/916f1eb2/docs/faq.rst
--



[jira] [Updated] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-17 Thread Laura Lorenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laura Lorenz updated AIRFLOW-575:
-
External issue URL: https://github.com/apache/incubator-airflow/pull/1402

> Improve tutorial information about default_args
> ---
>
> Key: AIRFLOW-575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Laura Lorenz
>Assignee: Laura Lorenz
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-17 Thread Laura Lorenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-575 started by Laura Lorenz.

> Improve tutorial information about default_args
> ---
>
> Key: AIRFLOW-575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Laura Lorenz
>Assignee: Laura Lorenz
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (AIRFLOW-577) BigQuery Hook failure message too opaque

2016-10-17 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-577.
---
   Resolution: Fixed
Fix Version/s: Airflow 1.8

Merged!

> BigQuery Hook failure message too opaque
> 
>
> Key: AIRFLOW-577
> URL: https://issues.apache.org/jira/browse/AIRFLOW-577
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Georg Walther
> Fix For: Airflow 1.8
>
>
> The BigQuery service routinely returns opaque error messages such as "Too 
> many errors ..." - the Airflow BigQuery hook returns this opaque error 
> message by accessing the respective keys in the job dictionary:
> "job['status']['errorResult']"
> When debugging BigQuery issues in Airflow we routinely need to try and step 
> into the BigQuery hook to inspect the job dictionary for further hints at 
> what caused the error. Therefore it would help to output the BigQuery hook 
> job dictionary in the first place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


incubator-airflow git commit: [AIRFLOW-577] Output BigQuery job for improved debugging

2016-10-17 Thread criccomini
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e36f9a750 -> 0235d59d0


[AIRFLOW-577] Output BigQuery job for improved debugging

Closes #1838 from waltherg/fix/bq_error_message


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/0235d59d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/0235d59d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/0235d59d

Branch: refs/heads/master
Commit: 0235d59d052524d0d773e07b13867691223f9904
Parents: e36f9a7
Author: Georg Walther 
Authored: Mon Oct 17 08:51:56 2016 -0700
Committer: Chris Riccomini 
Committed: Mon Oct 17 08:51:56 2016 -0700

--
 airflow/contrib/hooks/bigquery_hook.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/0235d59d/airflow/contrib/hooks/bigquery_hook.py
--
diff --git a/airflow/contrib/hooks/bigquery_hook.py b/airflow/contrib/hooks/bigquery_hook.py
index c5b57a9..e8528ac 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -435,7 +435,10 @@ class BigQueryBaseCursor(object):
 # Check if job had errors.
 if 'errorResult' in job['status']:
 raise Exception(
-'BigQuery job failed. Final error was: %s', job['status']['errorResult'])
+'BigQuery job failed. Final error was: {}. The job was: {}'.format(
+job['status']['errorResult'], job
+)
+)
 
 return job_id
 



[jira] [Comment Edited] (AIRFLOW-139) Executing VACUUM with PostgresOperator

2016-10-17 Thread Daniel Zohar (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582253#comment-15582253
 ] 

Daniel Zohar edited comment on AIRFLOW-139 at 10/17/16 1:24 PM:


Looking back at the original commit - 
https://github.com/apache/incubator-airflow/commit/28da05d860147b5e0df37d998f437af6a5d4d178
No tests were added, and there is no real justification for the fix aside from a
link to the PG release notes.
I'd think that with the current project standards it wouldn't have been merged
in that state.
[~underyx], could you please provide more insight into why this was added? Maybe
I'm missing something here.


was (Author: dan...@memrise.com):
Looking back at the original commit - 
https://github.com/Memrise/incubator-airflow/commit/28da05d860147b5e0df37d998f437af6a5d4d178
No tests were added, and there is no real justification for the fix aside from a
link to the PG release notes.
I'd think that with the current project standards it wouldn't have been merged
in that state.
[~underyx], could you please provide more insight into why this was added? Maybe
I'm missing something here.

> Executing VACUUM with PostgresOperator
> --
>
> Key: AIRFLOW-139
> URL: https://issues.apache.org/jira/browse/AIRFLOW-139
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.0
>Reporter: Rafael
>
> Dear Airflow Maintainers,
> h1. Environment
> * Airflow version: *v1.7.0*
> * Airflow components: *PostgresOperator*
> * Python Version: *Python 3.5.1*
> * Operating System: *15.4.0 Darwin*
> h1. Description of Issue
> I am trying to execute a `VACUUM` command as part of a DAG with the 
> `PostgresOperator`, which fails with the following error:
> {quote}
> [2016-05-14 16:14:01,849] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> Traceback (most recent call last):
>   File "/usr/local/bin/airflow", line 15, in 
> args.func(args)
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/bin/cli.py",
>  line 203, in run
> pool=args.pool,
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/models.py",
>  line 1067, in run
> result = task_copy.execute(context=context)
>   File 
> "/usr/local/lib/python3.5/site-packages/airflow/operators/postgres_operator.py",
>  line 39, in execute
> self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/hooks/dbapi_hook.py",
>  line 109, in run
> cur.execute(s)
> psycopg2.InternalError: VACUUM cannot run inside a transaction block
> {quote}
> I could create a small python script that performs the operation, as 
> explained in [this stackoverflow 
> entry](http://stackoverflow.com/questions/1017463/postgresql-how-to-run-vacuum-from-code-outside-transaction-block).
>  However, I would like to know first if the `VACUUM` command should be 
> supported by the `PostgresOperator`.
> h1. Reproducing the Issue
> The operator can be declared as follows:
> {quote}
> conn = ('postgres_default')
> t4 = PostgresOperator(
> task_id='vacuum',
> postgres_conn_id=conn,
> sql=("VACUUM public.table"),
> dag=dag
> )
> {quote}
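
For reference, a minimal stand-alone sketch of the workaround referenced in the
stackoverflow entry above (psycopg2 used directly; the connection parameters are
hypothetical): VACUUM has to be issued on a connection in autocommit mode so
that no transaction block is open.

import psycopg2

# Hypothetical connection details; in Airflow these would normally come from
# the 'postgres_default' connection instead.
conn = psycopg2.connect(host='localhost', dbname='mydb', user='airflow')
try:
    # psycopg2 opens a transaction implicitly; switching to autocommit keeps
    # the VACUUM outside of any transaction block.
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute('VACUUM public.table')
finally:
    conn.close()

Whether PostgresOperator should handle this itself (for example via its
autocommit flag) is the open question in this ticket.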



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-139) Executing VACUUM with PostgresOperator

2016-10-17 Thread Daniel Zohar (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582253#comment-15582253
 ] 

Daniel Zohar commented on AIRFLOW-139:
--

Looking back at the original commit - 
https://github.com/Memrise/incubator-airflow/commit/28da05d860147b5e0df37d998f437af6a5d4d178
No tests were added, and there is no real justification for the fix aside from a
link to the PG release notes.
I'd think that with the current project standards it wouldn't have been merged
in that state.
[~underyx], could you please provide more insight into why this was added? Maybe
I'm missing something here.

> Executing VACUUM with PostgresOperator
> --
>
> Key: AIRFLOW-139
> URL: https://issues.apache.org/jira/browse/AIRFLOW-139
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.0
>Reporter: Rafael
>
> Dear Airflow Maintainers,
> h1. Environment
> * Airflow version: *v1.7.0*
> * Airflow components: *PostgresOperator*
> * Python Version: *Python 3.5.1*
> * Operating System: *15.4.0 Darwin*
> h1. Description of Issue
> I am trying to execute a `VACUUM` command as part of a DAG with the 
> `PostgresOperator`, which fails with the following error:
> {quote}
> [2016-05-14 16:14:01,849] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> Traceback (most recent call last):
>   File "/usr/local/bin/airflow", line 15, in 
> args.func(args)
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/bin/cli.py",
>  line 203, in run
> pool=args.pool,
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/models.py",
>  line 1067, in run
> result = task_copy.execute(context=context)
>   File 
> "/usr/local/lib/python3.5/site-packages/airflow/operators/postgres_operator.py",
>  line 39, in execute
> self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
>   File 
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/hooks/dbapi_hook.py",
>  line 109, in run
> cur.execute(s)
> psycopg2.InternalError: VACUUM cannot run inside a transaction block
> {quote}
> I could create a small python script that performs the operation, as 
> explained in [this stackoverflow 
> entry](http://stackoverflow.com/questions/1017463/postgresql-how-to-run-vacuum-from-code-outside-transaction-block).
>  However, I would like to know first if the `VACUUM` command should be 
> supported by the `PostgresOperator`.
> h1. Reproducing the Issue
> The operator can be declared as follows:
> {quote}
> conn = ('postgres_default')
> t4 = PostgresOperator(
> task_id='vacuum',
> postgres_conn_id=conn,
> sql=("VACUUM public.table"),
> dag=dag
> )
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AIRFLOW-577) BigQuery Hook failure message too opaque

2016-10-17 Thread Georg Walther (JIRA)
Georg Walther created AIRFLOW-577:
-

 Summary: BigQuery Hook failure message too opaque
 Key: AIRFLOW-577
 URL: https://issues.apache.org/jira/browse/AIRFLOW-577
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Georg Walther


The BigQuery service routinely returns opaque error messages such as "Too many 
errors ..." - the Airflow BigQuery hook returns this opaque error message by 
accessing the respective keys in the job dictionary:

"job['status']['errorResult']"

When debugging BigQuery issues in Airflow we routinely need to try and step 
into the BigQuery hook to inspect the job dictionary for further hints at what 
caused the error. Therefore it would help to output the BigQuery hook job 
dictionary in the first place.
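
Condensed into a sketch, the change being requested (and the shape of the fix
that was merged, quoted earlier in this digest) is simply to include the whole
job payload in the raised exception; check_job_status is an illustrative helper
name, not the hook's actual method:

def check_job_status(job):
    # Surface the entire job dictionary alongside the terse errorResult so
    # BigQuery failures can be debugged from the task log alone.
    if 'errorResult' in job['status']:
        raise Exception(
            'BigQuery job failed. Final error was: {}. The job was: {}'.format(
                job['status']['errorResult'], job))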



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)