[jira] [Commented] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301664#comment-15301664
 ] 

Bolke de Bruin commented on AIRFLOW-168:


The issue with double dagruns is due to these lines:

# don't ever schedule prior to the dag's start_date
if dag.start_date:
    next_run_date = (dag.start_date if not next_run_date
                     else max(next_run_date, dag.start_date))

which don't check whether the run was already scheduled. Will have a fix shortly.
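The missing check can be illustrated with a small standalone sketch. The function and parameter names here are hypothetical, chosen only to show the shape of the guard; the actual fix landed separately (see PR #128 below).

```python
from datetime import datetime

def next_schedule(next_run_date, start_date, last_scheduled):
    """Candidate next run date, clamped to start_date, plus the guard the
    quoted snippet is missing: never hand back a date that was already
    scheduled. Hypothetical helper, not the actual Airflow fix."""
    # don't ever schedule prior to the dag's start_date (the quoted logic)
    if start_date:
        next_run_date = (start_date if not next_run_date
                         else max(next_run_date, start_date))
    # the missing check: skip dates that were already scheduled
    if last_scheduled is not None and next_run_date is not None \
            and next_run_date <= last_scheduled:
        return None
    return next_run_date

start = datetime(2016, 1, 1)
first = next_schedule(None, start, None)    # first pass schedules start_date
second = next_schedule(None, start, first)  # second pass must not repeat it
print(first, second)
```

Without the `last_scheduled` comparison, every scheduler pass falls through to `start_date` again, which is how an @once DAG ends up with two runs.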

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled 
> twice: once at the start_time and once at the current time. To be sure, I 
> tried multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected on a prod env running v1.7 with celery workers and a mysql 
> backend.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301664#comment-15301664
 ] 

Bolke de Bruin edited comment on AIRFLOW-168 at 5/26/16 7:20 AM:
-

The issue with double dagruns is due to these lines:
{code}
# don't ever schedule prior to the dag's start_date
if dag.start_date:
    next_run_date = (dag.start_date if not next_run_date
                     else max(next_run_date, dag.start_date))
{code}

which don't check whether the run was already scheduled. Will have a fix shortly.


was (Author: bolke):
The issue with double dagruns is due to these lines:

# don't ever schedule prior to the dag's start_date
if dag.start_date:
    next_run_date = (dag.start_date if not next_run_date
                     else max(next_run_date, dag.start_date))

which don't check whether the run was already scheduled. Will have a fix shortly.

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled 
> twice: once at the start_time and once at the current time. To be sure, I 
> tried multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected on a prod env running v1.7 with celery workers and a mysql 
> backend.  





[jira] [Comment Edited] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301539#comment-15301539
 ] 

Bolke de Bruin edited comment on AIRFLOW-168 at 5/26/16 10:30 AM:
--

The double scheduling is indeed a bug on master, also with the updated 
scheduler from 124, which I will need to fix. It is also a bug in the current 
scheduler, but it does not show up there because tasks are not eagerly created.


was (Author: bolke):
The double scheduling is indeed a bug on master, also with the updated 
scheduler from 124, which I will need to fix.

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled 
> twice: once at the start_time and once at the current time. To be sure, I 
> tried multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected on a prod env running v1.7 with celery workers and a mysql 
> backend.  





[jira] [Commented] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301922#comment-15301922
 ] 

Bolke de Bruin commented on AIRFLOW-168:


The fix is in PR #128; will see if it applies to master as well (with a 
separate Jira issue).

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled 
> twice: once at the start_time and once at the current time. To be sure, I 
> tried multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected on a prod env running v1.7 with celery workers and a mysql 
> backend.  





[1/2] incubator-airflow git commit: [AIRFLOW-176] Improve PR Tool JIRA workflow

2016-05-26 Thread jlowin
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 456dada69 -> 387f08cd0


[AIRFLOW-176] Improve PR Tool JIRA workflow

- Fix crash when non-integer IDs are passed
- Improve workflow by always asking user if they
  want to resolve another issue before exiting


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/beb95a5c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/beb95a5c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/beb95a5c

Branch: refs/heads/master
Commit: beb95a5c67cf25d4110416b3998e89f012ae0489
Parents: 7332c40
Author: jlowin 
Authored: Wed May 25 11:51:20 2016 -0400
Committer: jlowin 
Committed: Wed May 25 18:11:16 2016 -0400

--
 dev/airflow-pr | 101 
 1 file changed, 79 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/beb95a5c/dev/airflow-pr
--
diff --git a/dev/airflow-pr b/dev/airflow-pr
index 8dd8df7..77b9e3b 100755
--- a/dev/airflow-pr
+++ b/dev/airflow-pr
@@ -296,7 +296,55 @@ def fix_version_from_branch(branch, versions):
     return versions[-1]
 
 
+def validate_jira_id(jira_id):
+    if not jira_id:
+        return
+
+    # first look for AIRFLOW-X
+    ids = re.findall("AIRFLOW-[0-9]{1,6}", jira_id)
+    if len(ids) > 1:
+        raise click.UsageError('Found multiple issue ids: {}'.format(ids))
+    elif len(ids) == 1:
+        jira_id = ids[0]
+    elif not ids:
+        # if we don't find AIRFLOW-X, see if jira_id is an int
+        try:
+            jira_id = 'AIRFLOW-{}'.format(abs(int(jira_id)))
+        except ValueError:
+            raise click.UsageError(
+                'JIRA id must be an integer or have the form AIRFLOW-X')
+
+    return jira_id
+
+
+def resolve_jira_issues_loop(comment=None, merge_branches=None):
+    """
+    Resolves a JIRA issue, then asks the user if he/she would like to close
+    another one. Repeats until the user indicates they are finished.
+    """
+    while True:
+        try:
+            resolve_jira_issue(
+                comment=comment,
+                jira_id=None,
+                merge_branches=merge_branches)
+        except Exception as e:
+            click.echo("ERROR: {}".format(e))
+
+        if not click.confirm('Would you like to resolve another JIRA issue?'):
+            return
+
+
 def resolve_jira_issue(comment=None, jira_id=None, merge_branches=None):
+    """
+    Resolves a JIRA issue
+
+    comment: a comment for the issue. The user will always be prompted for one;
+    if provided, this will be the default.
+
+    jira_id: an Airflow JIRA id, either an integer or a string with the form
+    AIRFLOW-X. If not provided, the user will be prompted to provide one.
+    """
     if merge_branches is None:
         merge_branches = []
 
@@ -314,20 +362,17 @@ def resolve_jira_issue(comment=None, jira_id=None, merge_branches=None):
         {'server': JIRA_API_BASE},
         basic_auth=(JIRA_USERNAME, JIRA_PASSWORD))
 
-    jira_id = 'AIRFLOW-{}'.format(abs(click.prompt(
-        'Enter an Airflow JIRA id', default=jira_id, type=int)))
+    if jira_id is None:
+        jira_id = click.prompt(
+            'Enter an Airflow JIRA id', value_proc=validate_jira_id)
+    else:
+        jira_id = validate_jira_id(jira_id)
+
     try:
         issue = asf_jira.issue(jira_id)
     except Exception as e:
-        fail("ASF JIRA could not find issue {}\n{}".format(jira_id, e))
-
-    if comment is None:
-        comment = click.prompt(
-            'Please enter a comment to explain why the issue is being closed',
-            default='',
-            show_default=False)
-        if not comment:
-            comment = None
+        raise ValueError(
+            "ASF JIRA could not find issue {}\n{}".format(jira_id, e))
 
     cur_status = issue.fields.status.name
     cur_summary = issue.fields.summary
@@ -338,10 +383,17 @@ def resolve_jira_issue(comment=None, jira_id=None, merge_branches=None):
         cur_assignee = cur_assignee.displayName
 
     if cur_status == "Resolved" or cur_status == "Closed":
-        fail("JIRA issue %s already has status '%s'" % (jira_id, cur_status))
-    click.echo ("=== JIRA %s ===" % jira_id)
+        raise ValueError(
+            "JIRA issue %s already has status '%s'" % (jira_id, cur_status))
+    click.echo ("\n=== JIRA %s ===" % jira_id)
     click.echo ("summary\t\t%s\nassignee\t%s\nstatus\t\t%s\nurl\t\t%s/%s\n" % (
         cur_summary, cur_assignee, cur_status, JIRA_BASE, jira_id))
+    continue_maybe('Proceed with AIRFLOW-{}?'.format(jira_id))
+
+    comment = click.prompt(
+        'Please enter a comment to explain why {jid} is b
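Stripped of the click dependency, the id-normalization logic in `validate_jira_id` above can be sketched standalone. This is an adaptation for illustration, with plain `ValueError` standing in for `click.UsageError`, not the tool itself:

```python
import re

def validate_jira_id(jira_id):
    """Normalize user input to the form AIRFLOW-N (sketch of the logic in
    dev/airflow-pr, with ValueError in place of click.UsageError)."""
    if not jira_id:
        return None
    # first look for AIRFLOW-X anywhere in the input
    ids = re.findall("AIRFLOW-[0-9]{1,6}", jira_id)
    if len(ids) > 1:
        raise ValueError('Found multiple issue ids: {}'.format(ids))
    elif len(ids) == 1:
        return ids[0]
    # otherwise the input must be a bare integer
    try:
        return 'AIRFLOW-{}'.format(abs(int(jira_id)))
    except ValueError:
        raise ValueError('JIRA id must be an integer or have the form AIRFLOW-X')

print(validate_jira_id('176'))         # bare integers are accepted
print(validate_jira_id('AIRFLOW-42'))  # already-normalized ids pass through
```

Passing a function like this as click.prompt's `value_proc` (as the diff does) means click re-prompts on invalid input instead of crashing, which is the workflow improvement the commit message describes.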

[2/2] incubator-airflow git commit: Merge pull request #1544 from jlowin/pr-tool-3

2016-05-26 Thread jlowin
Merge pull request #1544 from jlowin/pr-tool-3


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/387f08cd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/387f08cd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/387f08cd

Branch: refs/heads/master
Commit: 387f08cd01d64bc4ba9e112c1528691dcc80b400
Parents: 456dada beb95a5
Author: jlowin 
Authored: Thu May 26 10:35:45 2016 -0400
Committer: jlowin 
Committed: Thu May 26 10:35:45 2016 -0400

--
 dev/airflow-pr | 101 
 1 file changed, 79 insertions(+), 22 deletions(-)
--




[jira] [Commented] (AIRFLOW-176) PR tool crashes with non-integer JIRA ids

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302156#comment-15302156
 ] 

ASF subversion and git services commented on AIRFLOW-176:
-

Commit beb95a5c67cf25d4110416b3998e89f012ae0489 in incubator-airflow's branch 
refs/heads/master from [~jlowin]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=beb95a5 ]

[AIRFLOW-176] Improve PR Tool JIRA workflow

- Fix crash when non-integer IDs are passed
- Improve workflow by always asking user if they
  want to resolve another issue before exiting


> PR tool crashes with non-integer JIRA ids
> -
>
> Key: AIRFLOW-176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: PR tool
>Affects Versions: Airflow 1.7.1.2
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>
> The PR tool crashes if a non-integer id is passed. This includes the default 
> ID (AIRFLOW-XXX), so it affects folks who don't type in a new ID.





[jira] [Commented] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302233#comment-15302233
 ] 

Chris Riccomini commented on AIRFLOW-178:
-

cc [~bolke]

> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.





[2/2] incubator-airflow git commit: Merge pull request #1545 from jgao54/zip-bug-fix

2016-05-26 Thread criccomini
Merge pull request #1545 from jgao54/zip-bug-fix


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/01b3291c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/01b3291c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/01b3291c

Branch: refs/heads/master
Commit: 01b3291ccebce7ee032a56845b12da28bde98582
Parents: 387f08c 50f911c
Author: Chris Riccomini 
Authored: Thu May 26 08:23:25 2016 -0700
Committer: Chris Riccomini 
Committed: Thu May 26 08:23:25 2016 -0700

--
 airflow/models.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] incubator-airflow git commit: [AIRFLOW-178] Fix bug so that zip file is detected in DAG folder

2016-05-26 Thread criccomini
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 387f08cd0 -> 01b3291cc


[AIRFLOW-178] Fix bug so that zip file is detected in DAG folder


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/50f911cd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/50f911cd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/50f911cd

Branch: refs/heads/master
Commit: 50f911cd39e9fd9abf3a1d3283cf2e6307ab2540
Parents: 456dada
Author: Joy Gao 
Authored: Wed May 25 21:10:14 2016 -0700
Committer: Joy Gao 
Committed: Wed May 25 22:18:35 2016 -0700

--
 airflow/models.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/50f911cd/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index 67958f2..b23e1ea 100644
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -406,7 +406,7 @@ class DagBag(LoggingMixin):
                     continue
                 mod_name, file_ext = os.path.splitext(
                     os.path.split(filepath)[-1])
-                if file_ext != '.py':
+                if file_ext != '.py' and not zipfile.is_zipfile(filepath):
                     continue
                 if not any(
                         [re.findall(p, filepath) for p in patterns]):
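The effect of the one-line change can be checked in isolation: a file is now collected when it either ends in `.py` or is a valid zip archive. The helper below is an illustrative stand-in for the patched condition, not DagBag itself:

```python
import os
import tempfile
import zipfile

def is_dag_candidate(filepath):
    """Mirror of the patched DagBag condition: keep .py files and zip
    archives, skip everything else (illustrative helper)."""
    _, file_ext = os.path.splitext(os.path.split(filepath)[-1])
    # the patched check: skip only if it is neither a .py file nor a zip
    return file_ext == '.py' or zipfile.is_zipfile(filepath)

# build a throwaway zip and a plain text file to exercise both branches
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, 'dags.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('my_dag.py', '# dag code')
txt_path = os.path.join(tmp, 'notes.txt')
with open(txt_path, 'w') as f:
    f.write('not a dag')

print(is_dag_candidate(zip_path))   # zip archive is now accepted
print(is_dag_candidate(txt_path))   # neither .py nor zip, still skipped
```

Note that `zipfile.is_zipfile` inspects the file's magic bytes rather than its extension, so a renamed archive would still be picked up.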



[jira] [Commented] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302238#comment-15302238
 ] 

ASF subversion and git services commented on AIRFLOW-178:
-

Commit 50f911cd39e9fd9abf3a1d3283cf2e6307ab2540 in incubator-airflow's branch 
refs/heads/master from [~joy.gao54]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=50f911c ]

[AIRFLOW-178] Fix bug so that zip file is detected in DAG folder


> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.





[jira] [Closed] (AIRFLOW-176) PR tool crashes with non-integer JIRA ids

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-176.
---
Resolution: Fixed

Closing this, since the ASF bot says it's committed.

> PR tool crashes with non-integer JIRA ids
> -
>
> Key: AIRFLOW-176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: PR tool
>Affects Versions: Airflow 1.7.1.2
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>
> The PR tool crashes if a non-integer id is passed. This includes the default 
> ID (AIRFLOW-XXX), so it affects folks who don't type in a new ID.





[jira] [Reopened] (AIRFLOW-176) PR tool crashes with non-integer JIRA ids

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini reopened AIRFLOW-176:
-

Re-opening. Saw that there's a second PR for this here:

https://github.com/apache/incubator-airflow/pull/1546

> PR tool crashes with non-integer JIRA ids
> -
>
> Key: AIRFLOW-176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: PR tool
>Affects Versions: Airflow 1.7.1.2
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>
> The PR tool crashes if a non-integer id is passed. This includes the default 
> ID (AIRFLOW-XXX), so it affects folks who don't type in a new ID.





[jira] [Commented] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302242#comment-15302242
 ] 

Chris Riccomini commented on AIRFLOW-178:
-

+1 merged and committed.

> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.





[jira] [Commented] (AIRFLOW-176) PR tool crashes with non-integer JIRA ids

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302265#comment-15302265
 ] 

ASF subversion and git services commented on AIRFLOW-176:
-

Commit ff7e03bc6b06c9eebb14305272f5cb951a991641 in incubator-airflow's branch 
refs/heads/master from [~jlowin]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=ff7e03b ]

[AIRFLOW-176] remove unused formatting key


> PR tool crashes with non-integer JIRA ids
> -
>
> Key: AIRFLOW-176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: PR tool
>Affects Versions: Airflow 1.7.1.2
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>
> The PR tool crashes if a non-integer id is passed. This includes the default 
> ID (AIRFLOW-XXX), so it affects folks who don't type in a new ID.





[1/2] incubator-airflow git commit: [AIRFLOW-176] remove unused formatting key

2016-05-26 Thread jlowin
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 01b3291cc -> fedbacb0e


[AIRFLOW-176] remove unused formatting key


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/ff7e03bc
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/ff7e03bc
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/ff7e03bc

Branch: refs/heads/master
Commit: ff7e03bc6b06c9eebb14305272f5cb951a991641
Parents: 387f08c
Author: jlowin 
Authored: Thu May 26 10:40:07 2016 -0400
Committer: jlowin 
Committed: Thu May 26 10:40:07 2016 -0400

--
 dev/airflow-pr | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/ff7e03bc/dev/airflow-pr
--
diff --git a/dev/airflow-pr b/dev/airflow-pr
index 77b9e3b..d20a8e1 100755
--- a/dev/airflow-pr
+++ b/dev/airflow-pr
@@ -391,7 +391,7 @@ def resolve_jira_issue(comment=None, jira_id=None, merge_branches=None):
     continue_maybe('Proceed with AIRFLOW-{}?'.format(jira_id))
 
     comment = click.prompt(
-        'Please enter a comment to explain why {jid} is being closed'.format(
+        'Please enter a comment to explain why {} is being closed'.format(
            jira_id),
         default=comment)
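The removed line was a latent crash, which is what the commit title means by an unused formatting key: a named placeholder like `{jid}` cannot be filled by a positional argument. A quick illustration:

```python
# str.format with a named placeholder requires a keyword argument;
# passing the value positionally raises KeyError, which is the bug
# the commit removes by switching to an anonymous '{}' placeholder.
try:
    '{jid} is being closed'.format('AIRFLOW-176')
except KeyError as exc:
    print('KeyError:', exc)  # the unused key 'jid' surfaces here

# the fixed form fills the placeholder positionally
print('{} is being closed'.format('AIRFLOW-176'))
```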
 



[2/2] incubator-airflow git commit: Merge pull request #1546 from jlowin/pr-tool-4

2016-05-26 Thread jlowin
Merge pull request #1546 from jlowin/pr-tool-4


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/fedbacb0
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/fedbacb0
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/fedbacb0

Branch: refs/heads/master
Commit: fedbacb0e42e78a94fdbb770ad237a2541004692
Parents: 01b3291 ff7e03b
Author: jlowin 
Authored: Thu May 26 11:36:56 2016 -0400
Committer: jlowin 
Committed: Thu May 26 11:36:56 2016 -0400

--
 dev/airflow-pr | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Closed] (AIRFLOW-176) PR tool crashes with non-integer JIRA ids

2016-05-26 Thread Jeremiah Lowin (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Lowin closed AIRFLOW-176.
--
Resolution: Fixed

Merged in https://github.com/apache/incubator-airflow/pull/1546

> PR tool crashes with non-integer JIRA ids
> -
>
> Key: AIRFLOW-176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: PR tool
>Affects Versions: Airflow 1.7.1.2
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>
> The PR tool crashes if a non-integer id is passed. This includes the default 
> ID (AIRFLOW-XXX), so it affects folks who don't type in a new ID.





[jira] [Closed] (AIRFLOW-177) Resume a failed dag

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-177.
---
Resolution: Information Provided

Closing. Re-open if you've got more questions.

> Resume a failed dag
> ---
>
> Key: AIRFLOW-177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-177
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Reporter: Sumit Maheshwari
>
> Say I have a dag with 10 nodes and one of the dag runs failed at the 5th 
> node. Now if I want to resume that dag, I can go and run individual tasks 
> one by one. Is there any way by which I can just give the dag_id and 
> execution_date (or run_id) and have it automatically retry only the failed 
> tasks?





[jira] [Commented] (AIRFLOW-177) Resume a failed dag

2016-05-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302275#comment-15302275
 ] 

Chris Riccomini commented on AIRFLOW-177:
-

I think what you want is clear > downstream. Go to the tree view for the 
DAG, click the 5th node, click 'Downstream', and then 'Clear'. This will 
effectively resume the DAG from the failed task.

> Resume a failed dag
> ---
>
> Key: AIRFLOW-177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-177
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Reporter: Sumit Maheshwari
>
> Say I have a dag with 10 nodes and one of the dag runs failed at the 5th 
> node. Now if I want to resume that dag, I can go and run individual tasks 
> one by one. Is there any way by which I can just give the dag_id and 
> execution_date (or run_id) and have it automatically retry only the failed 
> tasks?





[2/2] incubator-airflow git commit: Merge pull request #1541 from msumit/AIRFLOW-167

2016-05-26 Thread criccomini
Merge pull request #1541 from msumit/AIRFLOW-167


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/b9efdc62
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/b9efdc62
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/b9efdc62

Branch: refs/heads/master
Commit: b9efdc6202e5cb456e55755f2e2ba823fc83a41b
Parents: fedbacb 9db0051
Author: Chris Riccomini 
Authored: Thu May 26 08:51:23 2016 -0700
Committer: Chris Riccomini 
Committed: Thu May 26 08:51:23 2016 -0700

--
 airflow/bin/cli.py | 16 
 airflow/models.py  |  7 ++-
 tests/core.py  |  4 
 3 files changed, 26 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b9efdc62/airflow/models.py
--



[jira] [Commented] (AIRFLOW-167) Get dag state for a given execution date.

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302288#comment-15302288
 ] 

ASF subversion and git services commented on AIRFLOW-167:
-

Commit 9db00511da22c731d10cdc4ea40942c77b1b4008 in incubator-airflow's branch 
refs/heads/master from [~sumitm]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=9db0051 ]

AIRFLOW-167: Add dag_state option in cli


> Get dag state for a given execution date.
> -
>
> Key: AIRFLOW-167
> URL: https://issues.apache.org/jira/browse/AIRFLOW-167
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Reporter: Sumit Maheshwari
>Assignee: Sumit Maheshwari
>
> I was trying to get the state of a particular dag-run programmatically, but 
> couldn't find a way. 
> If we could have a rest call like 
> `/admin/dagrun?dag_id=&execution_date=` and get the output, that 
> would be best. Currently we have to do HTML parsing to get the same. 
> The other (and easier) way is to add cli support like we have for `task_state`.





[jira] [Commented] (AIRFLOW-167) Get dag state for a given execution date.

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302289#comment-15302289
 ] 

ASF subversion and git services commented on AIRFLOW-167:
-

Commit b9efdc6202e5cb456e55755f2e2ba823fc83a41b in incubator-airflow's branch 
refs/heads/master from [~criccomini]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b9efdc6 ]

Merge pull request #1541 from msumit/AIRFLOW-167


> Get dag state for a given execution date.
> -
>
> Key: AIRFLOW-167
> URL: https://issues.apache.org/jira/browse/AIRFLOW-167
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Reporter: Sumit Maheshwari
>Assignee: Sumit Maheshwari
>
> I was trying to get the state of a particular dag-run programmatically, but 
> couldn't find a way. 
> If we could have a rest call like 
> `/admin/dagrun?dag_id=&execution_date=` and get the output, that 
> would be best. Currently we have to do HTML parsing to get the same. 
> The other (and easier) way is to add cli support like we have for `task_state`.





[1/2] incubator-airflow git commit: AIRFLOW-167: Add dag_state option in cli

2016-05-26 Thread criccomini
Repository: incubator-airflow
Updated Branches:
  refs/heads/master fedbacb0e -> b9efdc620


AIRFLOW-167: Add dag_state option in cli


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/9db00511
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/9db00511
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/9db00511

Branch: refs/heads/master
Commit: 9db00511da22c731d10cdc4ea40942c77b1b4008
Parents: 456dada
Author: Sumit Maheshwari 
Authored: Tue May 24 21:25:26 2016 +0530
Committer: Sumit Maheshwari 
Committed: Thu May 26 13:14:27 2016 +0530

--
 airflow/bin/cli.py | 16 
 airflow/models.py  |  7 ++-
 tests/core.py  |  4 
 3 files changed, 26 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/9db00511/airflow/bin/cli.py
--
diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 3184455..840c375 100755
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -325,6 +325,18 @@ def task_state(args):
     print(ti.current_state())
 
 
+def dag_state(args):
+    """
+    Returns the state of a DagRun at the command line.
+
+    >>> airflow dag_state tutorial 2015-01-01T00:00:00.00
+    running
+    """
+    dag = get_dag(args)
+    dr = DagRun.find(dag.dag_id, execution_date=args.execution_date)
+    print(dr[0].state if len(dr) > 0 else None)
+
+
 def list_dags(args):
     dagbag = DagBag(process_subdir(args.subdir))
     s = textwrap.dedent("""\n
@@ -886,6 +898,10 @@ class CLIFactory(object):
         'help': "List all the DAGs",
         'args': ('subdir', 'report'),
     }, {
+        'func': dag_state,
+        'help': "Get the status of a dag run",
+        'args': ('dag_id', 'execution_date', 'subdir'),
+    }, {
         'func': task_state,
         'help': "Get the status of a task instance",
         'args': ('dag_id', 'task_id', 'execution_date', 'subdir'),

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/9db00511/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index 67958f2..461f3c3 100644
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -3438,7 +3438,8 @@ class DagRun(Base):
 
     @staticmethod
     @provide_session
-    def find(dag_id, run_id=None, state=None, external_trigger=None, session=None):
+    def find(dag_id, run_id=None, state=None, external_trigger=None, session=None,
+             execution_date=None):
         """
         Returns a set of dag runs for the given search criteria.
         :param run_id: defines the run id for this dag run
@@ -3449,6 +3450,8 @@ class DagRun(Base):
         :type external_trigger: bool
         :param session: database session
         :type session: Session
+        :param execution_date: execution date for the dag
+        :type execution_date: string
         """
         DR = DagRun
 
@@ -3459,6 +3462,8 @@ class DagRun(Base):
         qry = qry.filter(DR.state == state)
         if external_trigger:
             qry = qry.filter(DR.external_trigger == external_trigger)
+        if execution_date:
+            qry = qry.filter(DR.execution_date == execution_date)
         dr = qry.all()
 
         return dr

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/9db00511/tests/core.py
--
diff --git a/tests/core.py b/tests/core.py
index 4b5d563..80ad477 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -678,6 +678,10 @@ class CliTests(unittest.TestCase):
             'task_state', 'example_bash_operator', 'runme_0',
             DEFAULT_DATE.isoformat()]))
 
+    def test_dag_state(self):
+        self.assertEqual(None, cli.dag_state(self.parser.parse_args([
+            'dag_state', 'example_bash_operator', DEFAULT_DATE.isoformat()])))
+
     def test_pause(self):
         args = self.parser.parse_args([
             'pause', 'example_bash_operator'])



[jira] [Closed] (AIRFLOW-167) Get dag state for a given execution date.

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-167.
---
   Resolution: Fixed
Fix Version/s: Airflow 1.8

+1 merged and committed

> Get dag state for a given execution date.
> -
>
> Key: AIRFLOW-167
> URL: https://issues.apache.org/jira/browse/AIRFLOW-167
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Reporter: Sumit Maheshwari
>Assignee: Sumit Maheshwari
> Fix For: Airflow 1.8
>
>
> I was trying to get state for a particular dag-run programmatically, but 
> couldn't find a way. 
> If we could have a rest call like 
> `/admin/dagrun?dag_id=&execution_date=` and get the output that 
> would be best. Currently we've to do html parsing to get the same. 
> Other (and easier) way is to add a cli support like we have for `task_state`.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-178:

Fix Version/s: Airflow 1.8

> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-178.
---
Resolution: Fixed

> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-178) Zip files in DAG folder do not get picked up by Airflow

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-178:

Affects Version/s: Airflow 1.7.1.2

> Zip files in DAG folder do not get picked up by Airflow
> -
>
> Key: AIRFLOW-178
> URL: https://issues.apache.org/jira/browse/AIRFLOW-178
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2
>Reporter: Joy Gao
>Assignee: Joy Gao
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> The collect_dags method in the DagBag class currently skips any file that 
> does not end in '.py', thereby skipping potential zip files in the DAG folder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-168:

Assignee: Bolke de Bruin

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
>Assignee: Bolke de Bruin
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled twice: 
> once at the start_time and once at the current time. To confirm, I tried 
> multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected, though, on a prod env running v1.7 with celery workers and a 
> mysql backend.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-177) Resume a failed dag

2016-05-26 Thread Sumit Maheshwari (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302317#comment-15302317
 ] 

Sumit Maheshwari commented on AIRFLOW-177:
--

Thanks Chris.



> Resume a failed dag
> ---
>
> Key: AIRFLOW-177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-177
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Reporter: Sumit Maheshwari
>
> Say I have a dag with 10 nodes, and one of the dag runs failed at the 5th node. 
> Now if I want to resume that dag, I can go and run the individual tasks one by 
> one. Is there any way I can just give the dag_id and execution_date (or 
> run_id) and have it automatically retry only the failed tasks?
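The behaviour being asked for, re-running only what failed while keeping successes, can be sketched as a plain state filter; tasks_to_rerun below is a hypothetical helper, not Airflow's API:

```python
def tasks_to_rerun(task_states):
    """Given {task_id: state}, pick the tasks a resume should re-run:
    everything that failed, plus anything that never ran (e.g. tasks
    downstream of the failure, whose state is still None)."""
    return sorted(t for t, s in task_states.items()
                  if s in ("failed", None))

# A 7-task run that failed at the 5th node: t5 failed, t6/t7 never ran.
states = {"t1": "success", "t2": "success", "t3": "success",
          "t4": "success", "t5": "failed", "t6": None, "t7": None}
print(tasks_to_rerun(states))  # -> ['t5', 't6', 't7']
```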



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AIRFLOW-180) Sensors do not test timeout correctly if timeout is more than a day

2016-05-26 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-180:
--

 Summary: Sensors do not test timeout correctly if timeout is more 
than a day
 Key: AIRFLOW-180
 URL: https://issues.apache.org/jira/browse/AIRFLOW-180
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: Airflow 1.7.1, Airflow 1.7.0, Airflow 1.6.2, Airflow 
1.7.1.2
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
 Fix For: Airflow 1.8


Currently the sensors test the timedelta's .seconds attribute instead of 
.total_seconds(), so a timeout of more than a day is reduced to its sub-day 
remainder.
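The one-line description hinges on a Python detail worth spelling out: timedelta.seconds holds only the sub-day component of a duration, while total_seconds() is the whole thing:

```python
from datetime import timedelta

timeout = timedelta(days=2, hours=1)

# .seconds is only the seconds *within* the last partial day: 1 h -> 3600
print(timeout.seconds)          # 3600
# .total_seconds() is the full duration: 2*86400 + 3600 -> 176400.0
print(timeout.total_seconds())  # 176400.0

# A sensor comparing elapsed seconds against timeout.seconds would
# therefore give up after one hour instead of two days and one hour.
```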



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-168) schedule_interval @once scheduling dag at least twice

2016-05-26 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302703#comment-15302703
 ] 

Bolke de Bruin commented on AIRFLOW-168:


I confirmed that this is actually also an issue on 1.7.1, but you will not see 
it:

{code}
mysql> select * from dag_run;
+----+--------------+---------------------+---------+---------------------------------------+------------------+------+----------+---------------------+
| id | dag_id       | execution_date      | state   | run_id                                | external_trigger | conf | end_date | start_date          |
+----+--------------+---------------------+---------+---------------------------------------+------------------+------+----------+---------------------+
|  1 | example_xcom | 2016-05-26 19:01:39 | running | scheduled__2016-05-26T19:01:39.071011 |                0 | NULL | NULL     | 2016-05-26 19:01:39 |
|  2 | example_xcom | 2015-01-01 00:00:00 | running | scheduled__2015-01-01T00:00:00        |                0 | NULL | NULL     | 2016-05-26 19:01:44 |
+----+--------------+---------------------+---------+---------------------------------------+------------------+------+----------+---------------------+
{code}

> schedule_interval @once scheduling dag at least twice
> 
>
> Key: AIRFLOW-168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-168
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.1.2
>Reporter: Sumit Maheshwari
>Assignee: Bolke de Bruin
> Attachments: Screen Shot 2016-05-24 at 9.51.50 PM.png, 
> screenshot-1.png
>
>
> I was looking at the example_xcom example and found that it got scheduled twice: 
> once at the start_time and once at the current time. To confirm, I tried 
> multiple times (by reloading the db) and it's the same. 
> I am on airflow master, using the sequential executor with sqlite3. It works 
> as expected, though, on a prod env running v1.7 with celery workers and a 
> mysql backend.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AIRFLOW-181) Travis builds fail due to corrupt cache

2016-05-26 Thread Bolke de Bruin (JIRA)
Bolke de Bruin created AIRFLOW-181:
--

 Summary: Travis builds fail due to corrupt cache
 Key: AIRFLOW-181
 URL: https://issues.apache.org/jira/browse/AIRFLOW-181
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Bolke de Bruin


A corrupt cache is preventing Hadoop from being unpacked. The distribution 
needs to be re-downloaded without checking the cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-181) Travis builds fail due to corrupt cache

2016-05-26 Thread Bolke de Bruin (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bolke de Bruin updated AIRFLOW-181:
---
External issue URL: https://github.com/apache/incubator-airflow/pull/1548

> Travis builds fail due to corrupt cache
> ---
>
> Key: AIRFLOW-181
> URL: https://issues.apache.org/jira/browse/AIRFLOW-181
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Bolke de Bruin
>
> A corrupt cache is preventing Hadoop from being unpacked. The distribution 
> needs to be re-downloaded without checking the cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


incubator-airflow git commit: AIRFLOW-181 Fix failing unpacking of hadoop by redownloading

2016-05-26 Thread bolke
Repository: incubator-airflow
Updated Branches:
  refs/heads/master b9efdc620 -> afcd4fcf0


AIRFLOW-181 Fix failing unpacking of hadoop by redownloading

curl compares timestamps, but if the file is corrupt this can
result in hadoop tars that are never updated. This adds a retry
without using the cache.
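The commit's retry pattern, try the cached artifact, and on extraction failure fetch a fresh copy while bypassing the cache, can be demonstrated generically. Here download_fresh is a stand-in for the real curl call, and the tarballs are local fakes rather than the Hadoop distribution:

```shell
#!/bin/sh
set -u
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p src out cache
echo "payload" > src/file.txt
tar czf good.tar.gz -C src file.txt

# Simulate a corrupt cached download.
echo "garbage" > cache/archive.tar.gz

download_fresh() {   # stand-in for: curl -o cache/archive.tar.gz -L $URL
    cp good.tar.gz cache/archive.tar.gz
}

if ! tar zxf cache/archive.tar.gz -C out 2>/dev/null; then
    echo "extract failed, retrying without cache" >&2
    download_fresh
    if ! tar zxf cache/archive.tar.gz -C out 2>/dev/null; then
        echo "failed twice" >&2
        exit 1
    fi
fi
cat out/file.txt   # prints "payload" once the retry succeeds
```

The second attempt succeeds only because the fresh copy replaces the corrupt cache entry; a plain re-run of the first tar would keep failing forever, which is exactly what the timestamp-based curl caching caused on Travis.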


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/afcd4fcf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/afcd4fcf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/afcd4fcf

Branch: refs/heads/master
Commit: afcd4fcf01696ee26911640cdeb481defd93c3aa
Parents: b9efdc6
Author: Bolke de Bruin 
Authored: Thu May 26 21:37:27 2016 +0200
Committer: Bolke de Bruin 
Committed: Thu May 26 21:54:49 2016 +0200

--
 scripts/ci/setup_env.sh | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/afcd4fcf/scripts/ci/setup_env.sh
--
diff --git a/scripts/ci/setup_env.sh b/scripts/ci/setup_env.sh
index b04de2f..1eca814 100755
--- a/scripts/ci/setup_env.sh
+++ b/scripts/ci/setup_env.sh
@@ -94,7 +94,14 @@ tar zxf ${TRAVIS_CACHE}/${HADOOP_DISTRO}/hadoop.tar.gz --strip-components 1 -C $HADOOP_HOME
 
if [ $? != 0 ]; then
echo "Failed to extract Hadoop from ${HADOOP_HOME}/hadoop.tar.gz to ${HADOOP_HOME} - abort" >&2
-exit 1
+echo "Trying again..." >&2
+# dont use cache
+curl -o ${TRAVIS_CACHE}/${HADOOP_DISTRO}/hadoop.tar.gz -L $URL
+tar zxf ${TRAVIS_CACHE}/${HADOOP_DISTRO}/hadoop.tar.gz --strip-components 1 -C $HADOOP_HOME
+if [ $? != 0 ]; then
+echo "Failed twice in downloading and unpacking hadoop!" >&2
+exit 1
+fi
fi
 
 echo "Downloading and unpacking hive"
@@ -108,4 +115,4 @@ unzip ${TRAVIS_CACHE}/minicluster/minicluster.zip -d /tmp
 
 echo "Path = ${PATH}"
 
-java -cp "/tmp/minicluster-1.1-SNAPSHOT/*" com.ing.minicluster.MiniCluster &
+java -cp "/tmp/minicluster-1.1-SNAPSHOT/*" com.ing.minicluster.MiniCluster > /dev/null &



[jira] [Commented] (AIRFLOW-181) Travis builds fail due to corrupt cache

2016-05-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302826#comment-15302826
 ] 

ASF subversion and git services commented on AIRFLOW-181:
-

Commit afcd4fcf01696ee26911640cdeb481defd93c3aa in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=afcd4fc ]

AIRFLOW-181 Fix failing unpacking of hadoop by redownloading

curl compares timestamps, but if the file is corrupt this can
result in hadoop tars that are never updated. This adds a retry
without using the cache.


> Travis builds fail due to corrupt cache
> ---
>
> Key: AIRFLOW-181
> URL: https://issues.apache.org/jira/browse/AIRFLOW-181
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Bolke de Bruin
>
> A corrupt cache is preventing Hadoop from being unpacked. The distribution 
> needs to be re-downloaded without checking the cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-179) DbApiHook string serialization fails when string contains non-ASCII characters

2016-05-26 Thread John Bodley (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Bodley updated AIRFLOW-179:

Description: 
The DbApiHook.insert_rows(...) method tries to serialize all values to strings 
using the ASCII codec; this is problematic if the cell contains non-ASCII 
characters, e.g.

>>> from airflow.hooks import DbApiHook
>>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 196, in _serialize_cell
    return "'" + str(cell).replace("'", "''") + "'"
  File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", line 102, in __new__
    return super(newstr, cls).__new__(cls, value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: 
ordinal not in range(128)

Rather than manually serializing and escaping values to an ASCII string, one 
should serialize the value to a string using the character set of the 
corresponding target database, leveraging the connection to mutate the object 
into the SQL string literal.

Additionally, the escaping logic for single quotes (') within the 
_serialize_cell method seems wrong, i.e. 

str(cell).replace("'", "''")

would escape the string "you're" to "'you''re'" as opposed to "'you\'re'".

Note an exception should still be thrown if the target encoding is not 
compatible with the source encoding.
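A minimal Python 3 sketch of the proposed direction: make the target charset explicit instead of forcing ASCII. serialize_cell here is a hypothetical stand-in, not Airflow's actual hook API, and it keeps the doubled-quote escaping for illustration:

```python
def serialize_cell(cell, encoding="utf-8"):
    """Hypothetical encoding-aware variant of _serialize_cell: validate
    against the target database's charset instead of forcing ASCII."""
    text = cell if isinstance(cell, str) else str(cell)
    # Encoding with the target charset raises UnicodeEncodeError when the
    # value cannot be represented, preserving the "still throw" requirement.
    text.encode(encoding)
    return "'" + text.replace("'", "''") + "'"

print(serialize_cell("Nguyễn Tấn Dũng"))   # round-trips fine in UTF-8
try:
    serialize_cell("Nguyễn Tấn Dũng", encoding="ascii")
except UnicodeEncodeError:
    print("incompatible target encoding")
```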

  was:
The DbApiHook.insert_rows(...) method tries to serialize all values to strings 
using the ASCII codec,  this is problematic if the cell contains non-ASCII 
characters, i.e.

>>> from airflow.hooks import DbApiHook
>>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", 
line 196, in _serialize_cell
return "'" + str(cell).replace("'", "''") + "'"
  File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", line 
102, in __new__
return super(newstr, cls).__new__(cls, value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: ordinal 
not in range(128)


Rather than manually trying to serialize and escape values to an ASCII string 
one should try to serialize the value to string using the character set of the 
corresponding target database leveraging the connection to mutate the object to 
the SQL string literal.

Note an exception should still be thrown if the target encoding is not 
compatible with the source encoding.


> DbApiHook string serialization fails when string contains non-ASCII characters
> --
>
> Key: AIRFLOW-179
> URL: https://issues.apache.org/jira/browse/AIRFLOW-179
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: John Bodley
>Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to 
> strings using the ASCII codec,  this is problematic if the cell contains 
> non-ASCII characters, i.e.
> >>> from airflow.hooks import DbApiHook
> >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 
> 196, in _serialize_cell
> return "'" + str(cell).replace("'", "''") + "'"
>   File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", 
> line 102, in __new__
> return super(newstr, cls).__new__(cls, value)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: 
> ordinal not in range(128)
> Rather than manually trying to serialize and escape values to an ASCII string 
> one should try to serialize the value to string using the character set of 
> the corresponding target database leveraging the connection to mutate the 
> object to the SQL string literal.
> Additionally the escaping logic for single quotes (') within the 
> _serialize_cell method seems wrong, i.e. 
> str(cell).replace("'", "''")
> would escape the string "you're" to be "'you''re'" as opposed to "'you\'re'".
> Note an exception should still be thrown if the target encoding is not 
> compatible with the source encoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-181) Travis builds fail due to corrupt cache

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-181:

Assignee: Bolke de Bruin

> Travis builds fail due to corrupt cache
> ---
>
> Key: AIRFLOW-181
> URL: https://issues.apache.org/jira/browse/AIRFLOW-181
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>
> A corrupt cache is preventing Hadoop from being unpacked. The distribution 
> needs to be re-downloaded without checking the cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-179) DbApiHook string serialization fails when string contains non-ASCII characters

2016-05-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303076#comment-15303076
 ] 

Chris Riccomini commented on AIRFLOW-179:
-

Derp, just saw https://github.com/apache/incubator-airflow/pull/1550

> DbApiHook string serialization fails when string contains non-ASCII characters
> --
>
> Key: AIRFLOW-179
> URL: https://issues.apache.org/jira/browse/AIRFLOW-179
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: John Bodley
>Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to 
> strings using the ASCII codec,  this is problematic if the cell contains 
> non-ASCII characters, i.e.
> >>> from airflow.hooks import DbApiHook
> >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 
> 196, in _serialize_cell
> return "'" + str(cell).replace("'", "''") + "'"
>   File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", 
> line 102, in __new__
> return super(newstr, cls).__new__(cls, value)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: 
> ordinal not in range(128)
> Rather than manually trying to serialize and escape values to an ASCII string 
> one should try to serialize the value to string using the character set of 
> the corresponding target database leveraging the connection to mutate the 
> object to the SQL string literal.
> Additionally the escaping logic for single quotes (') within the 
> _serialize_cell method seems wrong, i.e. 
> str(cell).replace("'", "''")
> would escape the string "you're" to be "'you''re'" as opposed to "'you\'re'".
> Note an exception should still be thrown if the target encoding is not 
> compatible with the source encoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-179) DbApiHook string serialization fails when string contains non-ASCII characters

2016-05-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303074#comment-15303074
 ] 

Chris Riccomini commented on AIRFLOW-179:
-

[~john.bod...@gmail.com], would you be up for sending a PR in to fix this?

> DbApiHook string serialization fails when string contains non-ASCII characters
> --
>
> Key: AIRFLOW-179
> URL: https://issues.apache.org/jira/browse/AIRFLOW-179
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: John Bodley
>Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to 
> strings using the ASCII codec,  this is problematic if the cell contains 
> non-ASCII characters, i.e.
> >>> from airflow.hooks import DbApiHook
> >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 
> 196, in _serialize_cell
> return "'" + str(cell).replace("'", "''") + "'"
>   File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", 
> line 102, in __new__
> return super(newstr, cls).__new__(cls, value)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: 
> ordinal not in range(128)
> Rather than manually trying to serialize and escape values to an ASCII string 
> one should try to serialize the value to string using the character set of 
> the corresponding target database leveraging the connection to mutate the 
> object to the SQL string literal.
> Additionally the escaping logic for single quotes (') within the 
> _serialize_cell method seems wrong, i.e. 
> str(cell).replace("'", "''")
> would escape the string "you're" to be "'you''re'" as opposed to "'you\'re'".
> Note an exception should still be thrown if the target encoding is not 
> compatible with the source encoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-179) DbApiHook string serialization fails when string contains non-ASCII characters

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-179:

External issue URL: https://github.com/apache/incubator-airflow/pull/1550

> DbApiHook string serialization fails when string contains non-ASCII characters
> --
>
> Key: AIRFLOW-179
> URL: https://issues.apache.org/jira/browse/AIRFLOW-179
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: John Bodley
>Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to 
> strings using the ASCII codec,  this is problematic if the cell contains 
> non-ASCII characters, i.e.
> >>> from airflow.hooks import DbApiHook
> >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 
> 196, in _serialize_cell
> return "'" + str(cell).replace("'", "''") + "'"
>   File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", 
> line 102, in __new__
> return super(newstr, cls).__new__(cls, value)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: 
> ordinal not in range(128)
> Rather than manually trying to serialize and escape values to an ASCII string 
> one should try to serialize the value to string using the character set of 
> the corresponding target database leveraging the connection to mutate the 
> object to the SQL string literal.
> Additionally the escaping logic for single quotes (') within the 
> _serialize_cell method seems wrong, i.e. 
> str(cell).replace("'", "''")
> would escape the string "you're" to be "'you''re'" as opposed to "'you\'re'".
> Note an exception should still be thrown if the target encoding is not 
> compatible with the source encoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-180) Sensors do not test timeout correctly if timeout is more than a day

2016-05-26 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-180:

External issue URL: https://github.com/apache/incubator-airflow/pull/1547

> Sensors do not test timeout correctly if timeout is more than a day
> 
>
> Key: AIRFLOW-180
> URL: https://issues.apache.org/jira/browse/AIRFLOW-180
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1, Airflow 1.7.0, Airflow 1.6.2, Airflow 
> 1.7.1.2
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>  Labels: sensors
> Fix For: Airflow 1.8
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently the sensors test the timedelta's .seconds attribute instead of 
> .total_seconds()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AIRFLOW-182) CLI command `airflow backfill` fails while CLI `airflow run` succeeds

2016-05-26 Thread Hariharan Mohanraj (JIRA)
Hariharan Mohanraj created AIRFLOW-182:
--

 Summary: CLI command `airflow backfill` fails while CLI `airflow 
run` succeeds
 Key: AIRFLOW-182
 URL: https://issues.apache.org/jira/browse/AIRFLOW-182
 Project: Apache Airflow
  Issue Type: Bug
  Components: celery
Affects Versions: Airflow 1.7.0
 Environment: Heroku Cedar 14, Heroku Redis as Celery Broker
Reporter: Hariharan Mohanraj


When I run the backfill command, I get an error claiming there is no dag in 
my dag folder named "unusual_prefix_dag1", although my dag is actually named 
dag1. However, when I run the run command, the task is scheduled and it works 
flawlessly.

{code}
$ airflow backfill -t task1 -s 2016-05-01 -e 2016-05-07 dag1

2016-05-26T23:22:28.816908+00:00 app[worker.1]: [2016-05-26 23:22:28,816] 
{__init__.py:36} INFO - Using executor CeleryExecutor
2016-05-26T23:22:29.214006+00:00 app[worker.1]: Traceback (most recent call 
last):
2016-05-26T23:22:29.214083+00:00 app[worker.1]:   File 
"/app/.heroku/python/bin/airflow", line 15, in 
2016-05-26T23:22:29.214121+00:00 app[worker.1]: args.func(args)
2016-05-26T23:22:29.214151+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/airflow/bin/cli.py", line 174, 
in run
2016-05-26T23:22:29.214207+00:00 app[worker.1]: 
DagPickle).filter(DagPickle.id == args.pickle).first()
2016-05-26T23:22:29.214230+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 
2634, in first
2016-05-26T23:22:29.214616+00:00 app[worker.1]: ret = list(self[0:1])
2016-05-26T23:22:29.214626+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 
2457, in __getitem__
2016-05-26T23:22:29.214984+00:00 app[worker.1]: return list(res)
2016-05-26T23:22:29.214992+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
line 86, in instances
2016-05-26T23:22:29.215053+00:00 app[worker.1]: util.raise_from_cause(err)
2016-05-26T23:22:29.215074+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/util/compat.py", 
line 200, in raise_from_cause
2016-05-26T23:22:29.215121+00:00 app[worker.1]: reraise(type(exception), 
exception, tb=exc_tb, cause=cause)
2016-05-26T23:22:29.215142+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
line 71, in instances
2016-05-26T23:22:29.215175+00:00 app[worker.1]: rows = [proc(row) for row 
in fetch]
2016-05-26T23:22:29.215200+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
line 428, in _instance
2016-05-26T23:22:29.215274+00:00 app[worker.1]: loaded_instance, 
populate_existing, populators)
2016-05-26T23:22:29.215282+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
line 486, in _populate_full
2016-05-26T23:22:29.215369+00:00 app[worker.1]: dict_[key] = getter(row)
2016-05-26T23:22:29.215406+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py", 
line 1253, in process
2016-05-26T23:22:29.215574+00:00 app[worker.1]: return loads(value)
2016-05-26T23:22:29.215595+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 260, in 
loads
2016-05-26T23:22:29.215657+00:00 app[worker.1]: return load(file)
2016-05-26T23:22:29.215678+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 250, in 
load
2016-05-26T23:22:29.215738+00:00 app[worker.1]: obj = pik.load()
2016-05-26T23:22:29.215758+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/pickle.py", line 858, in load
2016-05-26T23:22:29.215895+00:00 app[worker.1]: dispatch[key](self)
2016-05-26T23:22:29.215902+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/pickle.py", line 1090, in load_global
2016-05-26T23:22:29.216069+00:00 app[worker.1]: klass = 
self.find_class(module, name)
2016-05-26T23:22:29.216077+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 406, in 
find_class
2016-05-26T23:22:29.216181+00:00 app[worker.1]: return 
StockUnpickler.find_class(self, module, name)
2016-05-26T23:22:29.216190+00:00 app[worker.1]:   File 
"/app/.heroku/python/lib/python2.7/pickle.py", line 1124, in find_class
2016-05-26T23:22:29.216360+00:00 app[worker.1]: __import__(module)
2016-05-26T23:22:29.216412+00:00 app[worker.1]: ImportError: No module named 
unusual_prefix_dag1

# runs flawlessly
$ airflow run dag1 task1 2016-05-07
{code}
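The ImportError at the bottom of the trace can be reproduced in isolation. DagBag loads DAG files under a mangled module name (the exact derivation in DagBag.process_file differs; unusual_prefix_dag1 below is simply chosen to match the error), so a pickle made from a class in such a module cannot be loaded by a process that never imported that file:

```python
import importlib.util
import os
import pickle
import sys
import tempfile

# Write a tiny "dag file" and load it under a mangled module name.
src = "class Dag(object):\n    dag_id = 'dag1'\n"
path = os.path.join(tempfile.mkdtemp(), "dag1.py")
with open(path, "w") as f:
    f.write(src)

mod_name = "unusual_prefix_dag1"
spec = importlib.util.spec_from_file_location(mod_name, path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
sys.modules[mod_name] = mod           # pickling by reference needs this

data = pickle.dumps(mod.Dag())
print(mod.Dag.__module__)             # unusual_prefix_dag1

# Simulate the backfill worker: the mangled module is not importable there.
del sys.modules[mod_name]
try:
    pickle.loads(data)
except ImportError:
    print("unpickle failed: module not importable in this process")
```

This is why `airflow run` (which re-parses the DAG file itself) works while the pickled-DAG path of `airflow backfill` fails with "No module named unusual_prefix_dag1".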

Apologies if the format is wrong or if I haven't provided enough information; 
this is the first time I've ever submitted an issue.

[jira] [Updated] (AIRFLOW-182) CLI command `airflow backfill` fails while CLI `airflow run` succeeds

2016-05-26 Thread Hariharan Mohanraj (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hariharan Mohanraj updated AIRFLOW-182:
---
Priority: Minor  (was: Major)

> CLI command `airflow backfill` fails while CLI `airflow run` succeeds
> -
>
> Key: AIRFLOW-182
> URL: https://issues.apache.org/jira/browse/AIRFLOW-182
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: Airflow 1.7.0
> Environment: Heroku Cedar 14, Heroku Redis as Celery Broker
>Reporter: Hariharan Mohanraj
>Priority: Minor
>
> When I run the backfill command, I get an error that claims there is no dag 
> in my dag folder with the name "unusual_prefix_dag1", although my dag is 
> actually named dag1. However, when I run the run command, the task is 
> scheduled and it works flawlessly.
> {code}
> $ airflow backfill -t task1 -s 2016-05-01 -e 2016-05-07 dag1
> 2016-05-26T23:22:28.816908+00:00 app[worker.1]: [2016-05-26 23:22:28,816] 
> {__init__.py:36} INFO - Using executor CeleryExecutor
> 2016-05-26T23:22:29.214006+00:00 app[worker.1]: Traceback (most recent call 
> last):
> 2016-05-26T23:22:29.214083+00:00 app[worker.1]:   File 
> "/app/.heroku/python/bin/airflow", line 15, in 
> 2016-05-26T23:22:29.214121+00:00 app[worker.1]: args.func(args)
> 2016-05-26T23:22:29.214151+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/airflow/bin/cli.py", line 
> 174, in run
> 2016-05-26T23:22:29.214207+00:00 app[worker.1]: 
> DagPickle).filter(DagPickle.id == args.pickle).first()
> 2016-05-26T23:22:29.214230+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/query.py", 
> line 2634, in first
> 2016-05-26T23:22:29.214616+00:00 app[worker.1]: ret = list(self[0:1])
> 2016-05-26T23:22:29.214626+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/query.py", 
> line 2457, in __getitem__
> 2016-05-26T23:22:29.214984+00:00 app[worker.1]: return list(res)
> 2016-05-26T23:22:29.214992+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
> line 86, in instances
> 2016-05-26T23:22:29.215053+00:00 app[worker.1]: util.raise_from_cause(err)
> 2016-05-26T23:22:29.215074+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/util/compat.py", 
> line 200, in raise_from_cause
> 2016-05-26T23:22:29.215121+00:00 app[worker.1]: reraise(type(exception), 
> exception, tb=exc_tb, cause=cause)
> 2016-05-26T23:22:29.215142+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
> line 71, in instances
> 2016-05-26T23:22:29.215175+00:00 app[worker.1]: rows = [proc(row) for row 
> in fetch]
> 2016-05-26T23:22:29.215200+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
> line 428, in _instance
> 2016-05-26T23:22:29.215274+00:00 app[worker.1]: loaded_instance, 
> populate_existing, populators)
> 2016-05-26T23:22:29.215282+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", 
> line 486, in _populate_full
> 2016-05-26T23:22:29.215369+00:00 app[worker.1]: dict_[key] = getter(row)
> 2016-05-26T23:22:29.215406+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py", 
> line 1253, in process
> 2016-05-26T23:22:29.215574+00:00 app[worker.1]: return loads(value)
> 2016-05-26T23:22:29.215595+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 260, in 
> loads
> 2016-05-26T23:22:29.215657+00:00 app[worker.1]: return load(file)
> 2016-05-26T23:22:29.215678+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 250, in 
> load
> 2016-05-26T23:22:29.215738+00:00 app[worker.1]: obj = pik.load()
> 2016-05-26T23:22:29.215758+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/pickle.py", line 858, in load
> 2016-05-26T23:22:29.215895+00:00 app[worker.1]: dispatch[key](self)
> 2016-05-26T23:22:29.215902+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/pickle.py", line 1090, in load_global
> 2016-05-26T23:22:29.216069+00:00 app[worker.1]: klass = 
> self.find_class(module, name)
> 2016-05-26T23:22:29.216077+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/site-packages/dill/dill.py", line 406, in 
> find_class
> 2016-05-26T23:22:29.216181+00:00 app[worker.1]: return 
> StockUnpickler.find_class(self, module, name)
> 2016-05-26T23:22:29.216190+00:00 app[worker.1]:   File 
> "/app/.heroku/python/lib/python2.7/pickle

[jira] [Created] (AIRFLOW-183) webserver not retrieving from remote s3/gcs if a log file has been deleted from a remote worker

2016-05-26 Thread Yap Sok Ann (JIRA)
Yap Sok Ann created AIRFLOW-183:
---

 Summary: webserver not retrieving from remote s3/gcs if a log file 
has been deleted from a remote worker
 Key: AIRFLOW-183
 URL: https://issues.apache.org/jira/browse/AIRFLOW-183
 Project: Apache Airflow
  Issue Type: Bug
  Components: webserver
Affects Versions: Airflow 1.7.1
Reporter: Yap Sok Ann
Priority: Minor
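
The report title implies the webserver should fall back to the remote (S3/GCS) copy when the log file has been deleted from the worker. A hedged sketch of that fallback, with hypothetical {{fetch_local}}/{{fetch_remote}} stand-ins rather than Airflow's actual log handling:

{code}
# Illustrative fallback: prefer the local log, fall back to the remote
# (S3/GCS) copy if the file was deleted from the worker.
def read_log(fetch_local, fetch_remote):
    try:
        return fetch_local()
    except IOError:  # local log file removed from the worker
        return fetch_remote()

def local_missing():
    raise IOError("log removed from worker")

result = read_log(local_missing, lambda: "remote log contents")
print(result)
{code}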






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-177) Resume a failed dag

2016-05-26 Thread Sumit Maheshwari (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303644#comment-15303644
 ] 

Sumit Maheshwari commented on AIRFLOW-177:
--

[~criccomini] I was thinking: would it be helpful to have CLI support for 
that, say {{resume_dag}}, which takes an execution_date as input and reruns 
only the failed tasks?
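
A minimal sketch of what such a {{resume_dag}} could do under the hood: reset the state of failed task instances for one dag run so the scheduler reruns only those. The {{TaskInstance}} class and state values here are illustrative stand-ins, not Airflow's actual models:

{code}
FAILED = "failed"

class TaskInstance(object):
    # Illustrative stand-in for Airflow's task instance model.
    def __init__(self, task_id, dag_id, execution_date, state):
        self.task_id = task_id
        self.dag_id = dag_id
        self.execution_date = execution_date
        self.state = state

def resume_dag(task_instances, dag_id, execution_date):
    """Clear only the failed tasks of the given dag run."""
    cleared = []
    for ti in task_instances:
        if (ti.dag_id == dag_id and ti.execution_date == execution_date
                and ti.state == FAILED):
            ti.state = None  # unset state: scheduler treats it as not yet run
            cleared.append(ti.task_id)
    return cleared

tis = [
    TaskInstance("t1", "dag1", "2016-05-24", "success"),
    TaskInstance("t2", "dag1", "2016-05-24", FAILED),
    TaskInstance("t3", "dag1", "2016-05-24", FAILED),
]
cleared = resume_dag(tis, "dag1", "2016-05-24")
print(cleared)
{code}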

> Resume a failed dag
> ---
>
> Key: AIRFLOW-177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-177
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Reporter: Sumit Maheshwari
>
> Say I have a DAG with 10 nodes, and one of its runs failed at the 5th node. 
> Now if I want to resume that run, I can go and rerun each failed task one by 
> one. Is there any way by which I can just give the dag_id and execution_date (or 
> run_id) and have it automatically retry only the failed tasks?





[jira] [Reopened] (AIRFLOW-177) Resume a failed dag

2016-05-26 Thread Sumit Maheshwari (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumit Maheshwari reopened AIRFLOW-177:
--

> Resume a failed dag
> ---
>
> Key: AIRFLOW-177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-177
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Reporter: Sumit Maheshwari
>
> Say I have a DAG with 10 nodes, and one of its runs failed at the 5th node. 
> Now if I want to resume that run, I can go and rerun each failed task one by 
> one. Is there any way by which I can just give the dag_id and execution_date (or 
> run_id) and have it automatically retry only the failed tasks?


