[jira] [Created] (AIRFLOW-6804) Add the basic test for example DAGs

2020-02-13 Thread Kamil Bregula (Jira)
Kamil Bregula created AIRFLOW-6804:
--

 Summary: Add the basic test for example DAGs
 Key: AIRFLOW-6804
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6804
 Project: Apache Airflow
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.10.9
Reporter: Kamil Bregula






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036762#comment-17036762
 ] 

ASF GitHub Bot commented on AIRFLOW-6590:
-

nuclearpinguin commented on pull request #7370: [AIRFLOW-6590] Use batch db 
operations in jobs
URL: https://github.com/apache/airflow/pull/7370
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036766#comment-17036766
 ] 

ASF subversion and git services commented on AIRFLOW-6590:
--

Commit fb00c687b6108a6ac621d603589490edbdda in airflow's branch 
refs/heads/master from Tomek Urbaszek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=fb00c68 ]

[AIRFLOW-6590] Use batch db operations in jobs (#7370)

* [AIRFLOW-6590] Use batch db operations in jobs

The PR changes numerous single selects / updates in base,
scheduler, and backfill jobs to bulk operations.

* fixup! [AIRFLOW-6590] Use batch db operations in jobs

* fixup! fixup! [AIRFLOW-6590] Use batch db operations in jobs
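
To make the change concrete, here is a minimal sketch of the pattern described above, assuming an SQLAlchemy session (which Airflow uses for its metadata database); the model and column names are illustrative, not the project's actual code.

```python
# Sketch: replace per-row SELECT/UPDATE round trips with one bulk UPDATE.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class TaskRow(Base):  # hypothetical stand-in for a job-related table
    __tablename__ = "task_row"
    id = Column(Integer, primary_key=True)
    state = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    ids_to_fail = [1, 2, 3]

    # Before: one database round trip per row.
    # for row_id in ids_to_fail:
    #     row = session.get(TaskRow, row_id)
    #     row.state = "failed"

    # After: a single bulk UPDATE covering all rows at once.
    session.query(TaskRow).filter(TaskRow.id.in_(ids_to_fail)).update(
        {TaskRow.state: "failed"}, synchronize_session=False
    )
    session.commit()
```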


> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036765#comment-17036765
 ] 

ASF subversion and git services commented on AIRFLOW-6590:
--

Commit fb00c687b6108a6ac621d603589490edbdda in airflow's branch 
refs/heads/master from Tomek Urbaszek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=fb00c68 ]

[AIRFLOW-6590] Use batch db operations in jobs (#7370)

* [AIRFLOW-6590] Use batch db operations in jobs

The PR changes numerous single selects / updates in base,
scheduler, and backfill jobs to bulk operations.

* fixup! [AIRFLOW-6590] Use batch db operations in jobs

* fixup! fixup! [AIRFLOW-6590] Use batch db operations in jobs


> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036763#comment-17036763
 ] 

ASF subversion and git services commented on AIRFLOW-6590:
--

Commit fb00c687b6108a6ac621d603589490edbdda in airflow's branch 
refs/heads/master from Tomek Urbaszek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=fb00c68 ]

[AIRFLOW-6590] Use batch db operations in jobs (#7370)

* [AIRFLOW-6590] Use batch db operations in jobs

The PR changes numerous single selects / updates in base,
scheduler, and backfill jobs to bulk operations.

* fixup! [AIRFLOW-6590] Use batch db operations in jobs

* fixup! fixup! [AIRFLOW-6590] Use batch db operations in jobs


> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread Tomasz Urbaszek (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Urbaszek resolved AIRFLOW-6590.
--
Fix Version/s: 1.10.10
   Resolution: Done

> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6590) Use batch db operations in jobs

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036764#comment-17036764
 ] 

ASF subversion and git services commented on AIRFLOW-6590:
--

Commit fb00c687b6108a6ac621d603589490edbdda in airflow's branch 
refs/heads/master from Tomek Urbaszek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=fb00c68 ]

[AIRFLOW-6590] Use batch db operations in jobs (#7370)

* [AIRFLOW-6590] Use batch db operations in jobs

The PR changes numerous single selects / updates in base,
scheduler, and backfill jobs to bulk operations.

* fixup! [AIRFLOW-6590] Use batch db operations in jobs

* fixup! fixup! [AIRFLOW-6590] Use batch db operations in jobs


> Use batch db operations in jobs
> ---
>
> Key: AIRFLOW-6590
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6590
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Minor
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] nuclearpinguin merged pull request #7370: [AIRFLOW-6590] Use batch db operations in jobs

2020-02-13 Thread GitBox
nuclearpinguin merged pull request #7370: [AIRFLOW-6590] Use batch db 
operations in jobs
URL: https://github.com/apache/airflow/pull/7370
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
mik-laj merged pull request #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
mik-laj commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418#issuecomment-586137010
 
 
   Thanks! I'll merge it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] boring-cyborg[bot] commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
boring-cyborg[bot] commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418#issuecomment-586137058
 
 
   Awesome work, congrats on your first merged pull request!
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] davidshopback commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
davidshopback commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418#issuecomment-586135776
 
 
   Thanks for taking a look!
   Sorry for the surprise; it's an aggressive security filter, as the product is country-specific.
   I've asked the security team to relax the filters a tad, so it might work 
for PL now.
   
   We have the following websites in these countries:
   Singapore: https://www.shopback.sg/
   Malaysia: https://www.shopback.my/
   Philippines: https://www.shopback.ph/
   Australia: https://www.shopback.com.au/
   Indonesia: https://www.shopback.co.id/
   Taiwan: https://www.shopback.com.tw/
   Thailand: https://www.shopback.co.th/
   Vietnam: https://www.goshopback.vn/
   
   Corporate?: https://corporate.shopback.com/
   Play store: 
https://play.google.com/store/apps/details?id=com.shopback.app&hl=en_US
   App Store: 
https://apps.apple.com/sg/app/shopback-cashback-coupons/id1086505626
   
   Please let me know how I can help.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
mik-laj commented on issue #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418#issuecomment-586129084
 
 
   https://user-images.githubusercontent.com/12058428/74509438-c6f92600-4f01-11ea-9965-837d9415798b.png
   A surprising message. Could I ask someone to check what is on this website?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #7415: [AIRFLOW-6800] Close file object after parsing ssh config

2020-02-13 Thread GitBox
mik-laj commented on issue #7415: [AIRFLOW-6800] Close file object after 
parsing ssh config
URL: https://github.com/apache/airflow/pull/7415#issuecomment-586128226
 
 
   Travis is sad. I restarted a failed job.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] davidshopback opened a new pull request #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user

2020-02-13 Thread GitBox
davidshopback opened a new pull request #7418: [AIRFLOW-XXXX] Add ShopBack as an Airflow user
URL: https://github.com/apache/airflow/pull/7418
 
 
   Adding ShopBack to the Airflow user list in README.md
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   Document only change, no JIRA issue
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-6803) Add ShopBack as an Airflow user

2020-02-13 Thread David Chua (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chua closed AIRFLOW-6803.
---
Resolution: Invalid

> Add ShopBack as an Airflow user
> ---
>
> Key: AIRFLOW-6803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6803
> Project: Apache Airflow
>  Issue Type: Task
>  Components: project-management
>Affects Versions: 2.0.0
>Reporter: David Chua
>Priority: Trivial
>
> -Add ShopBack as an Airflow user-
> Sorry, I should read through the documentation more carefully 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6803) Add ShopBack as an Airflow user

2020-02-13 Thread David Chua (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chua updated AIRFLOW-6803:

Description: 
-Add ShopBack as an Airflow user-

Sorry, I should read through the documentation more carefully 

  was:Add ShopBack as an Airflow user


> Add ShopBack as an Airflow user
> ---
>
> Key: AIRFLOW-6803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6803
> Project: Apache Airflow
>  Issue Type: Task
>  Components: project-management
>Affects Versions: 2.0.0
>Reporter: David Chua
>Priority: Trivial
>
> -Add ShopBack as an Airflow user-
> Sorry, I should read through the documentation more carefully 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6803) Add ShopBack as an Airflow user

2020-02-13 Thread David Chua (Jira)
David Chua created AIRFLOW-6803:
---

 Summary: Add ShopBack as an Airflow user
 Key: AIRFLOW-6803
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6803
 Project: Apache Airflow
  Issue Type: Task
  Components: project-management
Affects Versions: 2.0.0
Reporter: David Chua


Add ShopBack as an Airflow user



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] davidshopback closed pull request #7417: Add ShopBack as official user of airflow

2020-02-13 Thread GitBox
davidshopback closed pull request #7417: Add ShopBack as official user of 
airflow
URL: https://github.com/apache/airflow/pull/7417
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #7416: [AIRFLOW-6802] honor dag.max_active_run in scheduler

2020-02-13 Thread GitBox
codecov-io commented on issue #7416: [AIRFLOW-6802] honor dag.max_active_run in 
scheduler
URL: https://github.com/apache/airflow/pull/7416#issuecomment-586112833
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=h1) 
Report
   > Merging 
[#7416](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/967930c0cb6e2293f2a49e5c9add5aa1917f3527?src=pr=desc)
 will **decrease** coverage by `0.27%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7416/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7416      +/-   ##
   ==========================================
   - Coverage   86.61%   86.33%   -0.28%
   ==========================================
     Files         873      874       +1
     Lines       40757    40921     +164
   ==========================================
   + Hits        35300    35329      +29
   - Misses       5457     5592     +135
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=)
 | `89.65% <100%> (+0.3%)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `47.18% <0%> (-45.08%)` | :arrow_down: |
   | 
[...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==)
 | `69.38% <0%> (-25.52%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `65.38% <0%> (-6.36%)` | :arrow_down: |
   | 
[airflow/stats.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9zdGF0cy5weQ==)
 | `85.29% <0%> (-5.19%)` | :arrow_down: |
   | 
[airflow/www/api/experimental/endpoints.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBpL2V4cGVyaW1lbnRhbC9lbmRwb2ludHMucHk=)
 | `89.81% <0%> (ø)` | :arrow_up: |
   | 
[...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5)
 | `92.07% <0%> (ø)` | :arrow_up: |
   | ... and [3 
more](https://codecov.io/gh/apache/airflow/pull/7416/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=footer). 
Last update 
[967930c...d3b5cc3](https://codecov.io/gh/apache/airflow/pull/7416?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] davidshopback opened a new pull request #7417: Add ShopBack as official user of airflow

2020-02-13 Thread GitBox
davidshopback opened a new pull request #7417: Add ShopBack as official user of 
airflow
URL: https://github.com/apache/airflow/pull/7417
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [ ] Description above provides context of the change
   - [ ] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] boring-cyborg[bot] commented on issue #7417: Add ShopBack as official user of airflow

2020-02-13 Thread GitBox
boring-cyborg[bot] commented on issue #7417: Add ShopBack as official user of 
airflow
URL: https://github.com/apache/airflow/pull/7417#issuecomment-586112668
 
 
   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using the [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it’s a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   Apache Airflow is a community-driven project and together we are making it better.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #7395: [AIRFLOW-5629] Implement K8s priorityClassName in KubernetesPo…

2020-02-13 Thread GitBox
codecov-io edited a comment on issue #7395: [AIRFLOW-5629] Implement K8s 
priorityClassName in KubernetesPo…
URL: https://github.com/apache/airflow/pull/7395#issuecomment-585611569
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=h1) 
Report
   > Merging 
[#7395](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/58c3542ed25061320ce61dbe0adf451a44c738dd?src=pr=desc)
 will **decrease** coverage by `0.02%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7395/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7395      +/-   ##
   ==========================================
   - Coverage   86.53%   86.51%   -0.03%
   ==========================================
     Files         874      874
     Lines       40868    40922     +54
   ==========================================
   + Hits        35365    35402     +37
   - Misses       5503     5520     +17
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==)
 | `94.94% <100%> (+0.05%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_generator.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9nZW5lcmF0b3IucHk=)
 | `96.51% <100%> (+0.01%)` | :arrow_up: |
   | 
[airflow/stats.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9zdGF0cy5weQ==)
 | `85.29% <0%> (-5.19%)` | :arrow_down: |
   | 
[airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==)
 | `90.43% <0%> (-1.16%)` | :arrow_down: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `87.93% <0%> (-0.2%)` | :arrow_down: |
   | 
[airflow/www/api/experimental/endpoints.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBpL2V4cGVyaW1lbnRhbC9lbmRwb2ludHMucHk=)
 | `89.81% <0%> (ø)` | :arrow_up: |
   | 
[...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5)
 | `92.07% <0%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=footer). 
Last update 
[58c3542...a6dea34](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #7395: [AIRFLOW-5629] Implement K8s priorityClassName in KubernetesPo…

2020-02-13 Thread GitBox
codecov-io edited a comment on issue #7395: [AIRFLOW-5629] Implement K8s 
priorityClassName in KubernetesPo…
URL: https://github.com/apache/airflow/pull/7395#issuecomment-585611569
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=h1) 
Report
   > Merging 
[#7395](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/58c3542ed25061320ce61dbe0adf451a44c738dd?src=pr=desc)
 will **decrease** coverage by `0.02%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7395/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7395      +/-   ##
   ==========================================
   - Coverage   86.53%   86.51%   -0.03%
   ==========================================
     Files         874      874
     Lines       40868    40922     +54
   ==========================================
   + Hits        35365    35402     +37
   - Misses       5503     5520     +17
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==)
 | `94.94% <100%> (+0.05%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_generator.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9nZW5lcmF0b3IucHk=)
 | `96.51% <100%> (+0.01%)` | :arrow_up: |
   | 
[airflow/stats.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9zdGF0cy5weQ==)
 | `85.29% <0%> (-5.19%)` | :arrow_down: |
   | 
[airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==)
 | `90.43% <0%> (-1.16%)` | :arrow_down: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `87.93% <0%> (-0.2%)` | :arrow_down: |
   | 
[airflow/www/api/experimental/endpoints.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBpL2V4cGVyaW1lbnRhbC9lbmRwb2ludHMucHk=)
 | `89.81% <0%> (ø)` | :arrow_up: |
   | 
[...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7395/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5)
 | `92.07% <0%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=footer). 
Last update 
[58c3542...a6dea34](https://codecov.io/gh/apache/airflow/pull/7395?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6802) scheduler not honoring dag.max_active_run config

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036694#comment-17036694
 ] 

ASF GitHub Bot commented on AIRFLOW-6802:
-

houqp commented on pull request #7416: [AIRFLOW-6802] honor dag.max_active_run 
in scheduler
URL: https://github.com/apache/airflow/pull/7416
 
 
   commit 50efda5c69c1ddfaa869b408540182fb19f1a286 introduced a bug that
   prevents the scheduler from enforcing max active run config for all
   DAGs.
   
   This commit fixes the regression as well as the test.
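   
   As an illustration, the invariant being restored can be pictured with a small standalone sketch (names are illustrative, not the scheduler's actual code):
   
   ```python
   from dataclasses import dataclass

   @dataclass
   class DagInfo:
       dag_id: str
       max_active_runs: int

   def dags_eligible_for_new_run(dags, active_run_counts):
       """Yield DAG ids still below their max_active_runs limit, i.e. the
       per-DAG guard the scheduler must apply before creating a new DagRun."""
       for dag in dags:
           if active_run_counts.get(dag.dag_id, 0) < dag.max_active_runs:
               yield dag.dag_id

   # dag_b already has as many active runs as allowed, so only dag_a qualifies.
   dags = [DagInfo("dag_a", max_active_runs=2), DagInfo("dag_b", max_active_runs=1)]
   print(list(dags_eligible_for_new_run(dags, {"dag_a": 1, "dag_b": 1})))  # ['dag_a']
   ```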
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> scheduler not honoring dag.max_active_run config
> 
>
> Key: AIRFLOW-6802
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6802
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.0.0
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Critical
>
> commit 
> https://github.com/apache/airflow/commit/50efda5c69c1ddfaa869b408540182fb19f1a286
>  introduced a bug that prevents the scheduler from enforcing max active run 
> config for all DAGs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] houqp opened a new pull request #7416: [AIRFLOW-6802] honor dag.max_active_run in scheduler

2020-02-13 Thread GitBox
houqp opened a new pull request #7416: [AIRFLOW-6802] honor dag.max_active_run 
in scheduler
URL: https://github.com/apache/airflow/pull/7416
 
 
   commit 50efda5c69c1ddfaa869b408540182fb19f1a286 introduced a bug that
   prevents the scheduler from enforcing max active run config for all
   DAGs.
   
   This commit fixes the regression as well as the test.
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-6802) scheduler not honoring dag.max_active_run config

2020-02-13 Thread QP Hou (Jira)
QP Hou created AIRFLOW-6802:
---

 Summary: scheduler not honoring dag.max_active_run config
 Key: AIRFLOW-6802
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6802
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.0
Reporter: QP Hou
Assignee: QP Hou


commit 
https://github.com/apache/airflow/commit/50efda5c69c1ddfaa869b408540182fb19f1a286
 introduced a bug that prevents the scheduler from enforcing max active run 
config for all DAGs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586067049
 
 
   High-level feedback since this was a topic earlier:
- The extra linting for the Dockerfile, which comes down to fairly trivial things like using cd in a command for one layer and using curl instead of wget, has led to several additional errors that are frustrating to deal with. If I knew the project perfectly, sure, I might have known these in advance, but for a new contributor who gets these (somewhat subjective) errors and then has to figure them out and fix them, it's an extra pain that simply doesn't need to exist.
- Given the long duration of a PR, the addition of new linters / tests makes it even more confusing.
   
   I'm happy to keep providing this feedback, and if you don't want it let me 
know!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kuromt commented on issue #7365: [AIRFLOW-5221] add host_aliases to KubernetesPodOperator

2020-02-13 Thread GitBox
kuromt commented on issue #7365: [AIRFLOW-5221] add host_aliases to 
KubernetesPodOperator
URL: https://github.com/apache/airflow/pull/7365#issuecomment-586053550
 
 
   ```
   No output has been received in the last 10m0s, this potentially indicates a 
stalled build or something wrong with the build itself.
   Check the details on how to adjust your build configuration on: 
https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
   The build has been terminated
   ```
   
   @ashb I merged your request, but it seems that Travis has a problem.
   Could you retry the build job?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] yuqian90 opened a new pull request #7415: [AIRFLOW-6800] Close file object after parsing ssh config

2020-02-13 Thread GitBox
yuqian90 opened a new pull request #7415: [AIRFLOW-6800] Close file object 
after parsing ssh config
URL: https://github.com/apache/airflow/pull/7415
 
 
   A line of code in ssh/hooks.py opens a file object without closing it in the scope where it is opened.
   This causes unnecessary noise, e.g. when py.test warns about unclosed file objects. This PR fixes it by making sure the file object is closed properly.

   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6800) SSHHook: Close file object when reading ssh config

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036617#comment-17036617
 ] 

ASF GitHub Bot commented on AIRFLOW-6800:
-

yuqian90 commented on pull request #7415: [AIRFLOW-6800] Close file object 
after parsing ssh config
URL: https://github.com/apache/airflow/pull/7415
 
 
   A line of code in ssh/hooks.py opens a file object without closing it in the scope where it is opened.
   This causes unnecessary noise, e.g. when py.test warns about unclosed file objects. This PR fixes it by making sure the file object is closed properly.

   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> SSHHook: Close file object when reading ssh config
> --
>
> Key: AIRFLOW-6800
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6800
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks
>Affects Versions: 1.10.9
>Reporter: Qian Yu
>Assignee: Qian Yu
>Priority: Trivial
>
> This line of code in ssh/hooks.py is opening a file object without closing it 
> in the scope where it is opened.
> This causes unnecessary noise e.g. when py.test warns about unclosed file 
> objects:
> {code}
> ResourceWarning: unclosed file <_io.TextIOWrapper 
> name='/home/test/.ssh/config' mode='r' encoding='UTF-8'>
> Exception ignored in: <_io.FileIO name='/home/test/.ssh/config' mode='rb' 
> closefd=True>
> {code}
> Using context manager fixes this easily.
> {code}
> ssh_conf = paramiko.SSHConfig()
> ssh_conf.parse(open(user_ssh_config_filename))
> host_info = ssh_conf.lookup(self.remote_host)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6800) SSHHook: Close file object when reading ssh config

2020-02-13 Thread Qian Yu (Jira)
Qian Yu created AIRFLOW-6800:


 Summary: SSHHook: Close file object when reading ssh config
 Key: AIRFLOW-6800
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6800
 Project: Apache Airflow
  Issue Type: Improvement
  Components: hooks
Affects Versions: 1.10.9
Reporter: Qian Yu
Assignee: Qian Yu


This line of code in ssh/hooks.py is opening a file object without closing it 
in the scope where it is opened.
This causes unnecessary noise e.g. when py.test warns about unclosed file 
objects:
{code}
ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/test/.ssh/config' 
mode='r' encoding='UTF-8'>
Exception ignored in: <_io.FileIO name='/home/test/.ssh/config' mode='rb' 
closefd=True>
{code}

Using context manager fixes this easily.

{code}
ssh_conf = paramiko.SSHConfig()
ssh_conf.parse(open(user_ssh_config_filename))
host_info = ssh_conf.lookup(self.remote_host)
{code}
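
A context-manager version of the snippet above might look like this (a sketch based on the quoted code; {{user_ssh_config_filename}} and {{self.remote_host}} come from the surrounding hook):

{code}
ssh_conf = paramiko.SSHConfig()
with open(user_ssh_config_filename) as ssh_config_file:
    # parse() accepts a file-like object; the file is closed when the block exits
    ssh_conf.parse(ssh_config_file)
host_info = ssh_conf.lookup(self.remote_host)
{code}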



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6801) Make use of timestamp field in ImportError object

2020-02-13 Thread Chris McLennon (Jira)
Chris McLennon created AIRFLOW-6801:
---

 Summary: Make use of timestamp field in ImportError object
 Key: AIRFLOW-6801
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6801
 Project: Apache Airflow
  Issue Type: Improvement
  Components: models
Affects Versions: 1.10.7
Reporter: Chris McLennon


The ImportError object has a `timestamp` field that goes unused. I believe this field is intended to record when an ImportError row was inserted into the table(?). When we add a row to the import_error table, it seems we should therefore set the `timestamp` field to the current time.
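
A minimal sketch of the suggested behaviour, assuming an SQLAlchemy model shaped loosely like the one in Airflow (the class and column names here are illustrative):

{code}
import datetime

from sqlalchemy import Column, DateTime, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class DagImportError(Base):  # hypothetical stand-in for the ImportError model
    __tablename__ = "import_error"
    id = Column(Integer, primary_key=True)
    filename = Column(String(1024))
    stacktrace = Column(Text)
    timestamp = Column(DateTime)  # the field that currently goes unused

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(DagImportError(
        filename="dags/broken_dag.py",
        stacktrace="ModuleNotFoundError: No module named 'foo'",
        timestamp=datetime.datetime.utcnow(),  # record insertion time
    ))
    session.commit()
{code}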



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] codecov-io edited a comment on issue #6652: [AIRFLOW-5548] [AIRFLOW-5550] REST API enhancement - dag info, task …

2020-02-13 Thread GitBox
codecov-io edited a comment on issue #6652: [AIRFLOW-5548] [AIRFLOW-5550] REST 
API enhancement - dag info, task …
URL: https://github.com/apache/airflow/pull/6652#issuecomment-558277657
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=h1) 
Report
   > Merging 
[#6652](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/67463c3d8e5fe1618117244364d8a49f80536820?src=pr=desc)
 will **decrease** coverage by `0.31%`.
   > The diff coverage is `33.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6652/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#6652  +/-   ##
   ==
   - Coverage   86.63%   86.31%   -0.32% 
   ==
 Files 874  877   +3 
 Lines   4086841020 +152 
   ==
   + Hits3540535406   +1 
   - Misses   5463 5614 +151
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/www/api/experimental/endpoints.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBpL2V4cGVyaW1lbnRhbC9lbmRwb2ludHMucHk=)
 | `75.54% <23.72%> (-14.27%)` | :arrow_down: |
   | 
[airflow/api/common/experimental/get\_task.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9nZXRfdGFzay5weQ==)
 | `50% <28.57%> (-50%)` | :arrow_down: |
   | 
[airflow/api/common/experimental/get\_tasks.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9nZXRfdGFza3MucHk=)
 | `50% <50%> (ø)` | |
   | 
[airflow/api/common/experimental/get\_dags.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9nZXRfZGFncy5weQ==)
 | `58.33% <58.33%> (ø)` | |
   | 
[airflow/api/common/experimental/get\_dag.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9nZXRfZGFnLnB5)
 | `66.66% <66.66%> (ø)` | |
   | 
[...w/providers/apache/hive/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvb3BlcmF0b3JzL215c3FsX3RvX2hpdmUucHk=)
 | `35.84% <0%> (-64.16%)` | :arrow_down: |
   | 
[airflow/security/kerberos.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9zZWN1cml0eS9rZXJiZXJvcy5weQ==)
 | `30.43% <0%> (-45.66%)` | :arrow_down: |
   | 
[airflow/providers/mysql/operators/mysql.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvbXlzcWwvb3BlcmF0b3JzL215c3FsLnB5)
 | `55% <0%> (-45%)` | :arrow_down: |
   | 
[airflow/stats.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9zdGF0cy5weQ==)
 | `85.29% <0%> (-5.19%)` | :arrow_down: |
   | 
[airflow/providers/apache/hive/hooks/hive.py](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvaG9va3MvaGl2ZS5weQ==)
 | `76.02% <0%> (-1.54%)` | :arrow_down: |
   | ... and [5 
more](https://codecov.io/gh/apache/airflow/pull/6652/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=footer). 
Last update 
[67463c3...1b198e3](https://codecov.io/gh/apache/airflow/pull/6652?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-6685) Add ThresholdCheckOperator for Data Quality Checking

2020-02-13 Thread alex l (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alex l updated AIRFLOW-6685:

Description: 
This PR includes a new operator in *{{CheckOperator}}* that allows users to 
perform a threshold data quality check.

*{{ThresholdCheckOperator}}* will check a single-value SQL result against a threshold range and will fail the task if it is outside this range. The lower and upper bounds of the threshold can be defined either as numeric values or as SQL statements that return a numeric value.

  was:
This PR includes a new operator in `CheckOperator` that allows users to perform 
a threshold data quality check.

`ThresholdCheckOperator` will check a single-value SQL result against a threshold range and will fail the task if it is outside this range. The lower and upper bounds of the threshold can be defined either as numeric values or as SQL statements that return a numeric value.


> Add ThresholdCheckOperator for Data Quality Checking 
> -
>
> Key: AIRFLOW-6685
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6685
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: alex l
>Assignee: alex l
>Priority: Major
>
> This PR includes a new operator in *{{CheckOperator}}* that allows users to 
> perform a threshold data quality check.
> *{{ThresholdCheckOperator}}* will check a single-value SQL result against a 
> threshold range and will fail the task if it is outside this range. The lower 
> and upper bounds of the threshold can be defined either as numeric values or 
> as SQL statements that return a numeric value.
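
The check being described can be sketched standalone as follows (all names are illustrative; this is not the operator's actual implementation):

{code}
import sqlite3

def resolve_bound(conn, bound):
    # A bound may be a plain number or a SQL statement returning one value.
    if isinstance(bound, (int, float)):
        return bound
    return conn.execute(bound).fetchone()[0]

def threshold_check(conn, sql, min_threshold, max_threshold):
    value = conn.execute(sql).fetchone()[0]
    lower = resolve_bound(conn, min_threshold)
    upper = resolve_bound(conn, max_threshold)
    if not lower <= value <= upper:
        raise ValueError(f"{value} is outside the threshold range [{lower}, {upper}]")
    return value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(10,), (20,), (30,)])

# Numeric lower bound, SQL-evaluated upper bound; AVG(x) = 20 passes.
threshold_check(conn, "SELECT AVG(x) FROM t", 5, "SELECT MAX(x) FROM t")
{code}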



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6685) Add ThresholdCheckOperator for Data Quality Checking

2020-02-13 Thread alex l (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alex l updated AIRFLOW-6685:

Summary: Add ThresholdCheckOperator for Data Quality Checking   (was: Add 
Data Quality Operators )

> Add ThresholdCheckOperator for Data Quality Checking 
> -
>
> Key: AIRFLOW-6685
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6685
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: alex l
>Assignee: alex l
>Priority: Major
>
> This PR includes a new operator in `CheckOperator` that allows users to 
> perform a threshold data quality check.
> `ThresholdCheckOperator` will check a single-value SQL result against a 
> threshold range and will fail the task if it is outside this range. The lower 
> and upper bounds of the threshold can be defined either as numeric values or 
> as SQL statements that return a numeric value.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6685) Add Data Quality Operators

2020-02-13 Thread alex l (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alex l updated AIRFLOW-6685:

Description: 
This PR includes a new operator in `CheckOperator` that allows users to perform 
a threshold data quality check.

`ThresholdCheckOperator` will check a single-value SQL result against a threshold range and will fail the task if it is outside this range. The lower and upper bounds of the threshold can be defined either as numeric values or as SQL statements that return a numeric value.

  was:
Add Data Quality Operators to improve data quality testing on data 
workflows/pipelines. This includes 3 operators:
 * BaseDataQualityOperator
 ** contains shared attributes and methods that data quality check operators 
utilize
 ** a base class that can be used to create other dq operators
 * DataQualityThresholdCheckOperator
 ** will check a single-value SQL result against a threshold range, and will fail the task if it is outside this range.
 * DataQualityThresholdSQLCheckOperator
 ** Similar to DataQualityThresholdCheckOperator, but thresholds are SQL-evaluated values, for dynamic threshold ranging.


> Add Data Quality Operators 
> ---
>
> Key: AIRFLOW-6685
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6685
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: alex l
>Assignee: alex l
>Priority: Major
>
> This PR includes a new operator in `CheckOperator` that allows users to 
> perform a threshold data quality check.
> `ThresholdCheckOperator` will check a single-value SQL result against a 
> threshold range and will fail the task if it is outside this range. The lower 
> and upper bounds of the threshold can be defined as either numeric values, 
> or SQL statements that return a numeric value.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586039637
 
 
   okay I'm going to assume that this "no pragma" rule is new and just remove the 
lines from the top of those files. I also ran the Docker lint locally with the 
same linting configuration and container, and it seems to produce no output 
after some fixes, so I'm hoping this works. I don't necessarily think the 
changes are better - it didn't let me use cd in a multi-line statement, and 
instead I had to artificially add a WORKDIR in between, so now there are 
two layers instead of just one. But I guess the linting passes, lol!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586036255
 
 
   What are these new errors about encoding / pragma? I literally didn't 
change any of these files:
   ```
   
   Fix python encoding pragma..............................................Failed
   - hook id: fix-encoding-pragma
   - duration: 0.14s
   - exit code: 1
   - files were modified by this hook

   Removed encoding pragma from airflow/contrib/operators/singularity_operator.py
   Removed encoding pragma from airflow/contrib/example_dags/example_singularity_operator.py
   Removed encoding pragma from tests/contrib/operators/test_singularity_operator.py
   ```
   The Dockerlint errors are a bit much, but I'll jump through the hoops :/
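
   (For context, the pragma in question is the Python 2 era source-encoding 
marker; a sketch of the header line the hook deletes, assuming the repo now 
runs fix-encoding-pragma in remove mode since the codebase is Python 3 only:)
   ```python
   # -*- coding: utf-8 -*-  # fix-encoding-pragma strips this line from each file
   ```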


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] simis2626 commented on issue #7395: [AIRFLOW-5629] Implement K8s priorityClassName in KubernetesPo…

2020-02-13 Thread GitBox
simis2626 commented on issue #7395: [AIRFLOW-5629] Implement K8s 
priorityClassName in KubernetesPo…
URL: https://github.com/apache/airflow/pull/7395#issuecomment-586034102
 
 
   Thank you @potiuk, appreciate the diagnosis!
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586021498
 
 
   okay, so to update everyone following (and future me reading back to 
remember this): we are first trying a simple approach to install Singularity in 
the base container, and then the tests will run for all the various 
environments without a special runtime. The reason is that Singularity is 
not a service (like Docker has a daemon) but rather just a binary that can 
create instances / containers / otherwise. This first test will just be to see 
if singularity is detected in the host container when the executors run, 
granted that the resolved merge for setup.py didn't result in linting errors or 
what not.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-1947) airflow json file created in /tmp gets wrong permission when using run_as_user

2020-02-13 Thread Soeren Laursen (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soeren Laursen closed AIRFLOW-1947.
---
Resolution: Fixed

> airflow json file created in /tmp gets wrong permission when using run_as_user
> 
>
> Key: AIRFLOW-1947
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1947
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.8.0
> Environment: ubuntu 16.04 LTS
>Reporter: Soeren Laursen
>Priority: Critical
>
> We are using run_as_user on two specific tasks, to make sure that the 
> resulting files are assigned to the correct user.
> If we are running the task as the Airflow user, the task gets done as expected.
> *DAG START*
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2015, 6, 1),
>     'email': ['s...@fcoo.dk'],
>     'email_on_failure': False,
>     'email_on_retry': False,
>     'retries': 1,
>     'retry_delay': timedelta(minutes=5),
>     'queue': 'storage-arch03',
>     'dagrun_timeout': timedelta(minutes=60)
>     # 'pool': 'backfill',
>     # 'priority_weight': 10,
>     # 'end_date': datetime(2016, 1, 1),
> }
> dag = DAG('Archive_Sentinel-1_data_from_FCOO_ftp_server',
>           default_args=default_args, schedule_interval=timedelta(1))
> archivingTodaysData = BashOperator(
>     task_id='Archive_todays_data',
>     bash_command='/home/airflow/airflowScripts/archive-Sentinel-1-data.sh 0 ',
>     dag=dag)
> archivingYesterdaysData = BashOperator(
>     task_id='Archive_yesterdays_data',
>     bash_command='/home/airflow/airflowScripts/archive-Sentinel-1-data.sh 1 ',
>     dag=dag)
> # First archive the newest data, then the data from yesterday.
> archivingYesterdaysData.set_upstream(archivingTodaysData)
> *DAG END*
> When we run the task with a user called prod by using run_as_user, the 
> file(s) are generated in /tmp:
> -rw-------  1 airflow airflow 2205 dec 19 11:46 tmpicu87_au
> But the prod user cannot read the file. From the log file we have:
> [2017-12-19 11:46:31,803] {base_task_runner.py:112} INFO - Running: ['bash', 
> '-c', 'sudo -H -u prod airflow run 
> Archive_Sentinel-1_data_from_FCOO_ftp_server Archive_yesterdays_data 
> 2017-12-19T00:00:00 --job_id 1047 --raw -sd 
> DAGS_FOLDER/archive-Sentinel-1-data-from-ftp-server.py --cfg_path 
> /tmp/tmpicu87_au']
> [2017-12-19 11:46:32,463] {base_task_runner.py:95} INFO - Subtask: 
> [2017-12-19 11:46:32,462] {__init__.py:57} INFO - Using executor 
> SequentialExecutor
> [2017-12-19 11:46:32,587] {base_task_runner.py:95} INFO - Subtask: 
> [2017-12-19 11:46:32,587] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python3.5/lib2to3/Grammar.txt
> [2017-12-19 11:46:32,630] {base_task_runner.py:95} INFO - Subtask: 
> [2017-12-19 11:46:32,630] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
> [2017-12-19 11:46:33,124] {base_task_runner.py:95} INFO - Subtask: 
> /usr/local/lib/python3.5/dist-packages/airflow/www/app.py:23: 
> FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to 
> "CSRFProtect" and will be removed in 1.0.
> [2017-12-19 11:46:33,124] {base_task_runner.py:95} INFO - Subtask:   csrf = 
> CsrfProtect()
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask: Traceback 
> (most recent call last):
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask:   File 
> "/usr/local/bin/airflow", line 28, in 
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask: 
> args.func(args)
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask:   File 
> "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 329, in run
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask: with 
> open(args.cfg_path, 'r') as conf_file:
> [2017-12-19 11:46:33,344] {base_task_runner.py:95} INFO - Subtask: 
> PermissionError: [Errno 13] Permission denied: '/tmp/tmpicu87_au'
> [2017-12-19 11:46:36,770] {jobs.py:2125} INFO - Task exited with return code 1
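
A minimal sketch of the underlying failure mode (assuming the --cfg_path file 
is created with Python's tempfile module, whose default file mode is 0600, so a 
task re-run as another user via sudo cannot read it):

{code:python}
import os
import tempfile

# The parent airflow process writes the temporary cfg file:
fd, path = tempfile.mkstemp(dir="/tmp")
os.close(fd)
print(oct(os.stat(path).st_mode & 0o777))  # 0o600: only the creating user can read it

# `sudo -H -u prod airflow run ... --cfg_path /tmp/tmp...` then fails with
# PermissionError; one possible fix is relaxing the mode before handing it over:
os.chmod(path, 0o644)
{code}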



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6799) webgui cannot display all tasks

2020-02-13 Thread Soeren Laursen (Jira)
Soeren Laursen created AIRFLOW-6799:
---

 Summary: webgui cannot display all tasks
 Key: AIRFLOW-6799
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6799
 Project: Apache Airflow
  Issue Type: Bug
  Components: webserver
Affects Versions: 1.10.9, 1.10.7
 Environment: linux in a docker container.
Reporter: Soeren Laursen


When we have "too many" tasks, the graph rendering stops with an 
"Edge 'undefined' is not in graph" JavaScript error.

There is no graph in the webgui. Lowering the number of tasks will enable 
the rendering again.

Example code:
from airflow.models import DAG
from airflow.operators.dummy_operator import DummyOperator

from datetime import datetime
from datetime import timedelta

DAG_task_concurrency = 30
DAG_max_active_runs = 10

MAIN_DAG_ID = 'BUG_IN_GRAPH_DISPLAY'

default_args = {
    'owner': 'prod',
    'depends_on_past': False,
    'email': ['s...@fcoo.dk'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 3,
    'retry_delay': timedelta(seconds=30),
    'queue': 'default'}

BUG_DAG = DAG(MAIN_DAG_ID,
              default_args=default_args,
              catchup=False,
              orientation='LR',
              concurrency=DAG_task_concurrency,
              schedule_interval='@once',
              max_active_runs=DAG_max_active_runs,
              start_date=datetime(2020, 2, 5))

# Too many tasks
max_task = 160
# 156 ok

task_range = list(range(0, max_task + 1))

start_task = DummyOperator(task_id='start_task', dag=BUG_DAG)
after_all_complete = DummyOperator(task_id='after_all_complete', dag=BUG_DAG)

for task_step in task_range:
    task1 = DummyOperator(task_id='task_{0}'.format(task_step), dag=BUG_DAG)
    start_task >> task1 >> after_all_complete



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6789) Concurrency parameter does not work in the latest stable version 1.10.9

2020-02-13 Thread Denis Scheglow (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Scheglow updated AIRFLOW-6789:

Component/s: worker

> Concurrency parameter does not work in the latest stable version 1.10.9
> ---
>
> Key: AIRFLOW-6789
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6789
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery, worker
>Affects Versions: 1.10.8, 1.10.9
>Reporter: Denis Scheglow
>Priority: Major
> Fix For: 1.10.10
>
>
> Latest 
> [commit|https://github.com/apache/airflow/commit/9b5dbf9886df3977b24c95cadb99bbece9f8ce4b#diff-bbf16e7665ac448883f2ceeb40db35cdR444]
>  in airflow 1.10.8 (1.10.9)
>  does not allow changing worker concurrency through the CLI interface



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] potiuk commented on issue #7389: [AIRFLOW-6763] Make systems tests ready for backport tests

2020-02-13 Thread GitBox
potiuk commented on issue #7389: [AIRFLOW-6763] Make systems tests ready for 
backport tests
URL: https://github.com/apache/airflow/pull/7389#issuecomment-585978655
 
 
   Hey @ashb @mik-laj @nuclearpinguin  -> all green for the system tests ready 
for backporting. BTW, I uploaded the docs and added a "How to write system tests" 
doc...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7401: [AIRFLOW-6778] Add a configurable DAGs volume mount path for Kubernetes

2020-02-13 Thread GitBox
potiuk commented on issue #7401: [AIRFLOW-6778] Add a configurable DAGs volume 
mount path for Kubernetes
URL: https://github.com/apache/airflow/pull/7401#issuecomment-585977885
 
 
   Yeah, we have timeouts in various places - issue raised with Travis. Hopefully 
we can complete the final step of moving to GitHub Actions, as we have not much 
control over Travis.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #7389: [AIRFLOW-6763] Make systems tests ready for backport tests

2020-02-13 Thread GitBox
codecov-io edited a comment on issue #7389: [AIRFLOW-6763] Make systems tests 
ready for backport tests
URL: https://github.com/apache/airflow/pull/7389#issuecomment-583898179
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=h1) 
Report
   > Merging 
[#7389](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/6b19889c0151b66202c2b1b38d99cbb54529b255?src=pr=desc)
 will **increase** coverage by `0.21%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7389/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7389      +/-   ##
   ==========================================
   + Coverage   86.44%   86.66%   +0.21%
   ==========================================
     Files         874      874
     Lines       40868    40868
   ==========================================
   + Hits        35330    35417     +87
   + Misses       5538     5451     -87
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/www/api/experimental/endpoints.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBpL2V4cGVyaW1lbnRhbC9lbmRwb2ludHMucHk=)
 | `89.81% <0%> (ø)` | :arrow_up: |
   | 
[...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5)
 | `92.07% <0%> (ø)` | :arrow_up: |
   | 
[airflow/providers/amazon/aws/hooks/s3.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYW1hem9uL2F3cy9ob29rcy9zMy5weQ==)
 | `95.85% <0%> (+0.13%)` | :arrow_up: |
   | 
[airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=)
 | `89.34% <0%> (+0.14%)` | :arrow_up: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `88.12% <0%> (+0.19%)` | :arrow_up: |
   | 
[airflow/models/dag.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnLnB5)
 | `91.16% <0%> (+0.25%)` | :arrow_up: |
   | 
[airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5)
 | `91.73% <0%> (+0.82%)` | :arrow_up: |
   | 
[airflow/providers/apache/hive/hooks/hive.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvaG9va3MvaGl2ZS5weQ==)
 | `77.55% <0%> (+1.53%)` | :arrow_up: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `91.43% <0%> (+3.42%)` | :arrow_up: |
   | 
[...roviders/amazon/aws/operators/s3\_delete\_objects.py](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYW1hem9uL2F3cy9vcGVyYXRvcnMvczNfZGVsZXRlX29iamVjdHMucHk=)
 | `100% <0%> (+9.52%)` | :arrow_up: |
   | ... and [3 
more](https://codecov.io/gh/apache/airflow/pull/7389/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=footer). 
Last update 
[6b19889...0c78373](https://codecov.io/gh/apache/airflow/pull/7389?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #7389: [AIRFLOW-6763] Make systems tests ready for backport tests

2020-02-13 Thread GitBox
potiuk commented on a change in pull request #7389: [AIRFLOW-6763] Make systems 
tests ready for backport tests
URL: https://github.com/apache/airflow/pull/7389#discussion_r379122368
 
 

 ##
 File path: BREEZE.rst
 ##
 @@ -611,7 +617,13 @@ This is the current syntax for  `./breeze <./breeze>`_:
   -S, --static-check 
   Run selected static checks for currently changed files. You should 
specify static check that
   you would like to run or 'all' to run all checks. One of
-  [ all all-but-pylint bat-tests check-apache-license 
check-executables-have-shebangs check-hooks-apply check-merge-conflict 
check-xml debug-statements doctoc detect-private-key end-of-file-fixer flake8 
forbid-tabs insert-license lint-dockerfile mixed-line-ending mypy pylint 
pylint-test setup-order shellcheck].
+
+  all all-but-pylint bat-tests check-apache-license
 
 Review comment:
   Resolved,


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #7389: [AIRFLOW-6763] Make systems tests ready for backport tests

2020-02-13 Thread GitBox
potiuk commented on a change in pull request #7389: [AIRFLOW-6763] Make systems 
tests ready for backport tests
URL: https://github.com/apache/airflow/pull/7389#discussion_r379122655
 
 

 ##
 File path: TESTING.rst
 ##
 @@ -31,8 +31,7 @@ Airflow Test Infrastructure
 
 * **System tests** are automatic tests that use external systems like
   Google Cloud Platform. These tests are intended for an end-to-end DAG 
execution.
-  Note that automated execution of these tests is still
-  `work in progress 
`_.
+  The tests can be executed on both.
 
 Review comment:
   Resolved.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] brandonwillard commented on issue #7401: [AIRFLOW-6778] Add a configurable DAGs volume mount path for Kubernetes

2020-02-13 Thread GitBox
brandonwillard commented on issue #7401: [AIRFLOW-6778] Add a configurable DAGs 
volume mount path for Kubernetes
URL: https://github.com/apache/airflow/pull/7401#issuecomment-585974726
 
 
   @potiuk, looks like the `[kubernetes][git]` test timed out again.  These 
errors are, in some sense, random, and not due to changes introduced 
by the PR?  I'm seeing the same thing in 
https://github.com/apache/airflow/pull/7405.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ashb commented on a change in pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds

2020-02-13 Thread GitBox
ashb commented on a change in pull request #7363: [AIRFLOW-6730] Use 
total_seconds instead of seconds
URL: https://github.com/apache/airflow/pull/7363#discussion_r379119387
 
 

 ##
 File path: tests/providers/google/cloud/operators/test_dataproc.py
 ##
 @@ -104,7 +104,7 @@
 "autoscaling_config": {"policy_uri": "autoscaling_policy"},
 "config_bucket": "storage_bucket",
 "initialization_actions": [
-{"executable_file": "init_actions_uris", "execution_timeout": 
"600s"}
+{"executable_file": "init_actions_uris", "execution_timeout": 
"600.0s"}
 
 Review comment:
   This is my only question -- does DataProc accept this, or does it only 
accept whole-integer seconds here?
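
   For context on where the trailing `.0` comes from, a quick illustration of 
`seconds` vs `total_seconds` (the substance of this PR):
   ```python
   from datetime import timedelta

   d = timedelta(minutes=10)
   print(d.seconds)          # 600   - only the seconds component of the delta
   print(d.total_seconds())  # 600.0 - the whole duration, as a float

   # the bug total_seconds fixes: .seconds silently drops the days part
   print(timedelta(days=1, seconds=30).seconds)          # 30
   print(timedelta(days=1, seconds=30).total_seconds())  # 86430.0
   ```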


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-2906) DataDog Integration for Airflow

2020-02-13 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2906.

Fix Version/s: 1.10.10
   Resolution: Fixed

I've marked this as a fix for 1.10.10, but to pick it in we'll need to:

1. Check this metaclass implementation works with Py2 (it won't as it stands, 
but six might help us out), and
2. Pull in the previous refactors of stats from airflow.settings into its own 
airflow.stats module.
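
A sketch of the Py2-compatible shape six gives us (Stats/StatsMeta below are 
illustrative of the metaclass pattern in question, not the exact code in the PR):

{code:python}
import six


class StatsMeta(type):
    # Illustrative metaclass: forward class-level attribute access to a client.
    def __getattr__(cls, name):
        return getattr(cls.client, name)


# The Py3-only spelling would be `class Stats(metaclass=StatsMeta)`; six makes
# the same declaration work on both interpreters.
@six.add_metaclass(StatsMeta)
class Stats(object):
    client = None  # set to a statsd or dogstatsd client when configured
{code}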

> DataDog Integration for Airflow
> ---
>
> Key: AIRFLOW-2906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: utils
>Affects Versions: 1.8.0
>Reporter: Austin Hsu
>Assignee: Chandu Kavar
>Priority: Minor
>  Labels: metrics
> Fix For: 1.10.10
>
>
> Add functionality to Airflow to enable sending of metrics to DataDog.  
> DataDog provides support for tags which allows us to aggregate data more 
> easily and visualize it.  We can utilize the [Datadog python 
> library|https://github.com/DataDog/datadogpy] python library and the [Datadog 
> ThreadStats 
> module|https://datadogpy.readthedocs.io/en/latest/#datadog-threadstats-module]
>  to send metrics directly to DataDog without needing to spin up an agent to 
> forward the metrics.  The current implementation in 1.8 uses the statsd 
> library to send the metrics which provides us with much less control to 
> filter our data.
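
As a rough sketch of the agentless submission path the description refers to 
(datadogpy's ThreadStats API; the keys are placeholders):

{code:python}
from datadog import initialize, ThreadStats

initialize(api_key="<DATADOG_API_KEY>", app_key="<DATADOG_APP_KEY>")

stats = ThreadStats()
stats.start()  # metrics are flushed from a background thread, no agent needed

# Tags are what plain statsd lacks: they let us slice metrics per env, DAG, etc.
stats.increment("airflow.scheduler_heartbeat", tags=["env:dev"])
{code}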



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2906) DataDog Integration for Airflow

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036499#comment-17036499
 ] 

ASF GitHub Bot commented on AIRFLOW-2906:
-

ashb commented on pull request #7376: [AIRFLOW-2906] Add datadog(dogstatsd) 
support to send airflow metrics
URL: https://github.com/apache/airflow/pull/7376
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> DataDog Integration for Airflow
> ---
>
> Key: AIRFLOW-2906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: utils
>Affects Versions: 1.8.0
>Reporter: Austin Hsu
>Assignee: Chandu Kavar
>Priority: Minor
>  Labels: metrics
>
> Add functionality to Airflow to enable sending of metrics to DataDog.  
> DataDog provides support for tags which allows us to aggregate data more 
> easily and visualize it.  We can utilize the [Datadog python 
> library|https://github.com/DataDog/datadogpy] python library and the [Datadog 
> ThreadStats 
> module|https://datadogpy.readthedocs.io/en/latest/#datadog-threadstats-module]
>  to send metrics directly to DataDog without needing to spin up an agent to 
> forward the metrics.  The current implementation in 1.8 uses the statsd 
> library to send the metrics which provides us with much less control to 
> filter our data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2906) DataDog Integration for Airflow

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036500#comment-17036500
 ] 

ASF subversion and git services commented on AIRFLOW-2906:
--

Commit ed2f3dc4ca28609bcead681f95b6e26e13d64c28 in airflow's branch 
refs/heads/master from chandu kavar
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=ed2f3dc ]

[AIRFLOW-2906] Add support for DataDog's dogstatsd when emitting metrics (#7376)



> DataDog Integration for Airflow
> ---
>
> Key: AIRFLOW-2906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: utils
>Affects Versions: 1.8.0
>Reporter: Austin Hsu
>Assignee: Chandu Kavar
>Priority: Minor
>  Labels: metrics
>
> Add functionality to Airflow to enable sending of metrics to DataDog.  
> DataDog provides support for tags which allows us to aggregate data more 
> easily and visualize it.  We can utilize the [Datadog python 
> library|https://github.com/DataDog/datadogpy] python library and the [Datadog 
> ThreadStats 
> module|https://datadogpy.readthedocs.io/en/latest/#datadog-threadstats-module]
>  to send metrics directly to DataDog without needing to spin up an agent to 
> forward the metrics.  The current implementation in 1.8 uses the statsd 
> library to send the metrics which provides us with much less control to 
> filter our data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb merged pull request #7376: [AIRFLOW-2906] Add datadog(dogstatsd) support to send airflow metrics

2020-02-13 Thread GitBox
ashb merged pull request #7376: [AIRFLOW-2906] Add datadog(dogstatsd) support 
to send airflow metrics
URL: https://github.com/apache/airflow/pull/7376
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Matthew Bruce (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Bruce updated AIRFLOW-6796:
---
Description: 
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:
{code:java}
/home/airflow/project-a/dags/dag-a.py
/home/airflow/project-b/dags/dag-b.py
{code}
 

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?

  was:
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:

```

{{/home/airflow/project-a/dags/dag-a.py}}
{{/home/airflow/project-b/dags/dag-b.py}}

```

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?


> Serialized DAGs can be incorrectly deleted
> --
>
> Key: AIRFLOW-6796
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Matthew Bruce
>Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
>  {{/home/airflow/dags/loader.py}}
> {code:python}
> dag_bags = []
> dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
> dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))
> for dag_bag in dag_bags:
>     for dag in dag_bag.dags.values():
>         globals()[dag.dag_id] = dag{code}
> with files:
> {code:java}
> /home/airflow/project-a/dags/dag-a.py
> /home/airflow/project-b/dags/dag-b.py
> {code}
>  
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Matthew Bruce (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Bruce updated AIRFLOW-6796:
---
Description: 
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:

```

{{/home/airflow/project-a/dags/dag-a.py}}
{{/home/airflow/project-b/dags/dag-b.py}}

```

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?

  was:
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:
{code:python}
 
{code}
{{/home/airflow/project-a/dags/dag-a.py}}
{code:python}
 
{code}
{{/home/airflow/project-b/dags/dag-b.py}}

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?


> Serialized DAGs can be incorrectly deleted
> --
>
> Key: AIRFLOW-6796
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Matthew Bruce
>Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
>  {{/home/airflow/dags/loader.py}}
> {code:python}
> dag_bags = []
> dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
> dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))
> for dag_bag in dag_bags:
>     for dag in dag_bag.dags.values():
>         globals()[dag.dag_id] = dag{code}
> with files:
> ```
> {{/home/airflow/project-a/dags/dag-a.py}}
> {{/home/airflow/project-b/dags/dag-b.py}}
> ```
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Matthew Bruce (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Bruce updated AIRFLOW-6796:
---
Description: 
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:
{code:python}
 
{code}
{{/home/airflow/project-a/dags/dag-a.py}}
{code:python}
 
{code}
{{/home/airflow/project-b/dags/dag-b.py}}

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?

  was:
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:
{code:python}

 
{code}
{{/home/airflow/project-a/dags/dag-a.py}}
{code:python}

 
{code}
{{/home/airflow/project-b/dags/dag-b.py}}

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?


> Serialized DAGs can be incorrectly deleted
> --
>
> Key: AIRFLOW-6796
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Matthew Bruce
>Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
>  {{/home/airflow/dags/loader.py}}
> {code:python}
> dag_bags = []
> dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
> dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))
> for dag_bag in dag_bags:
>     for dag in dag_bag.dags.values():
>         globals()[dag.dag_id] = dag{code}
> with files:
> {code:python}
>  
> {code}
> {{/home/airflow/project-a/dags/dag-a.py}}
> {code:python}
>  
> {code}
> {{/home/airflow/project-b/dags/dag-b.py}}
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6798) Add option for service account values for KubernetesPodOperator

2020-02-13 Thread Parhy (Jira)
Parhy created AIRFLOW-6798:
--

 Summary: Add option for service account values for 
KubernetesPodOperator
 Key: AIRFLOW-6798
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6798
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib
Affects Versions: 1.10.3
 Environment: dev
Reporter: Parhy


I am trying to run the below DAG in a k8s environment.

 

from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow import configuration as conf
from airflow.contrib.kubernetes.pod import Resources

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2019, 1, 1),
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

namespace = conf.get('kubernetes', 'namespace')

# This will detect the default namespace locally and read the
# environment namespace when deployed to Astronomer.
dag = DAG('example_kubernetes_pod',
          schedule_interval='@once',
          default_args=default_args)

compute_resource = Resources()
compute_resource.request_cpu = '5000m'
compute_resource.request_memory = '512Mi'
compute_resource.limit_cpu = '800m'
compute_resource.limit_memory = '1Gi'

# compute_resource = {'request_cpu': '500m', 'request_memory': '512Mi',
#                     'limit_cpu': '800m', 'limit_memory': '1Gi'}

with dag:
    k = KubernetesPodOperator(
        namespace=namespace,
        image="hello-world",
        labels={"foo": "bar"},
        name="airflow-test-pod",
        task_id="task-one",
        in_cluster=False,  # if true, looks in the cluster; if false, looks for a config file
        resources=compute_resource,
        config_file=None,
        is_delete_operator_pod=True,  # the operator's parameter is is_delete_operator_pod
        get_logs=True)

 

I am getting the below error 

 

HTTP response headers: HTTPHeaderDict({'Audit-Id': 'x', 'Cache-Control': 
'no-cache, private', 'Content-Type': 'application/json', 
'X-Content-Type-Options': 'nosniff', 'Date': 'Thu, 13 Feb 2020 17:00:11 GMT', 
'Content-Length': '276'})
HTTP response body: 
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods 
is forbidden: User \"system:serviceaccount:xxx:default\" cannot create 
resource \"pods\" in API group \"\" in the namespace 
\"xxx\"","reason":"Forbidden","details":{"kind":"pods"},"code":403}

 

I understand it's trying to use the default service account in my namespace, and 
the default account doesn't have permission to create pods.

Can we pass the name of the service account which I created, which has permission 
to do so?

Please let me know.
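
For what it's worth, KubernetesPodOperator should already accept a 
service_account_name argument in 1.10.x (a sketch under that assumption; the 
account name below is illustrative):

{code:python}
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

k = KubernetesPodOperator(
    namespace=namespace,
    image="hello-world",
    name="airflow-test-pod",
    task_id="task-one",
    service_account_name="pod-runner",  # an SA bound to a role that can create pods
    in_cluster=False,
    config_file=None,
    get_logs=True)
{code}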

 

KubernetesExecutor is working fine, as in that case the scheduler pod runs 
with a service account which has permission, through a rolebinding, to create 
pods.

 

Thanks in advance,



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Matthew Bruce (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Bruce updated AIRFLOW-6796:
---
Description: 
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
 {{/home/airflow/dags/loader.py}}
{code:python}
dag_bags = []
dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dag_bags:
    for dag in dag_bag.dags.values():
        globals()[dag.dag_id] = dag{code}
with files:
{code:python}

 
{code}
{{/home/airflow/project-a/dags/dag-a.py}}
{code:python}

 
{code}
{{/home/airflow/project-b/dags/dag-b.py}}

The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?

  was:
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
{{/home/airflow/dags/loader.py}}
{code:python}
dags = []
dags.append(models.DagBag('/home/airflow/project-a/dags')
dags.append(models.DagBag('/home/airflow/project-b/dags')

globals().update(dags)
{code}

with files:
{{/home/airflow/project-a/dags/dag-a.py}}
{{/home/airflow/project-b/dags/dag-b.py}}


The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?


> Serialized DAGs can be incorrectly deleted
> --
>
> Key: AIRFLOW-6796
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Matthew Bruce
>Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
>  {{/home/airflow/dags/loader.py}}
> {code:python}
> dag_bags = []
> dag_bags.append(models.DagBag('/home/airflow/project-a/dags'))
> dag_bags.append(models.DagBag('/home/airflow/project-b/dags'))
> for dag_bag in dag_bags:
>     for dag in dag_bag.dags.values():
>         globals()[dag.dag_id] = dag{code}
> with files:
> {code:python}
>  
> {code}
> {{/home/airflow/project-a/dags/dag-a.py}}
> {code:python}
>  
> {code}
> {{/home/airflow/project-b/dags/dag-b.py}}
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb commented on a change in pull request #7370: [AIRFLOW-6590] Use batch db operations in jobs

2020-02-13 Thread GitBox
ashb commented on a change in pull request #7370: [AIRFLOW-6590] Use batch db 
operations in jobs
URL: https://github.com/apache/airflow/pull/7370#discussion_r379113503
 
 

 ##
 File path: scripts/perf/sql_queries.py
 ##
 @@ -0,0 +1,178 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from time import sleep, time
+from typing import List, NamedTuple, Optional, Tuple
+
+import pandas as pd
+
+# Setup environment before any Airflow import
+DAG_FOLDER = "/opt/airflow/scripts/perf/dags"
+os.environ["AIRFLOW__CORE__DAGS_FOLDER"] = DAG_FOLDER
+os.environ["AIRFLOW__DEBUG__SQLALCHEMY_STATS"] = "True"
+os.environ["AIRFLOW__CORE__LOAD_EXAMPLES"] = "False"
+
+# Here we setup simpler logger to avoid any code changes in
+# Airflow core code base
+LOG_LEVEL = "INFO"
+LOG_FILE = "/files/sql_stats.log"  # Default to run in Breeze
+
+os.environ[
+    "AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS"
+] = "scripts.perf.sql_queries.DEBUG_LOGGING_CONFIG"
+
+DEBUG_LOGGING_CONFIG = {
+    "version": 1,
+    "disable_existing_loggers": False,
+    "formatters": {"airflow": {"format": "%(message)s"}},
+    "handlers": {
+        "console": {"class": "logging.StreamHandler"},
+        "task": {
+            "class": "logging.FileHandler",
+            "formatter": "airflow",
+            "filename": LOG_FILE,
+        },
+        "processor": {
+            "class": "logging.FileHandler",
+            "formatter": "airflow",
+            "filename": LOG_FILE,
+        },
+    },
+    "loggers": {
+        "airflow.processor": {
+            "handlers": ["processor"],
+            "level": LOG_LEVEL,
+            "propagate": False,
+        },
+        "airflow.task": {"handlers": ["task"], "level": LOG_LEVEL, "propagate": False},
+        "flask_appbuilder": {
+            "handler": ["console"],
+            "level": LOG_LEVEL,
+            "propagate": True,
+        },
+    },
+    "root": {"handlers": ["console", "task"], "level": LOG_LEVEL},
+}
+
+
+class Query(NamedTuple):
+    function: str
+    file: str
+    location: int
+    sql: str
+    stack: str
+    time: float
+
+    def __str__(self):
+        sql = self.sql if len(self.sql) < 110 else f"{self.sql[:111]}..."
+        return f"{self.function} in {self.file}:{self.location}: {sql}"
+
+    def __eq__(self, other):
+        return (
+            self.function == other.function
+            and self.sql == other.sql
+            and self.location == other.location
+            and self.file == other.file
+        )
+
+    def to_dict(self):
+        return dict(zip(("function", "file", "location", "sql", "stack", "time"), self))
+
+
+def reset_db():
+    from airflow.utils.db import resetdb
+
+    resetdb()
+
+
+def run_scheduler_job(with_db_reset=False) -> None:
+    from airflow.jobs.scheduler_job import SchedulerJob
+
+    if with_db_reset:
+        reset_db()
+    SchedulerJob(subdir=DAG_FOLDER, do_pickle=False, num_runs=3).run()
+
+
+def is_query(line: str) -> bool:
+    return "@SQLALCHEMY" in line and "|$" in line
 
 Review comment:
   Mostly to make it less work to extract the timing logs from amongst the 
other logs. But no, not required.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585967157
 
 
   Perfecto! I'll give this a shot.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-6797) Create policy hooks for DAGs

2020-02-13 Thread Matthew Bruce (Jira)
Matthew Bruce created AIRFLOW-6797:
--

 Summary: Create policy hooks for DAGs
 Key: AIRFLOW-6797
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6797
 Project: Apache Airflow
  Issue Type: New Feature
  Components: scheduler
Affects Versions: 1.10.9
Reporter: Matthew Bruce


Policy hooks exist to modify task objects just before they are run:
[https://airflow.apache.org/docs/stable/concepts.html?highlight=policy#cluster-policy]

 

Similar functionality for DAGs at loading time, so that they could be rejected 
or modified, would be useful (e.g. to validate DAG naming, etc.); see the sketch below.
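
For reference, the existing task-level hook lives in airflow_local_settings.py 
and looks roughly like the docs example; a DAG-level analogue might mirror it 
(the dag_policy name below is purely illustrative, no such hook exists yet):

{code:python}
# Existing cluster policy hook: mutate each task as it is loaded, e.g. route
# long-running sensors to a dedicated queue (per the linked docs).
def policy(task):
    if task.__class__.__name__ == 'HivePartitionSensor':
        task.queue = 'sensor_queue'


# Hypothetical DAG-level hook this issue proposes (name illustrative):
def dag_policy(dag):
    if '.' in dag.dag_id:
        raise ValueError("Rejecting DAG %r: dots not allowed in dag_id" % dag.dag_id)
{code}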



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585964299
 
 
   > 1. installing Singularity as a binary inside a container (these bases 
already exist)
   > 2. running the container via docker-compose
   > 3. But then running tests inside of that container
   > 
   > Is this possible?
   
   I see now! In this case I think the easiest is to add the binary to the Docker 
image and run it from there. Then there is no need to run it as a separate 
image/runtime. That will be way simpler. We already do that for the 
minicluster - for the hadoop tests:
   
   Docker installation here: 
https://github.com/apache/airflow/blob/master/Dockerfile#L184
   And starting the cluster here: 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/entrypoint_ci.sh#L140
   J.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-6796:
---
Description: 
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
{{/home/airflow/dags/loader.py}}
{code:python}
dags = []
dags.append(models.DagBag('/home/airflow/project-a/dags')
dags.append(models.DagBag('/home/airflow/project-b/dags')

globals().update(dags)
{code}

with files:
{{/home/airflow/project-a/dags/dag-a.py}}
{{/home/airflow/project-b/dags/dag-b.py}}


The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} is 
only going to contain {{/home/airflow/dags/loader.py}} and the method will 
remove the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?

  was:
With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
`/home/airflow/dags/loader.py`
```
dags = []
dags.append(models.DagBag('/home/airflow/project-a/dags')
dags.append(models.DagBag('/home/airflow/project-b/dags')

globals().update(dags)
```

with files:
`/home/airflow/project-a/dags/dag-a.py`
`/home/airflow/project-b/dags/dag-b.py`


The list of file paths passed to `SerializedDagModel.remove_deleted_dags` is 
only going to contain `/home/airflow/dags/loader.py` and the method will remove 
the serializations for the DAGs in dag-a.py and dag-b.py

With non-serialized DAGs, airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?


> Serialized DAGs can be incorrectly deleted
> --
>
> Key: AIRFLOW-6796
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Matthew Bruce
>Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
> {{/home/airflow/dags/loader.py}}
> {code:python}
> dags = []
> dags.append(models.DagBag('/home/airflow/project-a/dags')
> dags.append(models.DagBag('/home/airflow/project-b/dags')
> globals().update(dags)
> {code}
> with files:
> {{/home/airflow/project-a/dags/dag-a.py}}
> {{/home/airflow/project-b/dags/dag-b.py}}
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585946960
 
 
   > I assume we can start singularity from the host and be able to forward 
this connection to inside the airflow-testing container so that we can connect 
to it. With Kind - we are starting it from within the container (by forwarded 
docker socket) , but it could be started from the host as well (mongo, kerberos 
and others are started from the host using docker-compose configuration and 
then we can connect to them from the "airflow-testing" by specifying their 
service names (mongo/kerberos etc).
   
   Ah yes this is what I wanted to ask about, specifically:
   
   > I assume we can start singularity from the host and be able to forward 
this connection to inside the airflow-testing container so that we can connect 
to it. 
   
   Singularity is different from docker - it doesn't have a service or daemon, 
it's akin to an executable. It would be installed inside a container, and it 
wouldn't work to "start on the host and forward." 
   
   > started from the host using docker-compose configuration and then we can 
connect to them from the "airflow-testing" by specifying their service names 
(mongo/kerberos etc).
   
   You mean to say that we start a container with Singularity installed via 
docker-compose, and then run it as a service? I think it would work to run 
things internally; I'm not sure to what extent you could issue interactions 
from the outside. This, I think, is something that we could do:
   
1. installing Singularity as a binary inside a container (these bases 
already exist)
2. running the container via docker-compose
3. running tests inside of that container
   
   Is this possible?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585943248
 
 
   > okay just to clarify - you want a Singularity + Airflow container run via 
a similar kind cluster? You said something about adding specific commands 
`--start-singularity` and `--stop-singularity` and I'm not sure what / where 
that is referring to, and how / why we would want to start or stop singularity 
(it's not a service, it's just a binary installed).
   
   I assume we can start singularity from the host and be able to forward this 
connection to inside the airflow-testing container so that we can connect to 
it. With Kind - we are starting it from within the container (by forwarding the 
docker socket), but it could be started from the host as well (mongo, kerberos 
and others are started from the host using `docker-compose` configuration and 
then we can connect to them from the "airflow-testing" by specifying their 
service names (mongo/kerberos etc).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585941481
 
 
   > okay so I'm tracing the kubernetes (runtime) as an example, and I have a 
quick question. In scripts/ci/in_container/entrypoint_ci.sh I see that given 
the kubernetes runtime, you are setting some variable for a container name to 
be the airflow name with suffix "-kubernetes."
   > 
   > and this would suggest the container is brought up inside of this 
container? Wouldn't it make more sense to have a docker-compose run that runs 
some equivalent of airflow-testing but with Singularity installed inside?
   
   Yes, it would make sense - similar to what we do with other integrations (say 
kerberos or mongo). What we are doing with kind is a bit different - we are 
forwarding the host's docker credentials to inside the airflow-testing container 
and we are running the kind command line from inside it. This - in effect - 
reaches out to the host's docker and sets up the kind cluster using the host's 
docker engine.
   
   With singularity - if it can be run on the host and we can connect to it 
from inside the airflow-testing container, that would be best.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] brtasavpatel commented on issue #6643: [AIRFLOW-6040] Fix KubernetesJobWatcher Read time out error

2020-02-13 Thread GitBox
brtasavpatel commented on issue #6643: [AIRFLOW-6040] Fix KubernetesJobWatcher 
Read time out error
URL: https://github.com/apache/airflow/pull/6643#issuecomment-585941357
 
 
   > Was this resolved? Setting AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: 
'{ "_request_timeout": "50" }' did not resolve our issue with 
KubernetesJobWatcher
   
   same here.
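   
   For reference, the quoted environment variable is the standard mapping of 
this airflow.cfg entry (shown only as a sketch of where the setting lives, not 
as a confirmed fix for the watcher issue):
   
   ```
   [kubernetes]
   kube_client_request_args = { "_request_timeout": "50" }
   ```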


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585939693
 
 
   Somehow I missed it completely. Sorry. Responding now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-6796) Serialized DAGs can be incorrectly deleted

2020-02-13 Thread Matthew Bruce (Jira)
Matthew Bruce created AIRFLOW-6796:
--

 Summary: Serialized DAGs can be incorrectly deleted
 Key: AIRFLOW-6796
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
 Project: Apache Airflow
  Issue Type: Bug
  Components: serialization
Affects Versions: 1.10.9
Reporter: Matthew Bruce


With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
called from `DagFileProcessManager.refresh_dag_dir` can delete the 
serialization of DAGs if they were loaded via a DagBag and globals in a 
different `.py` file:

Consider something like this:
`/home/airflow/dags/loader.py`
```
dags = []
dags.append(models.DagBag('/home/airflow/project-a/dags'))
dags.append(models.DagBag('/home/airflow/project-b/dags'))

for dag_bag in dags:
    globals().update(dag_bag.dags)
```

with files:
`/home/airflow/project-a/dags/dag-a.py`
`/home/airflow/project-b/dags/dag-b.py`


The list of file paths passed to `SerializedDagModel.remove_deleted_dags` is 
only going to contain `/home/airflow/dags/loader.py`, and the method will remove 
the serializations for the DAGs in dag-a.py and dag-b.py.

With non-serialized DAGs, Airflow seems to mark DAGs as inactive based on when 
the scheduler last processed them - I wonder if we should make these two 
methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6795) serialized_dag table's data column text type is too small for mysql

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036456#comment-17036456
 ] 

ASF GitHub Bot commented on AIRFLOW-6795:
-

nritholtz commented on pull request #7414: [AIRFLOW-6795]  Increase text size 
on data column in serialized_dag for MySQL
URL: https://github.com/apache/airflow/pull/7414
 
 
   …r MySQL
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> serialized_dag table's data column text type is too small for mysql
> ---
>
> Key: AIRFLOW-6795
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6795
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Nathaniel Ritholtz
>Priority: Major
>
> When upgrading to v1.10.9, I tried using the new store_serialized_dags flag. 
> However, the scheduler was erroring out with:
> {code}
> scheduler_1  | Process DagFileProcessor2163-Process:
> scheduler_1  | Traceback (most recent call last):
> scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
> line 258, in _bootstrap
> scheduler_1  | self.run()
> scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
> line 93, in run
> scheduler_1  | self._target(*self._args, **self._kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
> 157, in _run_file_processor
> scheduler_1  | pickle_dags)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
> 1580, in process_file
> scheduler_1  | dag.sync_to_db()
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/models/dag.py", line 1514, in 
> sync_to_db
> scheduler_1  | session=session
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/models/serialized_dag.py", 
> line 118, in write_dag
> scheduler_1  | session.merge(cls(dag))
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2113, in merge
> scheduler_1  | _resolve_conflict_map=_resolve_conflict_map,
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2186, in _merge
> scheduler_1  | merged = self.query(mapper.class_).get(key[1])
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 1004, 
> in get
> scheduler_1  | return self._get_impl(ident, loading.load_on_pk_identity)
> scheduler_1  |   File 
> 

[GitHub] [airflow] nritholtz opened a new pull request #7414: [AIRFLOW-6795] Increase text size on data column in serialized_dag for MySQL

2020-02-13 Thread GitBox
nritholtz opened a new pull request #7414: [AIRFLOW-6795]  Increase text size 
on data column in serialized_dag for MySQL
URL: https://github.com/apache/airflow/pull/7414
 
 
   …r MySQL
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-6795) serialized_dag table's data column text type is too small for mysql

2020-02-13 Thread Nathaniel Ritholtz (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathaniel Ritholtz updated AIRFLOW-6795:

Summary: serialized_dag table's data column text type is too small for 
mysql  (was: serialized_dag table's data type max length is too small)
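
A fix along the lines of the PR title would widen the MySQL column; a rough 
sketch of such a migration (assumed target type, not the actual patch):

{code:python}
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import mysql


def upgrade():
    # MySQL TEXT holds at most 64KB, which large serialized DAGs exceed;
    # MEDIUMTEXT raises the limit to 16MB, other backends keep plain TEXT.
    op.alter_column('serialized_dag', 'data',
                    existing_type=sa.Text(),
                    type_=sa.Text().with_variant(mysql.MEDIUMTEXT(), 'mysql'))
{code}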

> serialized_dag table's data column text type is too small for mysql
> ---
>
> Key: AIRFLOW-6795
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6795
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.10.9
>Reporter: Nathaniel Ritholtz
>Priority: Major
>
> When upgrading to v1.10.9, I tried using the new store_serialized_dags flag. 
> However, the scheduler was erroring out with:
> {code}
> scheduler_1  | Process DagFileProcessor2163-Process:
> scheduler_1  | Traceback (most recent call last):
> scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
> line 258, in _bootstrap
> scheduler_1  | self.run()
> scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
> line 93, in run
> scheduler_1  | self._target(*self._args, **self._kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
> 157, in _run_file_processor
> scheduler_1  | pickle_dags)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
> 1580, in process_file
> scheduler_1  | dag.sync_to_db()
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/models/dag.py", line 1514, in 
> sync_to_db
> scheduler_1  | session=session
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in 
> wrapper
> scheduler_1  | return func(*args, **kwargs)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/airflow/models/serialized_dag.py", 
> line 118, in write_dag
> scheduler_1  | session.merge(cls(dag))
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2113, in merge
> scheduler_1  | _resolve_conflict_map=_resolve_conflict_map,
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2186, in _merge
> scheduler_1  | merged = self.query(mapper.class_).get(key[1])
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 1004, 
> in get
> scheduler_1  | return self._get_impl(ident, loading.load_on_pk_identity)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 1116, 
> in _get_impl
> scheduler_1  | return db_load_fn(self, primary_key_identity)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 284, 
> in load_on_pk_identity
> scheduler_1  | return q.one()
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3347, 
> in one
> scheduler_1  | ret = self.one_or_none()
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3316, 
> in one_or_none
> scheduler_1  | ret = list(self)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 101, 
> in instances
> scheduler_1  | util.raise_from_cause(err)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 398, 
> in raise_from_cause
> scheduler_1  | reraise(type(exception), exception, tb=exc_tb, cause=cause)
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 153, 
> in reraise
> scheduler_1  | raise value
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 81, 
> in instances
> scheduler_1  | rows = [proc(row) for row in fetch]
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 81, 
> in <listcomp>
> scheduler_1  | rows = [proc(row) for row in fetch]
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 574, 
> in _instance
> scheduler_1  | populators,
> scheduler_1  |   File 
> "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 695, 
> in _populate_full
> scheduler_1  | dict_[key] = getter(row)
> scheduler_1  |   File 
> 

[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379058038
 
 

 ##
 File path: airflow/providers/salesforce/operators/tableau_refresh_workbook.py
 ##
 @@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow import AirflowException
+from airflow.models import BaseOperator
+from airflow.providers.salesforce.hooks.tableau import TableauHook
+from airflow.utils.decorators import apply_defaults
+
+
+class TableauRefreshWorkbookOperator(BaseOperator):
+"""
+Refreshes a Tableau Workbook/Extract
+
+:param workbook_name: The name of the workbook to refresh.
+:type workbook_name: str
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+@apply_defaults
+def __init__(self,
+ workbook_name,
+ *args,
+ site_id='',
+ tableau_conn_id='tableau_default',
+ **kwargs):
+super().__init__(*args, **kwargs)
+self.workbook_name = workbook_name
+self.site_id = site_id
+self.tableau_conn_id = tableau_conn_id
+
+def execute(self, context):
+"""
+Executes the Tableau Extract Refresh and pushes the job id to xcom.
+
+:param context: The task context during execution.
+:type context: dict
+:return: the id of the job that executes the extract refresh
+:rtype: str
+"""
+with TableauHook(self.site_id, self.tableau_conn_id) as tableau_hook:
+workbook = self._get_workbook_by_name(tableau_hook)
+
+return self._refresh_workbook(tableau_hook, workbook.id)
+
+def _get_workbook_by_name(self, tableau_hook):
+for workbook in tableau_hook.get_all(resource_name='workbooks'):
+if workbook.name == self.workbook_name:
+self.log.info('Found matching workbook with id %s', 
workbook.id)
+return workbook
+
+raise AirflowException(f'Workbook {self.workbook_name} not found!')
+
+def _refresh_workbook(self, tableau_hook, workbook_id):
+job = tableau_hook.server.workbooks.refresh(workbook_id)
 
 Review comment:
   I am gonna check that.
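   
   For reference, tableauserverclient's `workbooks.refresh` is documented as 
returning a `JobItem` (an assumption to confirm when checking), so pushing the 
job id could look like this sketch:
   
   ```python
   job = tableau_hook.server.workbooks.refresh(workbook_id)
   # the JobItem id identifies the refresh job on the Tableau Server
   return job.id
   ```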


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379057502
 
 

 ##
 File path: airflow/providers/salesforce/operators/tableau_refresh_workbook.py
 ##
 @@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow import AirflowException
+from airflow.models import BaseOperator
+from airflow.providers.salesforce.hooks.tableau import TableauHook
+from airflow.utils.decorators import apply_defaults
+
+
+class TableauRefreshWorkbookOperator(BaseOperator):
+"""
+Refreshes a Tableau Workbook/Extract
+
+:param workbook_name: The name of the workbook to refresh.
+:type workbook_name: str
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+@apply_defaults
+def __init__(self,
+ workbook_name,
+ *args,
+ site_id='',
+ tableau_conn_id='tableau_default',
+ **kwargs):
+super().__init__(*args, **kwargs)
+self.workbook_name = workbook_name
+self.site_id = site_id
 
 Review comment:
   Good point.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379056816
 
 

 ##
 File path: airflow/providers/salesforce/hooks/tableau.py
 ##
 @@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from tableauserverclient import Pager, Server, TableauAuth
+
+from airflow.hooks.base_hook import BaseHook
+
+
+class TableauHook(BaseHook):
+"""
+Connects to the Tableau Server Instance and allows to communicate with it.
+
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+def __init__(self, site_id='', tableau_conn_id='tableau_default'):
 
 Review comment:
   Actually I think I will remove the default. WDYT?
   ```suggestion
   def __init__(self, site_id='', tableau_conn_id):
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379056816
 
 

 ##
 File path: airflow/providers/salesforce/hooks/tableau.py
 ##
 @@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from tableauserverclient import Pager, Server, TableauAuth
+
+from airflow.hooks.base_hook import BaseHook
+
+
+class TableauHook(BaseHook):
+"""
+Connects to the Tableau Server Instance and allows to communicate with it.
+
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+def __init__(self, site_id='', tableau_conn_id='tableau_default'):
 
 Review comment:
   Actually I think I will remove the default. WDYT?
   ```suggestion
   def __init__(self, site_id='', tableau_conn_id):
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dhegberg commented on issue #3229: [AIRFLOW-2325] Add cloudwatch task handler (IN PROGRESS)

2020-02-13 Thread GitBox
dhegberg commented on issue #3229: [AIRFLOW-2325] Add cloudwatch task handler 
(IN PROGRESS)
URL: https://github.com/apache/airflow/pull/3229#issuecomment-585918439
 
 
   @ericabertugli  Are you still working on this?   
   
   I started doing some testing and I'm happy to take over.  
   
   I'd write some tests, add to the logging documentation and add an entry in 
airflow_local_settings.  
   
   @potiuk 
   I was thinking of using a URL prefix like 'cloudwatch://' in the 
remote_base_log_folder. This seems a bit weird since it's not a folder, but it 
looks like the stackdriver option has already gone this route.
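   
   A small sketch of the scheme-based dispatch that implies (hypothetical 
handler name, not actual Airflow code):
   
   ```python
   from urllib.parse import urlparse
   
   remote_base_log_folder = 'cloudwatch://my-log-group'  # assumed value
   if urlparse(remote_base_log_folder).scheme == 'cloudwatch':
       # configure a CloudwatchTaskHandler here, analogous to the
       # existing stackdriver:// branch in airflow_local_settings
       pass
   ```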


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379056518
 
 

 ##
 File path: airflow/providers/salesforce/operators/tableau_refresh_workbook.py
 ##
 @@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow import AirflowException
+from airflow.models import BaseOperator
+from airflow.providers.salesforce.hooks.tableau import TableauHook
+from airflow.utils.decorators import apply_defaults
+
+
+class TableauRefreshWorkbookOperator(BaseOperator):
+"""
+Refreshes a Tableau Workbook/Extract
+
+:param workbook_name: The name of the workbook to refresh.
+:type workbook_name: str
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+@apply_defaults
+def __init__(self,
+ workbook_name,
+ *args,
+ site_id='',
 
 Review comment:
   An empty string is the default value in the library. See 
https://tableau.github.io/server-client-python/docs/api-ref#authentication
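   
   For reference, the authentication pattern from that page (placeholder 
credentials; `site_id=''` targets the default site):
   
   ```python
   import tableauserverclient as TSC
   
   tableau_auth = TSC.TableauAuth('USERNAME', 'PASSWORD', site_id='')
   server = TSC.Server('https://SERVER_URL')
   
   with server.auth.sign_in(tableau_auth):
       pass  # authenticated API calls go here
   ```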


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration

2020-02-13 Thread GitBox
feluelle commented on a change in pull request #7410: [AIRFLOW-6790] Add basic 
Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r379056816
 
 

 ##
 File path: airflow/providers/salesforce/hooks/tableau.py
 ##
 @@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from tableauserverclient import Pager, Server, TableauAuth
+
+from airflow.hooks.base_hook import BaseHook
+
+
+class TableauHook(BaseHook):
+"""
+Connects to the Tableau Server Instance and allows to communicate with it.
+
+:param site_id: The id of the site where the workbook belongs to.
+:type site_id: str
+:param tableau_conn_id: The Tableau Connection id containing the 
credentials
+to authenticate to the Tableau Server.
+:type tableau_conn_id: str
+"""
+
+def __init__(self, site_id='', tableau_conn_id='tableau_default'):
 
 Review comment:
   Actually I think I will remove the default
   ```suggestion
   def __init__(self, site_id='', tableau_conn_id):
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-6795) serialized_dag table's data type max length is too small

2020-02-13 Thread Nathaniel Ritholtz (Jira)
Nathaniel Ritholtz created AIRFLOW-6795:
---

 Summary: serialized_dag table's data type max length is too small
 Key: AIRFLOW-6795
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6795
 Project: Apache Airflow
  Issue Type: Bug
  Components: serialization
Affects Versions: 1.10.9
Reporter: Nathaniel Ritholtz


When upgrading to v1.10.9, I tried using the new store_serialized_dags flag. 
However, the scheduler was erroring out with:
{code}
scheduler_1  | Process DagFileProcessor2163-Process:
scheduler_1  | Traceback (most recent call last):
scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
line 258, in _bootstrap
scheduler_1  | self.run()
scheduler_1  |   File "/usr/local/lib/python3.6/multiprocessing/process.py", 
line 93, in run
scheduler_1  | self._target(*self._args, **self._kwargs)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
157, in _run_file_processor
scheduler_1  | pickle_dags)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
wrapper
scheduler_1  | return func(*args, **kwargs)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 
1580, in process_file
scheduler_1  | dag.sync_to_db()
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
wrapper
scheduler_1  | return func(*args, **kwargs)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/models/dag.py", line 1514, in 
sync_to_db
scheduler_1  | session=session
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 70, in 
wrapper
scheduler_1  | return func(*args, **kwargs)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/airflow/models/serialized_dag.py", line 
118, in write_dag
scheduler_1  | session.merge(cls(dag))
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2113, 
in merge
scheduler_1  | _resolve_conflict_map=_resolve_conflict_map,
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2186, 
in _merge
scheduler_1  | merged = self.query(mapper.class_).get(key[1])
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 1004, in 
get
scheduler_1  | return self._get_impl(ident, loading.load_on_pk_identity)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 1116, in 
_get_impl
scheduler_1  | return db_load_fn(self, primary_key_identity)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 284, 
in load_on_pk_identity
scheduler_1  | return q.one()
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3347, in 
one
scheduler_1  | ret = self.one_or_none()
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3316, in 
one_or_none
scheduler_1  | ret = list(self)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 101, 
in instances
scheduler_1  | util.raise_from_cause(err)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 398, 
in raise_from_cause
scheduler_1  | reraise(type(exception), exception, tb=exc_tb, cause=cause)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 153, 
in reraise
scheduler_1  | raise value
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 81, in 
instances
scheduler_1  | rows = [proc(row) for row in fetch]
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 81, in 

scheduler_1  | rows = [proc(row) for row in fetch]
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 574, 
in _instance
scheduler_1  | populators,
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 695, 
in _populate_full
scheduler_1  | dict_[key] = getter(row)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/type_api.py", line 1266, 
in process
scheduler_1  | return process_value(impl_processor(value), dialect)
scheduler_1  |   File 
"/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/sqltypes.py", line 2407, 
in process
scheduler_1  | return json_deserializer(value)
scheduler_1  |   File "/usr/local/lib/python3.6/json/__init__.py", line 354, in 
loads
scheduler_1  | return _default_decoder.decode(s)
scheduler_1  |   File "/usr/local/lib/python3.6/json/decoder.py", line 339, in 
decode
scheduler_1  | obj, end = 

[GitHub] [airflow] vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
vsoch commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585869248
 
 
   ping...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] alexzlue commented on issue #7353: [AIRFLOW-6685] Data Quality Check operators

2020-02-13 Thread GitBox
alexzlue commented on issue #7353: [AIRFLOW-6685] Data Quality Check operators
URL: https://github.com/apache/airflow/pull/7353#issuecomment-585860338
 
 
   @eladkal Thanks for bringing this to mind. I do see that there is some 
functionality that I have that `CheckOperator` does not have. I will work on 
merging some of my work into this file then.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] msb217 commented on issue #6652: [AIRFLOW-5548] [AIRFLOW-5550] REST API enhancement - dag info, task …

2020-02-13 Thread GitBox
msb217 commented on issue #6652: [AIRFLOW-5548] [AIRFLOW-5550] REST API 
enhancement - dag info, task …
URL: https://github.com/apache/airflow/pull/6652#issuecomment-585848563
 
 
   @ashb any comments or opinions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] eladkal commented on issue #7353: [AIRFLOW-6685] Data Quality Check operators

2020-02-13 Thread GitBox
eladkal commented on issue #7353: [AIRFLOW-6685] Data Quality Check operators
URL: https://github.com/apache/airflow/pull/7353#issuecomment-585830929
 
 
   In general the operators in this PR sound like an enhancement of 
[CheckOperator](https://github.com/apache/airflow/blob/master/airflow/operators/check_operator.py).
   
   1. Why does it need to be in a new file?
   2. Don't CheckOperator and BaseDataQualityOperator have the same functionality?
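   
   For context, a minimal usage of the existing `CheckOperator` (placeholder 
SQL and connection id): it runs the query and fails the task if the first row 
of the result contains any falsy value.
   
   ```python
   from airflow.operators.check_operator import CheckOperator
   
   check = CheckOperator(
       task_id='check_row_count',
       sql='SELECT COUNT(*) FROM my_table',  # placeholder query
       conn_id='my_conn',                    # placeholder connection id
   )
   ```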


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6344) CI builds with TAG fail with "unknown revision or path not in the working tree"

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036312#comment-17036312
 ] 

ASF subversion and git services commented on AIRFLOW-6344:
--

Commit 413f9613755bd3ab54c317e73b488d51fa23c30a in airflow's branch 
refs/heads/v1-10-test from Ash Berlin-Taylor
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=413f961 ]

[AIRFLOW-6344] Fix travis CI for tag builds (#7411)

Don't try to find changed files unless we are building a pull request.
This only caused a problem on builds of tags, but we were also doing this
for master/branch builds, where it always found no files changed.

By checking this early we can make the other conditions in this function
simpler.

(cherry picked from commit 67463c3d8e5fe1618117244364d8a49f80536820)


> CI builds with TAG fail with "unknown revision or path not in the working 
> tree"
> ---
>
> Key: AIRFLOW-6344
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6344
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 1.10.10
>
>
> See for example here: unknown revision or path not in the working tree



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] dpmccabe commented on issue #3115: [AIRFLOW-2193] Add ROperator for using R

2020-02-13 Thread GitBox
dpmccabe commented on issue #3115: [AIRFLOW-2193] Add ROperator for using R
URL: https://github.com/apache/airflow/pull/3115#issuecomment-585817970
 
 
   Worth mentioning https://github.com/ropensci/drake/ as an alternative for 
people who are building data pipelines in R. Not the best option in a 
mixed-language environment, but if your whole pipeline is in R it's a great 
option.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6344) CI builds with TAG fail with "unknown revision or path not in the working tree"

2020-02-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036308#comment-17036308
 ] 

ASF subversion and git services commented on AIRFLOW-6344:
--

Commit 67463c3d8e5fe1618117244364d8a49f80536820 in airflow's branch 
refs/heads/master from Ash Berlin-Taylor
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=67463c3 ]

[AIRFLOW-6344] Fix travis CI for tag builds (#7411)

Don't try to find changed files unless we are building a pull request.
This only caused a problem on builds of tags, but we were also doing this
for master/branch builds, where it always found no files changed.

By checking this early we can make the other conditions in this function
simpler.

> CI builds with TAG fail with "unknown revision or path not in the working 
> tree"
> ---
>
> Key: AIRFLOW-6344
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6344
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> See for example here: unknown revision or path not in the working tree



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-6344) CI builds with TAG fail with "unknown revision or path not in the working tree"

2020-02-13 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-6344.

Fix Version/s: 1.10.10
   Resolution: Fixed

> CI builds with TAG fail with "unknown revision or path not in the working 
> tree"
> ---
>
> Key: AIRFLOW-6344
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6344
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 1.10.10
>
>
> See for example here: unknown revision or path not in the working tree



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6344) CI builds with TAG fail with "unknown revision or path not in the working tree"

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036307#comment-17036307
 ] 

ASF GitHub Bot commented on AIRFLOW-6344:
-

ashb commented on pull request #7411: [AIRFLOW-6344] Fix travis CI for tag 
builds
URL: https://github.com/apache/airflow/pull/7411
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> CI builds with TAG fail with "unknown revision or path not in the working 
> tree"
> ---
>
> Key: AIRFLOW-6344
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6344
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> See for example here: unknown revision or path not in the working tree



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb merged pull request #7411: [AIRFLOW-6344] Fix travis CI for tag builds

2020-02-13 Thread GitBox
ashb merged pull request #7411: [AIRFLOW-6344] Fix travis CI for tag builds
URL: https://github.com/apache/airflow/pull/7411
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6649) Google storage to Snowflake

2020-02-13 Thread nexoriv (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036305#comment-17036305
 ] 

nexoriv commented on AIRFLOW-6649:
--

[~kamil.bregula] that sounds interesting!
[https://docs.snowflake.net/manuals/user-guide/data-load-gcs.html]

I think that not all use cases are suited to having external storage on S3/GCS. 
One might need the data to be inside the Snowflake "domain". Providing the 
operator only gives another option.
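
For what it's worth, per the linked docs Snowflake can load straight from GCS 
through an external stage; a GCS-to-Snowflake operator would essentially wrap 
SQL like this (stage and table names are placeholders):

{code:python}
# SQL a hypothetical GCS-to-Snowflake operator would issue:
copy_sql = """
    COPY INTO my_table
    FROM @my_gcs_stage/data/
    FILE_FORMAT = (TYPE = CSV)
"""
{code}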

> Google storage to Snowflake
> ---
>
> Key: AIRFLOW-6649
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6649
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp, operators
>Affects Versions: 1.10.6
>Reporter: nexoriv
>Priority: Major
>  Labels: snowflake
>
> can someone share google storage to snowflake operator?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-6768) Graph view rendering angular edges

2020-02-13 Thread Nathan Hadfield (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Hadfield updated AIRFLOW-6768:
-
Fix Version/s: 1.10.10
   2.0.0

> Graph view rendering angular edges
> --
>
> Key: AIRFLOW-6768
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6768
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.8, 1.10.9
>Reporter: Nathan Hadfield
>Assignee: Nathan Hadfield
>Priority: Minor
> Fix For: 2.0.0, 1.10.10
>
> Attachments: Screenshot 2020-02-10 at 08.51.02.png, Screenshot 
> 2020-02-10 at 08.51.20.png
>
>
> Since the release of v1.10.8 the DAG graph view is rendering the edges 
> between nodes with angular lines rather than nice smooth curves.
> Seems to have been caused by a bump of dagre-d3.
> [https://github.com/apache/airflow/pull/7280]
> [https://github.com/dagrejs/dagre-d3/issues/305]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] zhongjiajie commented on a change in pull request #6900: [AIRFLOW-6346] Enhance dag default_view and orientation

2020-02-13 Thread GitBox
zhongjiajie commented on a change in pull request #6900: [AIRFLOW-6346] Enhance 
dag default_view and orientation
URL: https://github.com/apache/airflow/pull/6900#discussion_r378914969
 
 

 ##
 File path: airflow/serialization/schema.json
 ##
 @@ -93,7 +93,7 @@
 "end_date": { "$ref": "#/definitions/datetime" },
 "dagrun_timeout": { "$ref": "#/definitions/timedelta" },
 "doc_md": { "type" : "string"},
-"_default_view": { "type" : "string"},
+"default_view": { "type" : "string"},
 
 Review comment:
   > **keep in the old name _default_view** for now
   
   Ok, I get it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-6794) Allow AWS Operator RedshiftToS3Transfer To Run a Custom Query

2020-02-13 Thread Roger Russel Droique Neris (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Russel Droique Neris reassigned AIRFLOW-6794:
---

Assignee: Roger Russel Droique Neris

> Allow AWS Operator RedshiftToS3Transfer To Run a Custom Query
> -
>
> Key: AIRFLOW-6794
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6794
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, operators
>Affects Versions: 1.10.9
>Reporter: Roger Russel Droique Neris
>Assignee: Roger Russel Droique Neris
>Priority: Trivial
>  Labels: AWS, newbie
>
> The Redshift operator 
> {{airflow.providers.amazon.aws.operators.redshift_to_s3.RedshiftToS3Transfer}} 
> allows only a simple usage: transferring a table to S3.
> [https://github.com/apache/airflow/blob/master/airflow/providers/amazon/aws/operators/redshift_to_s3.py#L110]
> If possible I would like to implement support for a custom query on it.
> The expected behavior is that when a "query" parameter is given it will be 
> used, and if it is not given the default behavior applies.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6794) Allow AWS Operator RedshiftToS3Transfer To Run a Custom Query

2020-02-13 Thread Roger Russel Droique Neris (Jira)
Roger Russel Droique Neris created AIRFLOW-6794:
---

 Summary: Allow AWS Operator RedshiftToS3Transfer To Run a Custom 
Query
 Key: AIRFLOW-6794
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6794
 Project: Apache Airflow
  Issue Type: Improvement
  Components: aws, operators
Affects Versions: 1.10.9
Reporter: Roger Russel Droique Neris


The Redshift operator 
{{airflow.providers.amazon.aws.operators.redshift_to_s3.RedshiftToS3Transfer}} 
allows only a simple usage: transferring a table to S3.

[https://github.com/apache/airflow/blob/master/airflow/providers/amazon/aws/operators/redshift_to_s3.py#L110]

If possible I would like to implement support for a custom query on it.
The expected behavior is that when a "query" parameter is given it will be 
used, and if it is not given the default behavior applies.
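
A rough sketch of the proposed behaviour (the parameter name and helper are 
hypothetical):

{code:python}
def _build_unload_select(schema, table, query=None):
    # If a custom query is given, UNLOAD that; otherwise keep the current
    # default of dumping the whole table.
    return query or 'SELECT * FROM {schema}.{table}'.format(
        schema=schema, table=table)
{code}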



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj commented on a change in pull request #7389: [AIRFLOW-6763] Make systems tests ready for backport tests

2020-02-13 Thread GitBox
mik-laj commented on a change in pull request #7389: [AIRFLOW-6763] Make 
systems tests ready for backport tests
URL: https://github.com/apache/airflow/pull/7389#discussion_r378910666
 
 

 ##
 File path: TESTING.rst
 ##
 @@ -31,8 +31,7 @@ Airflow Test Infrastructure
 
 * **System tests** are automatic tests that use external systems like
   Google Cloud Platform. These tests are intended for an end-to-end DAG 
execution.
-  Note that automated execution of these tests is still
-  `work in progress 
`_.
+  The tests can be executed on both.
 
 Review comment:
   On both what?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-6793) airflow config command doesn't respect env variable

2020-02-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036281#comment-17036281
 ] 

ASF GitHub Bot commented on AIRFLOW-6793:
-

mik-laj commented on pull request #7413: [AIRFLOW-6793] Respect env variable in 
airflow config command
URL: https://github.com/apache/airflow/pull/7413
 
 
   `airflow config` command always returns a value from a file. It does not 
read information from environment variables.
   
   This bug was introduced by https://github.com/apache/airflow/pull/7117/files
   
   CC: @anitakar 
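   
   The expected behaviour, as an illustration (standard env-to-config mapping; 
the assertion is what the fix should restore):
   
   ```python
   import os
   
   os.environ['AIRFLOW__CORE__DAGS_FOLDER'] = '/tmp/dags'
   
   from airflow.configuration import conf
   
   # with the fix, the environment variable wins over the airflow.cfg value
   assert conf.get('core', 'dags_folder') == '/tmp/dags'
   ```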
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> airflow config command doesn't respect env variable
> ---
>
> Key: AIRFLOW-6793
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6793
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli
>Affects Versions: 1.10.9
>Reporter: Kamil Bregula
>Priority: Major
>
> The `airflow config` command always returns a value from a file. It does not 
> read information from environment variables.
> This bug was introduced by [https://github.com/apache/airflow/pull/7117/files]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj opened a new pull request #7413: [AIRFLOW-6793] Respect env variable in airflow config command

2020-02-13 Thread GitBox
mik-laj opened a new pull request #7413: [AIRFLOW-6793] Respect env variable in 
airflow config command
URL: https://github.com/apache/airflow/pull/7413
 
 
   The `airflow config` command always returns a value from a file. It does not 
read information from environment variables.
   
   This bug was introduced by https://github.com/apache/airflow/pull/7117/files
   
   CC: @anitakar 
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


  1   2   >