[jira] [Commented] (AIRFLOW-630) Airflow worker is not working with Celery 4.0.0

2019-09-29 Thread Luke Bodeen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940603#comment-16940603
 ] 

Luke Bodeen commented on AIRFLOW-630:
-

yes please, I have tested and validated celery 4 is good to go with 1.9+

> Airflow worker is not working with Celery 4.0.0
> ---
>
> Key: AIRFLOW-630
> URL: https://issues.apache.org/jira/browse/AIRFLOW-630
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.7.1.2, 1.7.1.3
>Reporter: Hafiz Badrie Lubis
>Priority: Major
>
> Soon as celery version is upgraded to 4.0.0, airflow worker is not working, 
> because loglevel value is None. You can see the detail of error log on this 
> image: http://imgur.com/JHedHeN. 
> Should make loglevel value assignment be more flexible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] KevinYang21 commented on issue #6174: [AIRFLOW-5543] Fix tooltip disappears in tree and graph view

2019-09-29 Thread GitBox
KevinYang21 commented on issue #6174: [AIRFLOW-5543] Fix tooltip disappears in 
tree and graph view
URL: https://github.com/apache/airflow/pull/6174#issuecomment-536363064
 
 
   @feng-tao For sure let's connect! With @saguziel recenlty returned to the 
team 🎉, we have 6 people working on core Airflow now.
   
   We did migrate from from 1.8.0 to 1.10.0 eariler this year and recently 
migrated to 1.10.4 for one of our cluster as @pingzh mentioned. We tried to 
migrate w/o downtime for 1.8 -> 1.10, it was possible in theory but we wanted a 
smoothier migration so we spun up two cluster and rolling migrated the DAGs. We 
did learn quite a few lessons during the migration and would love to share them 
during our sync.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to plugin page

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to 
plugin page
URL: https://github.com/apache/airflow/pull/6214#issuecomment-536353046
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=h1) 
Report
   > Merging 
[#6214](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.42%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6214/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6214   +/-   ##
   ===
   + Coverage9.59%   80.02%   +70.42% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628148+24772 
   + Misses  31800 7028-24772
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [496 
more](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=footer). 
Last update 
[61190c3...f13785e](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to plugin page

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to 
plugin page
URL: https://github.com/apache/airflow/pull/6214#issuecomment-536353046
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=h1) 
Report
   > Merging 
[#6214](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.15%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6214/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6214   +/-   ##
   ===
   + Coverage9.59%   79.75%   +70.15% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628054+24678 
   + Misses  31800 7122-24678
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [496 
more](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=footer). 
Last update 
[61190c3...f13785e](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #6214: [AIRFLOW-XXX] Improve link to plugin page

2019-09-29 Thread GitBox
codecov-io commented on issue #6214: [AIRFLOW-XXX] Improve link to plugin page
URL: https://github.com/apache/airflow/pull/6214#issuecomment-536353046
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=h1) 
Report
   > Merging 
[#6214](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.15%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6214/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6214   +/-   ##
   ===
   + Coverage9.59%   79.75%   +70.15% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628054+24678 
   + Misses  31800 7122-24678
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [496 
more](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=footer). 
Last update 
[61190c3...f13785e](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to plugin page

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6214: [AIRFLOW-XXX] Improve link to 
plugin page
URL: https://github.com/apache/airflow/pull/6214#issuecomment-536353046
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=h1) 
Report
   > Merging 
[#6214](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.15%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6214/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6214   +/-   ##
   ===
   + Coverage9.59%   79.75%   +70.15% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628054+24678 
   + Misses  31800 7122-24678
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [496 
more](https://codecov.io/gh/apache/airflow/pull/6214/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=footer). 
Last update 
[61190c3...f13785e](https://codecov.io/gh/apache/airflow/pull/6214?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6213: [AIRFLOW-XXX] Extract operators and hooks to separate page

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6213: [AIRFLOW-XXX] Extract operators and 
hooks to separate page
URL: https://github.com/apache/airflow/pull/6213#issuecomment-536308177
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=h1) 
Report
   > Merging 
[#6213](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.42%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6213/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6213   +/-   ##
   ===
   + Coverage9.59%   80.02%   +70.42% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628150+24774 
   + Misses  31800 7026-24774
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [498 
more](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=footer). 
Last update 
[61190c3...a0f1334](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj opened a new pull request #6215: [AIRFLOW-XXX] Update airflow commands

2019-09-29 Thread GitBox
mik-laj opened a new pull request #6215: [AIRFLOW-XXX] Update airflow commands
URL: https://github.com/apache/airflow/pull/6215
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj opened a new pull request #6214: [AIRFLOW-XXX] Improve link to plugin page

2019-09-29 Thread GitBox
mik-laj opened a new pull request #6214: [AIRFLOW-XXX] Improve link to plugin 
page
URL: https://github.com/apache/airflow/pull/6214
 
 
   Currently, this link point to 
https://docs.sqlalchemy.org/en/13/orm/extensions/index.html#plugins
   
   ---
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329368348
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
 
 Review comment:
   Not sure what this is. Is this just a sphinx formatting thing?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329368265
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   I was planning on just inheriting the `execute` form BaseSensorOperator.
   But in light of the pre/post execute conversation I've refactored to have 
execute-ception where BaseAsyncOperator calls `super().execute`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329368265
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   I'm was planning on just inheriting the `execute` form BaseSensorOperator.
   But in light of the pre/post execute conversation I've refactored to have 
execute-ception where BaseAsyncOperator calls `super().execute`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329368268
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   I had a similar thought process. Where I ended up was the implementation of 
the common parent would nearly identical to `BaseSensorOperator`. The logic 
being `BaseAsyncOperator` just implements a few more methods (e.g 
`[submit/process]_request`) and is opinionated that `mode=reschedule`. 
   The main reasoning to subclass v.s. enhancing the existing Sensor is I 
wanted to use the same `execute` method but have a class nomenclature that 
indicates that subclasses of this are intended to take action. I think people 
are less likely to understand that a Sensor can take action rather than being a 
read-only poller.
   
   The spirit of this class is to provide an improved way for what we'd 
traditionally do with Operators so I wanted a class with the name Operator. 
Perhaps this is short sighted.
   
   I've refactored to do the submit / process in the execute method of 
BaseSensorOperator 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329366295
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
 
 Review comment:
   removed mode from docstring.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329365217
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   did not see your changes to sensor


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329365206
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   oh did not initially see you added the orchestration within sensor


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329364469
 
 

 ##
 File path: airflow/sensors/base_sensor_operator.py
 ##
 @@ -121,7 +122,17 @@ def execute(self, context: Dict) -> None:
 raise AirflowRescheduleException(reschedule_date)
 else:
 sleep(self.poke_interval)
-self.log.info("Success criteria met. Exiting.")
+
+if isinstance(self, BaseAsyncOperator):
 
 Review comment:
   I was doing this to avoid a class method not implemented error.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329364099
 
 

 ##
 File path: airflow/sensors/base_sensor_operator.py
 ##
 @@ -121,7 +122,17 @@ def execute(self, context: Dict) -> None:
 raise AirflowRescheduleException(reschedule_date)
 else:
 sleep(self.poke_interval)
-self.log.info("Success criteria met. Exiting.")
+
+if isinstance(self, BaseAsyncOperator):
 
 Review comment:
   I wanted to avoid trying to call `self.process_request` on a 
`BaseSensorOperator` and getting some no such class method error (as in my 
current model process_request would only be defined for BaseAsyncOperators).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329361997
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   I think that this operator actually makes more sense as a parent of sensor 
operator, rather than as a subclass of sensor operator.  BaseSensorOperator is 
a BaseAsyncOperator that only checks status, does not submit request, and does 
not process result.
   
   Or maybe you should actually just make these changes directly to 
BaseSensorOperator, making it more generic rather than subclassing it.  I am 
pretty sure you can do this while preserving backward compatibility.  Either 
they have common ancestor, or async is parent.
   
   Either way, I think you need to orchestrate the submit request / process 
result from within the logic of sensor operator's current execute method.  That 
way you only call `submit_request` on first run (which in sensor operator would 
not do anything) and only process result after poke successful (which again 
would not do anything in the sensor operator case).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329360661
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
 
 Review comment:
   @mik-laj why not string / dict / json-serializable string?  if this is using 
xcom, and xcom pickling is going away, is that not a good solution?  Even if we 
were to add a new database column and not use xcom, wouldn't json 
serializability be desired?  (have not really used generics and not sure what 
their value is here... interested in understanding more about what you are 
suggesting)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329361264
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
 
 Review comment:
   Doesn't this need an execute method to tie everything together, so that 
users can just subclass submit_request and process_result?
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329359901
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
 
 Review comment:
   @mik-laj 
   > Executing code in pre_execute is dangerous because you can jam the thread. 
The code executed in the execute method has a timeout.
   
   Curious what you mean by this.  Are `pre_execute` and `post_execute` not 
meant to be overridden?
   
   I used them in a "WatermarkSqlOperator" for incremental loads.   Pre-execute 
would calculate high watermark (max date in source table) and store it as 
attribute.  It would also retrieve last high watermark from xcom.  Post-execute 
would send to xcom new high watermark after `execute` completed successfully.
   
   I did it this way so that execute method could be overridden in subclass.
   
   Just trying to understand what's the danger inherent in pre / post execute.
   
   OH! I think i understand... you are saying that pre_execute does not have a 
timeout -- that's the danger.I see so in my example I should have defined 
`execute` to do the orchestration but had a `do_work` type of method that is 
meant to be overridden, instead of expecting subclasses to override `execute`?  
Like what is done with sensor operator and poke.  Maybe the `execute` timeout 
should apply to pre-exec + exec + post-exec...

-

[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329359901
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
 
 Review comment:
   @mik-laj 
   > Executing code in pre_execute is dangerous because you can jam the thread. 
The code executed in the execute method has a timeout.
   
   Curious what you mean by this.  Are `pre_execute` and `post_execute` not 
meant to be overridden?
   
   I used them in a "WatermarkSqlOperator" for incremental loads.   Pre-execute 
would calculate high watermark (max date in source table) and store it as 
attribute.  It would also retrieve last high watermark from xcom.  Post-execute 
would send to xcom new high watermark after `execute` completed successfully.
   
   I did it this way so that execute method could be overridden in subclass.
   
   Just trying to understand what's the danger inherent in pre / post execute.
   
   OH! I think i understand... you are saying that pre_execute does not have a 
timeout -- that's the danger.I see so in my example I should have defined 
`execute` to do the orchestration but had a `do_work` type of method that is 
meant to be overridden, instead of expecting subclasses to override `execute`?  
Like what is done with sensor operator and poke.


This is an automated mess

[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329359901
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
 
 Review comment:
   @mik-laj 
   > Executing code in pre_execute is dangerous because you can jam the thread. 
The code executed in the execute method has a timeout.
   
   Curious what you mean by this.  Are `pre_execute` and `post_execute` not 
meant to be subclassed?
   
   I used them in a "WatermarkSqlOperator" for incremental loads.   Pre-execute 
would calculate high watermark (max date in source table) and store it as 
attribute.  It would also retrieve last high watermark from xcom.  Post-execute 
would send to xcom new high watermark after `execute` completed successfully.
   
   I did it this way so that execute method could be overridden in subclass.
   
   Just trying to understand what's the danger inherent in pre / post execute.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329359901
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
 
 Review comment:
   > Executing code in pre_execute is dangerous because you can jam the thread. 
The code executed in the execute method has a timeout.
   
   Curious what you mean by this.  Are `pre_execute` and `post_execute` not 
meant to be subclassed?
   
   I used them in a "WatermarkSqlOperator" for incremental loads.   Pre-execute 
would calculate high watermark (max date in source table) and store it as 
attribute.  It would also retrieve last high watermark from xcom.  Post-execute 
would send to xcom new high watermark after `execute` completed successfully.
   
   I did it this way so that execute method could be overridden in subclass.
   
   Just trying to understand what's the danger inherent in pre / post execute.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
dstandish commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329359901
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
 
 Review comment:
   @mik-laj 
   > Executing code in pre_execute is dangerous because you can jam the thread. 
The code executed in the execute method has a timeout.
   
   Curious what you mean by this.  Are `pre_execute` and `post_execute` not 
meant to be overridden?
   
   I used them in a "WatermarkSqlOperator" for incremental loads.   Pre-execute 
would calculate high watermark (max date in source table) and store it as 
attribute.  It would also retrieve last high watermark from xcom.  Post-execute 
would send to xcom new high watermark after `execute` completed successfully.
   
   I did it this way so that execute method could be overridden in subclass.
   
   Just trying to understand what's the danger inherent in pre / post execute.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6213: [AIRFLOW-XXX] Extract operators and hooks to separate page

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6213: [AIRFLOW-XXX] Extract operators and 
hooks to separate page
URL: https://github.com/apache/airflow/pull/6213#issuecomment-536308177
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=h1) 
Report
   > Merging 
[#6213](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.42%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6213/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6213   +/-   ##
   ===
   + Coverage9.59%   80.01%   +70.42% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628147+24771 
   + Misses  31800 7029-24771
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [498 
more](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=footer). 
Last update 
[61190c3...754b690](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #6213: [AIRFLOW-XXX] Extract operators and hooks to separate page

2019-09-29 Thread GitBox
codecov-io commented on issue #6213: [AIRFLOW-XXX] Extract operators and hooks 
to separate page
URL: https://github.com/apache/airflow/pull/6213#issuecomment-536308177
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=h1) 
Report
   > Merging 
[#6213](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.42%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6213/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#6213   +/-   ##
   ===
   + Coverage9.59%   80.01%   +70.42% 
   ===
 Files 610  610   
 Lines   3517635176   
   ===
   + Hits 337628147+24771 
   + Misses  31800 7029-24771
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/macros/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree#diff-YWlyZmxvdy9tYWNyb3MvX19pbml0X18ucHk=)
 | `86.36% <0%> (+18.18%)` | :arrow_up: |
   | ... and [498 
more](https://codecov.io/gh/apache/airflow/pull/6213/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=footer). 
Last update 
[61190c3...754b690](https://codecov.io/gh/apache/airflow/pull/6213?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #6212: [AIRFLOW-5571] Fix bug that kubernetes getting log error will stop pod unexpectedly

2019-09-29 Thread GitBox
codecov-io commented on issue #6212: [AIRFLOW-5571] Fix bug that kubernetes 
getting log error will stop pod unexpectedly
URL: https://github.com/apache/airflow/pull/6212#issuecomment-536304746
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=h1) 
Report
   > Merging 
[#6212](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.41%`.
   > The diff coverage is `57.14%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6212/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #6212   +/-   ##
   ==
   + Coverage9.59% 80%   +70.41% 
   ==
 Files 610 610   
 Lines   35176   35181+5 
   ==
   + Hits 3376   28148+24772 
   + Misses  318007033-24767
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `90.14% <57.14%> (+1.08%)` | :arrow_up: |
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | ... and [499 
more](https://codecov.io/gh/apache/airflow/pull/6212/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=footer). 
Last update 
[61190c3...a7ecd39](https://codecov.io/gh/apache/airflow/pull/6212?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5994: [AIRFLOW-5395] Fix unloading with column names containing a dot in RedshiftToS3Operator

2019-09-29 Thread GitBox
codecov-io commented on issue #5994: [AIRFLOW-5395] Fix unloading with column 
names containing a dot in RedshiftToS3Operator
URL: https://github.com/apache/airflow/pull/5994#issuecomment-536302568
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=h1) 
Report
   > Merging 
[#5994](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/61190c30df17772ec84e9cb8dc4370c9394fd92c?src=pr&el=desc)
 will **increase** coverage by `70.42%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5994/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#5994   +/-   ##
   ===
   + Coverage9.59%   80.02%   +70.42% 
   ===
 Files 610  610   
 Lines   3517635177+1 
   ===
   + Hits 337628151+24775 
   + Misses  31800 7026-24774
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/redshift\_to\_s3\_operator.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcmVkc2hpZnRfdG9fczNfb3BlcmF0b3IucHk=)
 | `95.65% <100%> (+95.65%)` | :arrow_up: |
   | 
[airflow/plugins\_manager.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=)
 | `86.91% <0%> (+0.93%)` | :arrow_up: |
   | 
[airflow/executors/dask\_executor.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvZGFza19leGVjdXRvci5weQ==)
 | `2% <0%> (+2%)` | :arrow_up: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `80% <0%> (+2.5%)` | :arrow_up: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `91.97% <0%> (+2.91%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <0%> (+3.7%)` | :arrow_up: |
   | 
[airflow/utils/log/colored\_log.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY29sb3JlZF9sb2cucHk=)
 | `93.18% <0%> (+13.63%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.32% <0%> (+15.32%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `90.9% <0%> (+15.9%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | ... and [499 
more](https://codecov.io/gh/apache/airflow/pull/5994/diff?src=pr&el=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=footer). 
Last update 
[61190c3...8828559](https://codecov.io/gh/apache/airflow/pull/5994?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

2019-09-29 Thread GitBox
jaketf commented on a change in pull request #6210: [AIRFLOW-5567] [Do not 
Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r329354212
 
 

 ##
 File path: airflow/models/base_async_operator.py
 ##
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off a long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+"""
+AsyncOperators are derived from this class and inherit these attributes.
+
+AsyncOperators must define a `submit_request` to fire a request for a
+long running operation with a method and then executes a `poke` method
+executing at a time interval and succeed when a criteria is met and fail
+if and when they time out. They are effctively an opinionated way use
+combine an Operator and a Sensor in order to kick off a long running
+process without blocking a worker slot while waiting for the long running
+process to complete by leveraging reschedule mode.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between each tries
+:type poke_interval: int
+:param timeout: Time, in seconds before the task times out and fails.
+:type timeout: int
+:type mode: str
+"""
+ui_color = '#9933ff'  # type: str
+valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+def submit_request(self, context) -> string:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation used
+for polling
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+:rtype: str
+"""
+raise AirflowException('Async Operators must define a `submit_request` 
method.')
+
+def process_result(self, context):
+"""
+This method can optionally be overriden to process the result of a 
long running operation.
+Context is the same dictionary used as when rendering jinja templates.
+
+Refer to get_template_context for more context.
+"""
+self.log.info('Got result of {}. Done.'.format(
+  self.get_external_resource_id(context))
+
+def pre_execute(self, context) -> None:
+"""
+Check if we have the XCOM_EXTERNAL_RESOURCE_ID_KEY
+for this task and call submit_request if it is missing.
+"""
+if not self.get_external_resource_id(context):
 
 Review comment:
   If an operator doesn't operate on an identifier what is there to poke? Maybe 
we can have BaseAsyncOperator set a default `'PLACEHOLDER_ID'` ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6213: [AIRFLOW-XXX] Extract operators and hooks to separate page

2019-09-29 Thread GitBox
mik-laj commented on issue #6213: [AIRFLOW-XXX] Extract operators and hooks to 
separate page
URL: https://github.com/apache/airflow/pull/6213#issuecomment-536300921
 
 
   ```
   git diff apache/master:./integration.rst operators-and-hooks-ref.rst
   ```
   ```diff
   diff --git a/docs/integration.rst b/docs/operators-and-hooks-ref.rst
   index e9155bcd8..6c8085895 100644
   --- a/docs/integration.rst
   +++ b/docs/operators-and-hooks-ref.rst
   @@ -15,8 +15,8 @@
specific language governing permissions and limitations
under the License.

   -Integration
   -===
   +Operators and Hooks Reference
   +=

.. contents:: Content
  :local:
   @@ -29,11 +29,8 @@ ASF: Apache Software Foundation

Airflow supports various software created by `Apache Software Foundation 
`__.

   -Operators and Hooks
   -'''
   -
Software operators and hooks
   -
   +

These integrations allow you to perform various operations within software 
developed by Apache Software
Foundation.
   @@ -113,7 +110,7 @@ Foundation.


Transfer operators and hooks
   -
   +

These integrations allow you to copy data from/to software developed by 
Apache Software
Foundation.
   @@ -183,18 +180,8 @@ Azure: Microsoft Azure

Airflow has limited support for `Microsoft Azure 
`__.

   -Logging
   -'''
   -
   -Airflow can be configured to read and write task logs in Azure Blob Storage.
   -See :ref:`write-logs-azure`.
   -
   -
   -Operators and Hooks
   -'''
   -
Service operators and hooks
   -"""
   +'''

These integrations allow you to perform various operations within the 
Microsoft Azure.

   @@ -271,19 +258,10 @@ AWS: Amazon Web Services

Airflow has support for `Amazon Web Services `__.

   -Logging
   -'''
   -
   -Airflow can be configured to read and write task logs in Amazon Simple 
Storage Service (Amazon S3).
   -See :ref:`write-logs-amazon`.
   -
   -Operators and Hooks
   -'''
   -
All hooks are based on :mod:`airflow.contrib.hooks.aws_hook`.

Service operators and hooks
   -"""
   +'''

These integrations allow you to perform various operations within the 
Amazon Web Services.

   @@ -472,20 +450,10 @@ Airflow has extensive support for the `Google Cloud 
Platform `_ of 
the particular example DAGs.

Other operators and hooks
   -"
   +'

.. list-table::
   :header-rows: 1
   @@ -800,11 +768,8 @@ Other operators and hooks
Service integrations


   -Operators and Hooks
   -'''
   -
Service operators and hooks
   -"""
   +'''

These integrations allow you to perform various operations within various 
services.

   @@ -925,7 +890,7 @@ These integrations allow you to perform various 
operations within various servic
 -

Transfer operators and hooks
   -
   +

These integrations allow you to perform various operations within various 
services.

   @@ -957,11 +922,8 @@ These integrations allow you to perform various 
operations within various servic
Software integrations
-

   -Operators and Hooks
   -'''
   -
Software operators and hooks
   -
   +

These integrations allow you to perform various ope

[GitHub] [airflow] mik-laj opened a new pull request #6213: [AIRFLOW-XXX] Extract operators and hooks to separate page

2019-09-29 Thread GitBox
mik-laj opened a new pull request #6213: [AIRFLOW-XXX] Extract operators and 
hooks to separate page
URL: https://github.com/apache/airflow/pull/6213
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5571) Kubernetes operator's bug that get logs will make pod exit unexpectedly

2019-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940406#comment-16940406
 ] 

ASF GitHub Bot commented on AIRFLOW-5571:
-

icyfox-bupt commented on pull request #6212: [AIRFLOW-5571] Fix bug that 
kubernetes getting log error will stop pod unexpectedly
URL: https://github.com/apache/airflow/pull/6212
 
 
   …d run.
   
   - _monitor_pod function didn't catch the error for getting log, then pod
   will stop unexpectedly, add a try-catch here.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-5571/) issues and 
references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5571
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   
   In my project, I manage 200+ jobs in kubernetes, as I work on, I found a 
critical bug for kubernetes operator.
   
   In pod_launcher.py, _monitor_pod function:
   ```
   if get_logs:
   logs = self.read_pod_logs(pod) # Here has a retry.
   for line in logs:  # But exception throw from here!
   self.
   ```
   in above code, logs is a HttpResponse, as it implemented _iter_() function, 
you can use for loop to print the lines. In the other words, here use a http 
long connection to get endless log.
   
   There is only try catch over self.read_pod_logs, however, If the network is 
disconnected or jitter occurs, for loop will throw error.
   
   As I have 200+ job run everyday, I can get 4~5 errors everyday, and each 
error will let monitor think the pod is down, and then mark the task as failed, 
then retry it. This eventually lead to data error.
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason: Add a try-catch here and test is not easy to add.
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kubernetes operator's bug that get logs will make pod exit unexpectedly
> ---
>
> Key: AIRFLOW-5571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.0, 1.10.1, 1.10.2, 1.10.3
>Reporter: Liu Xuesi
>Priority: Critical
>  Labels: kubernetes, operator
> Attachments: k8s_error_log.png
>
>
> In my project, I manage 200+ jobs in kubernetes, as I work on, I found a 
> critical bug for kubernetes operator.
> In pod_launcher.py, *_monitor_pod* function:
> {code:python}
> if get_logs:
> logs = self.read_pod_logs(pod) # Here has a retry.
> for line in logs:  # But exception throw from here!
> self.
> {code}
> in above code, *logs* is a HttpRe

[GitHub] [airflow] feluelle commented on issue #5994: [AIRFLOW-5395] Fix unloading with column names containing a dot in RedshiftToS3Operator

2019-09-29 Thread GitBox
feluelle commented on issue #5994: [AIRFLOW-5395] Fix unloading with column 
names containing a dot in RedshiftToS3Operator
URL: https://github.com/apache/airflow/pull/5994#issuecomment-536299589
 
 
   @OmerJog that's true, but if the other one gets merged this change is 
superfluous.
   
   I resolved conflicts. Can you review it since you were also interested in 
the other one ;)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] icyfox-bupt opened a new pull request #6212: [AIRFLOW-5571] Fix bug that kubernetes getting log error will stop pod unexpectedly

2019-09-29 Thread GitBox
icyfox-bupt opened a new pull request #6212: [AIRFLOW-5571] Fix bug that 
kubernetes getting log error will stop pod unexpectedly
URL: https://github.com/apache/airflow/pull/6212
 
 
   …d run.
   
   - _monitor_pod function didn't catch the error for getting log, then pod
   will stop unexpectedly, add a try-catch here.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-5571/) issues and 
references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5571
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   
   In my project, I manage 200+ jobs in kubernetes, as I work on, I found a 
critical bug for kubernetes operator.
   
   In pod_launcher.py, _monitor_pod function:
   ```
   if get_logs:
   logs = self.read_pod_logs(pod) # Here has a retry.
   for line in logs:  # But exception throw from here!
   self.
   ```
   in above code, logs is a HttpResponse, as it implemented _iter_() function, 
you can use for loop to print the lines. In the other words, here use a http 
long connection to get endless log.
   
   There is only try catch over self.read_pod_logs, however, If the network is 
disconnected or jitter occurs, for loop will throw error.
   
   As I have 200+ job run everyday, I can get 4~5 errors everyday, and each 
error will let monitor think the pod is down, and then mark the task as failed, 
then retry it. This eventually lead to data error.
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason: Add a try-catch here and test is not easy to add.
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5571) Kubernetes operator's bug that get logs will make pod exit unexpectedly

2019-09-29 Thread Liu Xuesi (Jira)
Liu Xuesi created AIRFLOW-5571:
--

 Summary: Kubernetes operator's bug that get logs will make pod 
exit unexpectedly
 Key: AIRFLOW-5571
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5571
 Project: Apache Airflow
  Issue Type: Bug
  Components: operators
Affects Versions: 1.10.3, 1.10.2, 1.10.1, 1.10.0
Reporter: Liu Xuesi
 Attachments: k8s_error_log.png

In my project, I manage 200+ jobs in kubernetes, as I work on, I found a 
critical bug for kubernetes operator.

In pod_launcher.py, *_monitor_pod* function:
{code:python}
if get_logs:
logs = self.read_pod_logs(pod) # Here has a retry.
for line in logs:  # But exception throw from here!
self.
{code}
in above code, *logs* is a HttpResponse, as it implemented __iter__() function, 
you can use for loop to print the lines. In the other words, here use a http 
long connection to get endless log.

There is only try catch over *self.read_pod_logs*, however, If the network is 
disconnected or jitter occurs, for loop will throw error.

As I have 200+ job run everyday, I can get 4~5 errors everyday, and each error 
will let monitor think the pod is down, and then mark the task as failed, then 
retry it. This eventually lead to data error.

 

Below is a typical error log:
{code:java}
 [2019-09-17 20:50:02,532] {logging_mixin.py:95} INFO - [2019-09-17 
20:50:02,532] {pod_launcher.py:105} INFO - b'19/09/17 20:50:00 INFO 
yarn.Client: Application report for application_1565926539066_3866 (state: 
RUNNING)\n'
[2019-09-17 20:50:02,532] {logging_mixin.py:95} INFO - [2019-09-17 
20:50:02,532] {pod_launcher.py:105} INFO - b'19/09/17 20:50:01 INFO 
yarn.Client: Application report for application_1565926539066_3866 (state: 
RUNNING)\n'
[2019-09-17 20:50:02,533] {taskinstance.py:1047} ERROR - ('Connection broken: 
IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 639, 
in _update_chunk_length
self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 397, 
in _error_catcher
yield
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 704, 
in read_chunked
self._update_chunk_length()
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 643, 
in _update_chunk_length
raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/airflow/models/taskinstance.py", 
line 922, in _run_raw_task
result = task_copy.execute(context=context)
  File 
"/usr/local/lib/python3.6/dist-packages/airflow/contrib/operators/k8s_pod_operator.py",
 line 45, in execute
super().execute(context)
  File 
"/usr/local/lib/python3.6/dist-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
 line 148, in execute
get_logs=self.get_logs)
  File 
"/usr/local/lib/python3.6/dist-packages/airflow/contrib/kubernetes/pod_launcher.py",
 line 97, in run_pod
return self._monitor_pod(pod, get_logs)
  File 
"/usr/local/lib/python3.6/dist-packages/airflow/contrib/kubernetes/pod_launcher.py",
 line 104, in _monitor_pod
for line in logs:
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 747, 
in __iter__
for chunk in self.stream(decode_content=True):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 527, 
in stream
for line in self.read_chunked(amt, decode_content=decode_content):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 732, 
in read_chunked
self._original_response.close()
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 415, 
in _error_catcher
raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes 
read)', IncompleteRead(0 bytes read))
[2019-09-17 20:50:02,536] {taskinstance.py:1070} INFO - Marking task as 
UP_FOR_RETRY
[2019-09-17 20:50:02,717] {logging_mixin.py:95} INFO - [2019-09-17 
20:50:02,717] {email.py:126} INFO - Sent an alert email to 
['data_al...@sensetime.com', 'zhao...@sensetime.com']
[2019-09-17 20:50:03,921] {base_task_runner.py:115} INFO - Job 16307: Subtask 
ods_senseid_log_ingestion [2019-09-17 20:50:03,921] {cli_action_loggers.py:82} 
DEBUG - Calling callbacks: []
[2019-09-17 20:50:03,925] {base_task_runner.py:115} INFO - Job 16307: Subtask 

[GitHub] [airflow] codecov-io edited a comment on issue #6209: [AIRFLOW-XXX] Add service transfer operators

2019-09-29 Thread GitBox
codecov-io edited a comment on issue #6209: [AIRFLOW-XXX] Add service transfer 
operators
URL: https://github.com/apache/airflow/pull/6209#issuecomment-536229920
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=h1) 
Report
   > Merging 
[#6209](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/bfe0ace2808a2a8934fbf578927bde81fc33420e?src=pr&el=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6209/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#6209   +/-   ##
   ===
 Coverage   80.02%   80.02%   
   ===
 Files 610  610   
 Lines   3517635176   
   ===
 Hits2815028150   
 Misses   7026 7026
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=footer). 
Last update 
[bfe0ace...d740a5b](https://codecov.io/gh/apache/airflow/pull/6209?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6208: [AIRFLOW-XXX] Add protocol transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6208: [AIRFLOW-XXX] Add protocol transfer operators
URL: https://github.com/apache/airflow/pull/6208
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6209: [AIRFLOW-XXX] Add service transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6209: [AIRFLOW-XXX] Add service transfer operators
URL: https://github.com/apache/airflow/pull/6209
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on issue #6209: [AIRFLOW-XXX] Add service transfer operators

2019-09-29 Thread GitBox
feluelle commented on issue #6209: [AIRFLOW-XXX] Add service transfer operators
URL: https://github.com/apache/airflow/pull/6209#issuecomment-536295077
 
 
   LGTM 👍 - Nice job @mik-laj 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6209: [AIRFLOW-XXX] Add service transfer operators

2019-09-29 Thread GitBox
feluelle commented on a change in pull request #6209: [AIRFLOW-XXX] Add service 
transfer operators
URL: https://github.com/apache/airflow/pull/6209#discussion_r329351245
 
 

 ##
 File path: docs/integration.rst
 ##
 @@ -988,6 +988,86 @@ These integrations allow you to perform various 
operations using various softwar
  - :mod:`airflow.operators.sqlite_operator`
  -
 
+
+Transfer operators and hooks
+
+
+These integrations allow you to copy data.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Source
+ - Destination
+ - Guide
+ - Operators
+
+   * - `Oracle `__
+ - `Azure Data Lake Storage 
`__
+ -
+ - :mod:`airflow.contrib.operators.oracle_to_azure_data_lake_transfer`
+
+   * - `Oracle `__
+ - `Oracle `__
+ -
+ - :mod:`airflow.contrib.operators.oracle_to_oracle_transfer`
+
+   * - `BigQuery `__
+ - `MySQL `__
+ -
+ - :mod:`airflow.operators.bigquery_to_mysql`
+
+   * - `Microsoft SQL Server (MSSQL) 
`__
+ - `Google Cloud Storage (GCS) `__
+ -
+ - :mod:`airflow.operators.mssql_to_gcs`
+
+   * - `Microsoft SQL Server (MSSQL) 
`__
+ - `Apache Hive `__
+ -
+ - :mod:`airflow.operators.mssql_to_hive`
+
+   * - `MySQL `__
+ - `Apache Hive `__
+ -
+ - :mod:`airflow.operators.mysql_to_hive`
+
+   * - `MySQL `__
+ - `Google Cloud Storage (GCS) `__
+ -
+ - :mod:`airflow.operators.mysql_to_gcs`
+
+   * - `PostgresSQL `__
+ - `Google Cloud Storage (GCS) `__
+ -
+ - :mod:`airflow.operators.postgres_to_gcs`
+
+   * - SQL
+ - `Cloud Storage (GCS) `__
+ -
+ - :mod:`airflow.operators.sql_to_gcs`
+
+   * - `Vertica `__
+ - `Apache Hive `__
+ -
+ - :mod:`airflow.contrib.operators.vertica_to_hive`
+
+   * - `Vertica `__
+ - `MySQL `__
+ -
+ - :mod:`airflow.contrib.operators.vertica_to_mysql`
+
+   * - `Presto `__
 
 Review comment:
   ```suggestion
  * - `Presto `__
   ```
   works, too. There is no automatic forwarding from HTTP to HTTPS when you 
access `prestodb.github.io`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6207: [AIRFLOW-XXX] Add software transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6207: [AIRFLOW-XXX] Add software transfer operators
URL: https://github.com/apache/airflow/pull/6207
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6206: [AIRFLOW-XXX] Add more GCP transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6206: [AIRFLOW-XXX] Add more GCP transfer operators
URL: https://github.com/apache/airflow/pull/6206
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6205: [AIRFLOW-XXX] Add more AWS transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6205: [AIRFLOW-XXX] Add more AWS transfer operators
URL: https://github.com/apache/airflow/pull/6205
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6203: [AIRFLOW-XXX] Add more ASF transfer operators

2019-09-29 Thread GitBox
mik-laj merged pull request #6203: [AIRFLOW-XXX] Add more ASF transfer operators
URL: https://github.com/apache/airflow/pull/6203
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6202: [AIRFLOW-XXX] Improve ASF operators table

2019-09-29 Thread GitBox
mik-laj merged pull request #6202: [AIRFLOW-XXX] Improve ASF operators table
URL: https://github.com/apache/airflow/pull/6202
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj merged pull request #6204: [AIRFLOW-XXX] Use new import path in GCP table

2019-09-29 Thread GitBox
mik-laj merged pull request #6204: [AIRFLOW-XXX] Use new import path in GCP 
table
URL: https://github.com/apache/airflow/pull/6204
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5570) Allow to store DAGs on GIT

2019-09-29 Thread Adam Trump (Jira)
Adam Trump created AIRFLOW-5570:
---

 Summary: Allow to store DAGs on GIT
 Key: AIRFLOW-5570
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5570
 Project: Apache Airflow
  Issue Type: New Feature
  Components: core, DAG, scheduler
Affects Versions: 1.10.5
Reporter: Adam Trump


Today it is possible to store the DAGs only on a specific folder, which is 
configurable.

I'd like to suggest to allow storing the DAGs on GIT.

This feature comes from the fact that Airflow already have this functionallity 
- under the KubernetesExecutor.

Thus, it shouldn't take much time moving it out of that executor, to the 
Airflow scheduler.

When you run Airflow as a cluster on different machines, since all of the 
components (webserver, scheduler and worker) needs access to the DAGs file, 
there is always a need of accessable mount.

It can be very helpful if we could store the DAGs on GIT, and configure all 
components to pull from that GIT repo.

Just as today, the scheduler should pull every X seconds from that repository.

When you want to add new DAGs, you simply need to push it to that repo.

It will also allow us to have some version control over DAGs, which could be 
useful for keeping cleaner code and order.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)