[GitHub] [airflow] coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312312220
 
 

 ##
 File path: airflow/www/utils.py
 ##
 @@ -374,6 +375,38 @@ def get_chart_height(dag):
 return 600 + len(dag.tasks) * 10
 
 
+def get_python_source(x: Any) -> str:
 
 Review comment:
   Yes. Already changed!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312312018
 
 

 ##
 File path: airflow/models/dagbag.py
 ##
 @@ -416,6 +427,19 @@ def collect_dags(
  format(dag_names),
  file_stat.duration)
 
+    def collect_dags_from_db(self):
+        """Collects DAGs from database."""
+        start_dttm = timezone.utcnow()
+        # DAG post-processing steps such as self.bag_dag and croniter are not needed as
+        # they are done by scheduler before serialization.
+        # The dagbag contains all rows in serialized_dag table. Deleted DAGs are deleted
+        # from the table by the scheduler job.
+        self.log.info("Filling up the DagBag from database")
+        self.dags = SerializedDagModel.read_all_dags()
+        Stats.gauge(
 
 Review comment:
   > Can we use `Stats.timing` here? And you can just pass `timedelta` in.
   
   Good point! Done.
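
   For readers following the thread, here is a minimal sketch of the idea in the review comment: report the elapsed load time through a timing metric and pass the `timedelta` directly. `FakeStats`, `timed_load`, and the metric name `collect_db_dags` are hypothetical stand-ins for illustration, not the PR's code or Airflow's `Stats` API.

```python
from datetime import datetime, timedelta, timezone


class FakeStats:
    """Hypothetical stand-in for a statsd-style client; it only records calls."""

    def __init__(self):
        self.calls = []

    def gauge(self, stat, value):
        self.calls.append(("gauge", stat, value))

    def timing(self, stat, delta):
        # Accept either a number of milliseconds or a timedelta.
        if isinstance(delta, timedelta):
            delta = delta.total_seconds() * 1000.0
        self.calls.append(("timing", stat, delta))


def timed_load(stats, load):
    """Run load() and report its duration as a timing metric."""
    start = datetime.now(timezone.utc)
    result = load()
    # The timedelta is passed as-is; the client converts it to milliseconds.
    stats.timing("collect_db_dags", datetime.now(timezone.utc) - start)
    return result


stats = FakeStats()
dags = timed_load(stats, lambda: {"example_dag": object()})
```

   A timing metric lets the metrics backend aggregate durations (percentiles, means), which a gauge of the last observed value does not.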




[GitHub] [airflow] ryanyuan commented on issue #5751: [AIRFLOW-5136] Fix Bug with Incorrect template_fields in DataProc{*} …

2019-08-08 Thread GitBox
ryanyuan commented on issue #5751: [AIRFLOW-5136] Fix Bug with Incorrect 
template_fields in DataProc{*} …
URL: https://github.com/apache/airflow/pull/5751#issuecomment-519734779
 
 
   @potiuk That makes sense. Thanks for the explanation!




[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312289887
 
 

 ##
 File path: airflow/www/utils.py
 ##
 @@ -374,6 +375,38 @@ def get_chart_height(dag):
 return 600 + len(dag.tasks) * 10
 
 
+def get_python_source(x: Any) -> str:
 
 Review comment:
   ```suggestion
   def get_python_source(x: Any) -> Optional[str]:
   ```
   
   if  :
   
   ```
   if x is None:
       return None
   ```
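
   Put together, the suggestion could look like the sketch below. The `None` check and the `Optional[str]` return type come from the comment above; the use of `inspect.getsource` and the fallback message are assumptions made for illustration, not necessarily what the PR does.

```python
import inspect
from typing import Any, Optional


def get_python_source(x: Any) -> Optional[str]:
    """Return the source code of x, or None when x is None."""
    if x is None:
        return None
    try:
        return inspect.getsource(x)
    except (TypeError, OSError):
        # Assumed fallback: source is unavailable for builtins and for
        # objects whose defining file cannot be read.
        return "No source code available for {}".format(type(x))
```

   Returning `None` instead of a string for a missing object is what makes the `Optional[str]` annotation necessary.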




[GitHub] [airflow] kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519723159
 
 
   Just rebased




[GitHub] [airflow] bingqinzhou opened a new pull request #5764: Change default value of autodetect to be True.

2019-08-08 Thread GitBox
bingqinzhou opened a new pull request #5764: Change default value of autodetect 
to be True.
URL: https://github.com/apache/airflow/pull/5764
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5152
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Change the default value of autodetect from False to True to avoid breaking 
downstream services that use GoogleCloudStorageToBigQueryOperator but are not 
aware of the new autodetect field.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[jira] [Created] (AIRFLOW-5152) Fix autodetect default value

2019-08-08 Thread Bingqin Zhou (JIRA)
Bingqin Zhou created AIRFLOW-5152:
-

 Summary: Fix autodetect default value
 Key: AIRFLOW-5152
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5152
 Project: Apache Airflow
  Issue Type: Bug
  Components: operators
Affects Versions: 1.10.4
Reporter: Bingqin Zhou
Assignee: Bingqin Zhou


[https://github.com/apache/airflow/pull/3880/commits/eb4cf61ae7583f5a306aa0cd7faa4d01aef61c33#diff-ee06f8fcbc476ea65446a30160c2a2b2]

The PR above introduces a new field `autodetect` and sets the default value to 
be false. As a result, `schema_fields`/`schema_objects` is always required if 
`autodetect` is not set. This breaks downstream services which use 
`GoogleCloudStorageToBigQueryOperator` without being aware of the added 
`autodetect` field.
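
A minimal sketch of the behaviour described above, assuming the check works roughly like this. `check_schema_args` and its parameter names are hypothetical and only mirror the fields named in the description; this is not the operator's actual code.

```python
def check_schema_args(schema_fields=None, schema_object=None, autodetect=False):
    """Hypothetical check: a schema source is required unless autodetect is on."""
    if not autodetect and schema_fields is None and schema_object is None:
        raise ValueError(
            "schema_fields or schema_object is required when autodetect is False"
        )
    return True


# A caller written before the autodetect field existed passes no schema
# arguments at all, so with autodetect=False as the default it now fails.
try:
    check_schema_args()
    legacy_call_ok = True
except ValueError:
    legacy_call_ok = False
```

With `autodetect=True` as the default, the same legacy call passes the check, which is the motivation for the fix.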



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903382#comment-16903382
 ] 

ASF subversion and git services commented on AIRFLOW-5148:
--

Commit bfe18bfe4e7095d779b81bb9ce978d6bc6123e6b in airflow-site's branch 
refs/heads/asf-site from kaxil
[ https://gitbox.apache.org/repos/asf?p=airflow-site.git;h=bfe18bf ]

Revert "[AIRFLOW-5148] Add Google Analytics to the Airflow doc website (#3)"

This reverts commit 43218b46da1fc60293db01349623ead28a32.


> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, to understand the changes in user behavior (I am signing a contract with 
> a vendor next week)
> {noformat}





[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519705074
 
 
   Points taken - We will talk with the Apache Legal team and consult. I have locked the 
conversation so that not everyone is spammed with emails. Sorry, I was writing @ 
private, which actually ended up tagging the entire list, as shown in the image below
   
   
![image](https://user-images.githubusercontent.com/8811558/62740979-34973780-ba31-11e9-8e6b-11367beb8c65.png)
   
   Commit reverted 
(https://github.com/apache/airflow/commit/974ef9cca3ccde2b61f13716e45ca64d6c63751c)
 till we get a reply from Apache Legal team - 
https://issues.apache.org/jira/browse/LEGAL-470




[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903380#comment-16903380
 ] 

ASF subversion and git services commented on AIRFLOW-5148:
--

Commit 974ef9cca3ccde2b61f13716e45ca64d6c63751c in airflow's branch 
refs/heads/master from Kaxil Naik
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=974ef9c ]

Revert "[AIRFLOW-5148] Add Google Analytics to the Airflow doc website (#5763)"

This reverts commit 502ed749fee8c1c49e4a8f9180671e32b76a2dbb.


> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, to understand the changes in user behavior (I am signing a contract with 
> a vendor next week)
> {noformat}





[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519705074
 
 
   Points taken - We will talk with the Apache Legal team and consult. I have locked the 
conversation so that not everyone is spammed with emails. Sorry, I was writing @ 
private, which actually ended up tagging the entire list, as shown in the image below
   
   
![image](https://user-images.githubusercontent.com/8811558/62740979-34973780-ba31-11e9-8e6b-11367beb8c65.png)
   
   Commit reverted till we get a reply from the Apache Infra team - 
https://issues.apache.org/jira/browse/LEGAL-470




[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519705074
 
 
   Points taken - We will talk with the Apache Legal team and consult. I have locked the 
conversation so that not everyone is spammed with emails. Sorry, I was writing @ 
private, which actually ended up tagging the entire list, as shown in the image below
   
   
![image](https://user-images.githubusercontent.com/8811558/62740979-34973780-ba31-11e9-8e6b-11367beb8c65.png)
   
   Commit reverted 
(https://github.com/apache/airflow/commit/974ef9cca3ccde2b61f13716e45ca64d6c63751c)
 till we get a reply from Apache Infra team - 
https://issues.apache.org/jira/browse/LEGAL-470




[GitHub] [airflow] kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519705074
 
 
   Points taken - We will talk with the Apache Legal team and consult. I have locked the 
conversation so that not everyone is spammed with emails. Sorry, I was writing @ 
private, which actually ended up tagging the entire list, as shown in the image below
   
   
![image](https://user-images.githubusercontent.com/8811558/62740979-34973780-ba31-11e9-8e6b-11367beb8c65.png)
   




[GitHub] [airflow] JasonGiedymin commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
JasonGiedymin commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519704051
 
 
   Good point on GDPR. Most likely should yank it until legal guidance is 
provided.




[GitHub] [airflow] jthomerson commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
jthomerson commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519703477
 
 
   Please do not tag apache committers on issues. It now means that we are all 
subscribed to every comment on this issue.
   
   I was able to unsubscribe myself, but I'm not sure how to unsubscribe the 
entire committers "team". Maybe someone else does. 




[GitHub] [airflow] michael-o commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
michael-o commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519701300
 
 
   This likely violates GDPR. You have to have the user's consent to collect data 
-- which you don't. I have already raised this with LEGAL.




[GitHub] [airflow] JasonGiedymin commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
JasonGiedymin commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519699402
 
 
   Apache projects that embed GA must have a published privacy policy.
   
   Believe access should be granted to at least PMCs.




[jira] [Commented] (AIRFLOW-5150) Implement POC of GitLab CI + GKE integration

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903349#comment-16903349
 ] 

ASF subversion and git services commented on AIRFLOW-5150:
--

Commit e5fa390d62014f54e8462ef7c48b5d0dbb7e52e2 in airflow's branch 
refs/heads/test-gitlab-ci from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=e5fa390 ]

[AIRFLOW-5150] Implement POC for GitLab + Kubernetes tests


> Implement POC of GitLab CI + GKE integration
> 
>
> Key: AIRFLOW-5150
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5150
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[GitHub] [airflow] KevinYang21 commented on issue #5740: [AIRFLOW-5087] Display task/dag run stats on UI so that users can debug more easily

2019-08-08 Thread GitBox
KevinYang21 commented on issue #5740: [AIRFLOW-5087] Display task/dag run stats 
on UI so that users can debug more easily
URL: https://github.com/apache/airflow/pull/5740#issuecomment-519694417
 
 
   Discussed offline: the suggestion is to limit the scope to DAG runs only and to 
build a dep framework, similar to the task instance deps, so that it scales better.




[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903348#comment-16903348
 ] 

ASF subversion and git services commented on AIRFLOW-5143:
--

Commit 08c7898c81041560af3284b997946fa6aace9fd1 in airflow's branch 
refs/heads/v1-10-test from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=08c7898 ]

[AIRFLOW-5143] Caching works for Checklicence images (#5762)


(cherry picked from commit a4e3295e19cc13f31aedccb9d4f1a264658ec396)


> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[GitHub] [airflow] potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519693191
 
 
   @coufon should be fixed now finally :) 




[GitHub] [airflow] potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519693259
 
 
   Just rebase to latest master




[GitHub] [airflow] dimberman commented on a change in pull request #5760: [AIRFLOW-5139] Allow custom ES configs

2019-08-08 Thread GitBox
dimberman commented on a change in pull request #5760: [AIRFLOW-5139] Allow 
custom ES configs
URL: https://github.com/apache/airflow/pull/5760#discussion_r312249877
 
 

 ##
 File path: airflow/config_templates/default_airflow.cfg
 ##
 @@ -591,6 +591,11 @@ json_format = False
 # Log fields to also attach to the json output, if enabled
 json_fields = asctime, filename, lineno, levelname, message
 
+[elasticsearch_configs]
+
+use_ssl = False
+verify_certs = False
 
 Review comment:
   @schnie good catch, thought they both defaulted to false




[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903347#comment-16903347
 ] 

ASF subversion and git services commented on AIRFLOW-5143:
--

Commit a4e3295e19cc13f31aedccb9d4f1a264658ec396 in airflow's branch 
refs/heads/master from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=a4e3295 ]

[AIRFLOW-5143] Caching works for Checklicence images (#5762)



> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903346#comment-16903346
 ] 

ASF GitHub Bot commented on AIRFLOW-5143:
-

potiuk commented on pull request #5762: [AIRFLOW-5143] Caching works for 
Checklicence images
URL: https://github.com/apache/airflow/pull/5762
 
 
   
 



> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[GitHub] [airflow] potiuk merged pull request #5762: [AIRFLOW-5143] Caching works for Checklicence images

2019-08-08 Thread GitBox
potiuk merged pull request #5762: [AIRFLOW-5143] Caching works for Checklicence 
images
URL: https://github.com/apache/airflow/pull/5762
 
 
   




[GitHub] [airflow] schnie commented on a change in pull request #5760: [AIRFLOW-5139] Allow custom ES configs

2019-08-08 Thread GitBox
schnie commented on a change in pull request #5760: [AIRFLOW-5139] Allow custom 
ES configs
URL: https://github.com/apache/airflow/pull/5760#discussion_r312243997
 
 

 ##
 File path: airflow/config_templates/default_airflow.cfg
 ##
 @@ -591,6 +591,11 @@ json_format = False
 # Log fields to also attach to the json output, if enabled
 json_fields = asctime, filename, lineno, levelname, message
 
+[elasticsearch_configs]
+
+use_ssl = False
+verify_certs = False
 
 Review comment:
   @dimberman wondering if we should mirror the defaults that the es client 
uses here so we don't change any behavior. `verify_certs` is `True` by default 
here: 
https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/connection/http_requests.py#L47-L48
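
   One way to read this suggestion is sketched below: start from the client's own defaults and let the `[elasticsearch_configs]` section override only what the user sets. `CLIENT_DEFAULTS` and `es_kwargs` are hypothetical names; `verify_certs = True` is taken from the comment above, while `use_ssl = False` as the client default is an assumption.

```python
from configparser import ConfigParser

# Assumed client-side defaults; only verify_certs=True is stated in the comment.
CLIENT_DEFAULTS = {"use_ssl": False, "verify_certs": True}

CFG = """
[elasticsearch_configs]
use_ssl = False
"""


def es_kwargs(cfg_text):
    """Merge the [elasticsearch_configs] section over the client defaults."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    kwargs = dict(CLIENT_DEFAULTS)
    if parser.has_section("elasticsearch_configs"):
        for key in parser.options("elasticsearch_configs"):
            kwargs[key] = parser.getboolean("elasticsearch_configs", key)
    return kwargs
```

   With this layout, a user who never sets `verify_certs` keeps certificate verification enabled, so the new config section does not silently change behaviour.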




[jira] [Updated] (AIRFLOW-5151) Simple boolean variable for a DAGRun to ignore failures for a certain task

2019-08-08 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated AIRFLOW-5151:
--
Description: 
h3. Trigger Rules documentation in airflow is a bit light

4 scenarios are not covered with recommendations of how to achieve them:

scenario 1: Someone wants a DAG with a single task but this task is a 'nice to 
have' so any failures from the task should still count towards a dagrun of 
'success'

scenario 2:  Someone wants a DAG with 3 tasks but the 1st task is a 'nice to 
have' so any failures from the task should still count towards a dagrun of 
'success' AND the 2nd/3rd tasks should still run as normal (if 2nd/3rd tasks 
fail then dagrun should fail)

scenario 3:  Someone wants a DAG with 3 tasks but the 2nd task is a 'nice to 
have' so any failures from the task should still count towards a dagrun of 
'success' AND the 1st/3rd tasks should still run  (if 1st/3rd tasks fail then 
dagrun should fail)

scenario 4:  Someone wants a DAG with 3 tasks but the 3rd task is a 'nice to 
have' so any failures from the task should still count towards a dagrun of 
'success' AND the 1st/2nd tasks should still run  (if 1st/2nd tasks fail then 
dagrun should fail)

 

notes:

a) callback_triggers are too complex for a not uncommon use case

b) with Python/Bash operators you can simply not check return codes, but other 
custom operators always give success/fail. Is there some way to make them 
always give success (while, unlike a DummyOperator, still actually performing 
their task, e.g. SFTPOperator, etc.)?

 

  was:h3. Trigger Rules documentation in airflow is a bit light


> Simple boolean variable for a DAGRun to ignore failures for a certain task
> --
>
> Key: AIRFLOW-5151
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5151
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: DAG, DagRun, scheduler
>Affects Versions: 1.10.4
>Reporter: t oo
>Priority: Minor
> Fix For: 2.0.0
>
>
> h3. Trigger Rules documentation in airflow is a bit light
> 4 scenarios are not covered with recommendations of how to achieve them:
> scenario 1: Someone wants a DAG with a single task but this task is a 'nice 
> to have' so any failures from the task should still count towards a dagrun of 
> 'success'
> scenario 2:  Someone wants a DAG with 3 tasks but the 1st task is a 'nice to 
> have' so any failures from the task should still count towards a dagrun of 
> 'success' AND the 2nd/3rd tasks should still run as normal (if 2nd/3rd tasks 
> fail then dagrun should fail)
> scenario 3:  Someone wants a DAG with 3 tasks but the 2nd task is a 'nice to 
> have' so any failures from the task should still count towards a dagrun of 
> 'success' AND the 1st/3rd tasks should still run  (if 1st/3rd tasks fail then 
> dagrun should fail)
> scenario 4:  Someone wants a DAG with 3 tasks but the 3rd task is a 'nice to 
> have' so any failures from the task should still count towards a dagrun of 
> 'success' AND the 1st/2nd tasks should still run  (if 1st/2nd tasks fail then 
> dagrun should fail)
>  
> notes:
> a) callback_triggers are too complex for a not uncommon use case
> b) with Python/Bash operators you can simply not check return codes, but other 
> custom operators always give success/fail. Is there some way to make them 
> always give success (while, unlike a DummyOperator, still actually performing 
> their task, e.g. SFTPOperator, etc.)?
>  





[jira] [Updated] (AIRFLOW-5151) Simple boolean variable for a DAGRun to ignore failures for a certain task

2019-08-08 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated AIRFLOW-5151:
--
Description: h3. Trigger Rules documentation in airflow is a bit light  
(was: Some Airflow users have no use for managing SLAs within Airflow. I believe 
the scheduling process should be as fast as possible and not do unnecessary 
logging; the current IF statement is slower than a boolean flag and produces a 
redundant log.

h1. *EXISTING BEHAVIOR*

    if not any([isinstance(ti.sla, timedelta) for ti in dag.tasks]):
        self.log.info("Skipping SLA check for %s because no tasks in DAG have SLAs", dag)
        return

h1. *FIX*

[https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py]

within

    def _process_dags(self, dagbag, dags, tis_out):

line 1221

*BEFORE*

    self._process_task_instances(dag, tis_out)
    self.manage_slas(dag)

*AFTER*

1.

    self._process_task_instances(dag, tis_out)

    if conf.getboolean('scheduler', 'CHECK_SLA'):
        self.manage_slas(dag)

2. The config then has a new option, check_sla, defaulting to true, so existing 
users are unaffected, while other users can set it to false.
)

> Simple boolean variable for a DAGRun to ignore failures for a certain task
> --
>
> Key: AIRFLOW-5151
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5151
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: DAG, DagRun, scheduler
>Affects Versions: 1.10.4
>Reporter: t oo
>Priority: Minor
> Fix For: 2.0.0
>
>
> h3. Trigger Rules documentation in airflow is a bit light





[jira] [Created] (AIRFLOW-5151) Simple boolean variable for a DAGRun to ignore failures for a certain task

2019-08-08 Thread t oo (JIRA)
t oo created AIRFLOW-5151:
-

 Summary: Simple boolean variable for a DAGRun to ignore failures 
for a certain task
 Key: AIRFLOW-5151
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5151
 Project: Apache Airflow
  Issue Type: Improvement
  Components: DAG, DagRun, scheduler
Affects Versions: 1.10.4
Reporter: t oo
 Fix For: 2.0.0


Some Airflow users have no use for managing SLAs within Airflow. I believe the 
scheduling process should be as fast as possible and not do unnecessary 
logging; the current IF statement is slower than a boolean flag and produces a 
redundant log.

h1. *EXISTING BEHAVIOR*

    if not any([isinstance(ti.sla, timedelta) for ti in dag.tasks]):
        self.log.info("Skipping SLA check for %s because no tasks in DAG have SLAs", dag)
        return

h1. *FIX*

[https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py]

within

    def _process_dags(self, dagbag, dags, tis_out):

line 1221

*BEFORE*

    self._process_task_instances(dag, tis_out)
    self.manage_slas(dag)

*AFTER*

1.

    self._process_task_instances(dag, tis_out)

    if conf.getboolean('scheduler', 'CHECK_SLA'):
        self.manage_slas(dag)

2. The config then has a new option, check_sla, defaulting to true, so existing 
users are unaffected, while other users can set it to false.

 

 

 





[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519680917
 
 
   @aijamalnk Worth sending an email on mailing list  (@ private one) to ask 
and add them/all to Google Analytics


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519680917
 
 
   @aijamalnk Worth sending an email on mailing list  (@private one) to ask and 
add them/all to Google Analytics




[GitHub] [airflow] kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519680917
 
 
   @aijamalnk Worth sending an email on mailing list (@apache/apache-committers 
) - (@private one) to ask and add them/all to Google Analytics




[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903315#comment-16903315
 ] 

ASF subversion and git services commented on AIRFLOW-5148:
--

Commit 502ed749fee8c1c49e4a8f9180671e32b76a2dbb in airflow's branch 
refs/heads/master from Kaxil Naik
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=502ed74 ]

[AIRFLOW-5148] Add Google Analytics to the Airflow doc website (#5763)



> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, to understand the changes in user behavior (I am signing a contract 
> with a vendor next week)
> {noformat}





[jira] [Work logged] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?focusedWorklogId=291571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291571
 ]

ASF GitHub Bot logged work on AIRFLOW-5148:
---

Author: ASF GitHub Bot
Created on: 08/Aug/19 20:42
Start Date: 08/Aug/19 20:42
Worklog Time Spent: 10m 
  Work Description: kaxil commented on pull request #3: [AIRFLOW-5148] Add 
Google Analytics to the Airflow doc website
URL: https://github.com/apache/airflow-site/pull/3
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 291571)
Time Spent: 20m  (was: 10m)






[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903314#comment-16903314
 ] 

ASF GitHub Bot commented on AIRFLOW-5148:
-

kaxil commented on pull request #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763
 
 
   
 








[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903312#comment-16903312
 ] 

ASF subversion and git services commented on AIRFLOW-5148:
--

Commit 43218b46da1fc60293db01349623ead28a32 in airflow-site's branch 
refs/heads/asf-site from Kaxil Naik
[ https://gitbox.apache.org/repos/asf?p=airflow-site.git;h=4321aaa ]

[AIRFLOW-5148] Add Google Analytics to the Airflow doc website (#3)








[GitHub] [airflow] kaxil merged pull request #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil merged pull request #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763
 
 
   




[airflow-site] branch add-ga deleted (was af72144)

2019-08-08 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch add-ga
in repository https://gitbox.apache.org/repos/asf/airflow-site.git.


 was af72144  [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[GitHub] [airflow-site] kaxil merged pull request #3: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil merged pull request #3: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow-site/pull/3
 
 
   




[jira] [Created] (AIRFLOW-5150) Implement POC of GitLab CI + GKE integration

2019-08-08 Thread Jarek Potiuk (JIRA)
Jarek Potiuk created AIRFLOW-5150:
-

 Summary: Implement POC of GitLab CI + GKE integration
 Key: AIRFLOW-5150
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5150
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[GitHub] [airflow-site] kaxil opened a new pull request #3: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil opened a new pull request #3: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow-site/pull/3
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5148
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Asked by @aijamalnk 
   
   Note from her:
   
   > I've looked at Google Analytics for the Airflow site, and I noticed that:
   > - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
   > - The https://airflow.apache.org site does NOT have the GA code set up.
   > So the data that we're getting on GA is not complete. 
   > It would be really helpful to fix it soon, before we start revamping the 
website, to understand the changes in user behavior (I am signing a contract with a 
vendor next week)
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what they do
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Work logged] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?focusedWorklogId=291566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291566
 ]

ASF GitHub Bot logged work on AIRFLOW-5148:
---

Author: ASF GitHub Bot
Created on: 08/Aug/19 20:37
Start Date: 08/Aug/19 20:37
Worklog Time Spent: 10m 
  Work Description: kaxil commented on pull request #3: [AIRFLOW-5148] Add 
Google Analytics to the Airflow doc website
URL: https://github.com/apache/airflow-site/pull/3
 
 
 



Issue Time Tracking
---

Worklog Id: (was: 291566)
Time Spent: 10m
Remaining Estimate: 0h

> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, to understand the changes in user behavior (I am signing a contract 
> with a vendor next week)
> {noformat}





[jira] [Created] (AIRFLOW-5149) Config flag to skip SLA checks

2019-08-08 Thread t oo (JIRA)
t oo created AIRFLOW-5149:
-

 Summary: Config flag to skip SLA checks
 Key: AIRFLOW-5149
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5149
 Project: Apache Airflow
  Issue Type: Improvement
  Components: DAG, DagRun, scheduler
Affects Versions: 1.10.4
Reporter: t oo
 Fix For: 2.0.0


Some Airflow users have no use for managing SLAs within Airflow. I believe the 
scheduling process should be as fast as possible and not do unnecessary 
logging; the current IF statement is slower than a boolean flag and produces a 
redundant log.

h1. *EXISTING BEHAVIOR*

    if not any([isinstance(ti.sla, timedelta) for ti in dag.tasks]):
        self.log.info("Skipping SLA check for %s because no tasks in DAG have SLAs", dag)
        return

h1. *FIX*

[https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py]

within

    def _process_dags(self, dagbag, dags, tis_out):

line 1221

*BEFORE*

    self._process_task_instances(dag, tis_out)
    self.manage_slas(dag)

*AFTER*

1.

    self._process_task_instances(dag, tis_out)

    if conf.getboolean('scheduler', 'CHECK_SLA'):
        self.manage_slas(dag)

2. The config then has a new option, check_sla, defaulting to true, so existing 
users are unaffected, while other users can set it to false.
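
The flag-with-default idea can be tried outside Airflow with the standard library's `configparser`. This is a minimal sketch under stated assumptions: `check_sla` is the option proposed in this issue, not an existing Airflow setting, and `sla_checks_enabled` and `process_dag` are hypothetical stand-ins for the scheduler code.

```python
from configparser import ConfigParser

# Hypothetical config fragment; `check_sla` is only the option proposed here.
CONFIG_TEXT = """
[scheduler]
check_sla = False
"""


def sla_checks_enabled(parser: ConfigParser) -> bool:
    # fallback=True keeps today's behaviour when the option is not set.
    return parser.getboolean("scheduler", "check_sla", fallback=True)


def process_dag(dag_id, parser, manage_slas):
    # Stand-in for the scheduler step: call the SLA hook only when enabled.
    if sla_checks_enabled(parser):
        manage_slas(dag_id)


parser = ConfigParser()
parser.read_string(CONFIG_TEXT)

checked = []
process_dag("example_dag", parser, checked.append)
print(checked)  # the hook was skipped because check_sla = False

# With the option absent, the default applies and the hook runs.
default_parser = ConfigParser()
default_parser.read_string("[scheduler]\n")
process_dag("example_dag", default_parser, checked.append)
print(checked)
```

Defaulting to true when the option is missing is what keeps existing installations unaffected.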

 

 

 





[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903304#comment-16903304
 ] 

ASF subversion and git services commented on AIRFLOW-5148:
--

Commit af72144f2420f5ad9aa66230d06676714b84ca3f in airflow-site's branch 
refs/heads/add-ga from kaxil
[ https://gitbox.apache.org/repos/asf?p=airflow-site.git;h=af72144 ]

[AIRFLOW-5148] Add Google Analytics to the Airflow doc website


> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, to understand the changes in user behavior (I am signing a contract 
> with a vendor next week)
> {noformat}





[airflow-site] branch add-ga created (now af72144)

2019-08-08 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch add-ga
in repository https://gitbox.apache.org/repos/asf/airflow-site.git.


  at af72144  [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

This branch includes the following new commits:

 new af72144  [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.




[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519662287
 
 
   > Do we need to notify users about the tracking - GDPR? What about other 
legal obligations?
   
   The code added has a footer that says: 
   
   ```
   This page uses https://analytics.google.com/
   Google Analytics to collect statistics. You can disable it by blocking
   the JavaScript coming from www.google-analytics.com.
   ```
   
   https://github.com/apache/airflow/pull/5763/files#diff-0b5a851defc1a76552c74bb818ba828fR98-R114
   
   > Who has access to the data? Do we need it? What about alternatives, e.g. the 
Polish self-hosted app Piwik? I think we have access to the access logs and can 
generate statistics from them.
   
   @aijamalnk can answer this one in detail, but AFAIK only Google currently 
has the data. This was already added to airflow.readthedocs.io. An email was 
sent to the Apache private list for all Airflow PMC members; the request was 
made because she wanted to make a case to Google for funding the upcoming new 
Airflow website.




[GitHub] [airflow] mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519666004
 
 
   I think someone from Apache should have access to the data.
   
   On Monday I will check whether it is possible to generate statistics based on 
the access logs.
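
As a rough illustration of statistics from access logs, the sketch below counts successful page views per path. It assumes the server writes Common Log Format lines, which may not match the real configuration; `page_views` is a hypothetical helper and the sample lines are made up.

```python
import re
from collections import Counter

# Assumed Common Log Format; adjust the pattern to the real log format.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def page_views(lines):
    """Count successful GET requests per path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["method"] == "GET" and match["status"] == "200":
            counts[match["path"]] += 1
    return counts


# Made-up sample lines using documentation-only IP addresses.
sample = [
    '203.0.113.5 - - [08/Aug/2019:20:37:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.9 - - [08/Aug/2019:20:38:10 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.9 - - [08/Aug/2019:20:38:12 +0000] "GET /missing HTTP/1.1" 404 312',
    '203.0.113.7 - - [08/Aug/2019:20:39:00 +0000] "POST /search HTTP/1.1" 200 98',
]
print(page_views(sample).most_common())
```

Counting raw requests like this does not separate bots from human visitors, so the numbers would not be directly comparable with what Google Analytics reports.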




[GitHub] [airflow] kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil edited a comment on issue #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519662287
 
 
   > Do we need to notify users about the tracking - GDPR? What about other 
legal obligations?
   
   The code added has a footer that says: 
   
   ```
   This page uses https://analytics.google.com/
   Google Analytics to collect statistics. You can disable it by blocking
   the JavaScript coming from www.google-analytics.com.
   ```
   
   https://github.com/apache/airflow/pull/5763/files#diff-0b5a851defc1a76552c74bb818ba828fR98-R114
   
   > Who has access to the data? Do we need it? What about alternatives, e.g. the 
Polish self-hosted app Piwik? I think we have access to the access logs and can 
generate statistics from them.
   
   @aijamalnk can answer this one in detail, but AFAIK only Google currently 
has the data. This was already added to airflow.readthedocs.io. I remember 
reading somewhere that it was because she wanted to make a case to Google for 
funding the upcoming new Airflow website.




[GitHub] [airflow] kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519662287
 
 
   > Do we need to notify users about the tracking - GDPR? What about other 
legal obligations?
   
   The code added has a footer that says: 
   
   ```
   This page uses https://analytics.google.com/
   Google Analytics to collect statistics. You can disable it by blocking
   the JavaScript coming from www.google-analytics.com.
   ```
   
   https://github.com/apache/airflow/pull/5763/files#diff-0b5a851defc1a76552c74bb818ba828fR98-R114
   
   > Who has access to the data? Do we need it? What about alternatives, e.g. the 
Polish self-hosted app Piwik? I think we have access to the access logs and can 
generate statistics from them.
   
   @aijamalnk can answer this one in detail, but AFAIK only Google currently 
has the data. This was already added to airflow.readthedocs.io. I remember 
reading somewhere that it was because she wanted to make a case to Google for 
funding the upcoming new Airflow website.




[GitHub] [airflow] mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519659525
 
 
   Do we need to notify users about the tracking - GDPR? What about other legal 
obligations?




[GitHub] [airflow] mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
mik-laj commented on issue #5763: [AIRFLOW-5148] Add Google Analytics to the 
Airflow doc website
URL: https://github.com/apache/airflow/pull/5763#issuecomment-519657978
 
 
   Who has access to the data? Do we need it? What about alternatives, e.g. the 
Polish self-hosted app Piwik? I think we have access to the access logs and can 
generate statistics from them.




[jira] [Commented] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903257#comment-16903257
 ] 

ASF GitHub Bot commented on AIRFLOW-5148:
-

kaxil commented on pull request #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5148
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Asked by @aijamalnk 
   
   Note from her:
   
   > I've looked at Google Analytics for the Airflow site, and I noticed that:
   > - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
   > - The https://airflow.apache.org site does NOT have the GA code set up.
   > So the data that we're getting on GA is not complete. 
   > It would be really helpful to fix it soon, before we start revamping the 
website, so that we can understand the changes in user behavior (I am signing a 
contract with a vendor next week).
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, so that we can understand the changes in user behavior (I am signing 
> a contract with a vendor next week).
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[GitHub] [airflow] kaxil opened a new pull request #5763: [AIRFLOW-5148] Add Google Analytics to the Airflow doc website

2019-08-08 Thread GitBox
kaxil opened a new pull request #5763: [AIRFLOW-5148] Add Google Analytics to 
the Airflow doc website
URL: https://github.com/apache/airflow/pull/5763
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5148
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Asked by @aijamalnk 
   
   Note from her:
   
   > I've looked at Google Analytics for the Airflow site, and I noticed that:
   > - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
   > - The https://airflow.apache.org site does NOT have the GA code set up.
   > So the data that we're getting on GA is not complete. 
   > It would be really helpful to fix it soon, before we start revamping the 
website, so that we can understand the changes in user behavior (I am signing a 
contract with a vendor next week).
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Assigned] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik reassigned AIRFLOW-5148:
---

Assignee: Kaxil Naik

> Add Google Analytics to the Airflow doc website
> ---
>
> Key: AIRFLOW-5148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 1.10.2, 1.10.3, 1.10.4
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
>
> Asked by [~aizhamal] 
> Note from her:
> {noformat}
> I've looked at Google Analytics for the Airflow site, and I noticed that:
> - The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
> - The https://airflow.apache.org site does NOT have the GA code set up.
> So the data that we're getting on GA is not complete. 
> It would be really helpful to fix it soon, before we start revamping the 
> website, so that we can understand the changes in user behavior (I am signing 
> a contract with a vendor next week).
> {noformat}





[GitHub] [airflow] coufon commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519655290
 
 
   > Zhou Fang Kaxil Naik -> it turned out caching was not enabled when building 
the checklicence image, so it was always rebuilt from scratch. Fixed (and tested 
locally) in #5762
   
   Thanks Jarek!!




[jira] [Created] (AIRFLOW-5148) Add Google Analytics to the Airflow doc website

2019-08-08 Thread Kaxil Naik (JIRA)
Kaxil Naik created AIRFLOW-5148:
---

 Summary: Add Google Analytics to the Airflow doc website
 Key: AIRFLOW-5148
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5148
 Project: Apache Airflow
  Issue Type: New Feature
  Components: documentation
Affects Versions: 1.10.4, 1.10.3, 1.10.2
Reporter: Kaxil Naik


Asked by [~aizhamal] 

Note from her:


{noformat}
I've looked at Google Analytics for the Airflow site, and I noticed that:
- The https://airflow.readthedocs.io/en/latest/ site has the GA code set up.
- The https://airflow.apache.org site does NOT have the GA code set up.
So the data that we're getting on GA is not complete. 
It would be really helpful to fix it soon, before we start revamping the 
website, so that we can understand the changes in user behavior (I am signing a 
contract with a vendor next week).
{noformat}






[GitHub] [airflow] potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519654332
 
 
   @coufon @kaxil -> it turned out caching was not enabled when building the 
checklicence image, so it was always rebuilt from scratch. Fixed (and tested 
locally) in #5762 




[GitHub] [airflow] potiuk commented on issue #5762: [AIRFLOW-5143] Caching works for Checklicence images

2019-08-08 Thread GitBox
potiuk commented on issue #5762: [AIRFLOW-5143] Caching works for Checklicence 
images
URL: https://github.com/apache/airflow/pull/5762#issuecomment-519653557
 
 
   It turned out that caching was not really enabled for the "checklicence" 
image. This commit fixes it and also builds the checklicence image locally when 
needed.




[GitHub] [airflow] coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312201791
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,143 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialized DAG table in database."""
+
+import hashlib
+from typing import Any, Dict, List, Optional, TYPE_CHECKING
+from sqlalchemy import Column, Index, Integer, String, Text, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import Base, ID_LEN
+from airflow.utils import timezone
+from airflow.utils.db import provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+if TYPE_CHECKING:
+    from airflow.dag.serialization.serialized_dag import SerializedDAG  # noqa: F401, E501; # pylint: disable=cyclic-import
+    from airflow.models import DAG  # noqa: F401; # pylint: disable=cyclic-import
+
+
+class SerializedDagModel(Base):
+    """A database table for serialized DAGs."""
+
+    __tablename__ = 'serialized_dag'
+
+    dag_id = Column(String(ID_LEN), primary_key=True)
+    fileloc = Column(String(2000))
+    # The max length of fileloc exceeds the limit of indexing.
+    fileloc_hash = Column(Integer)
+    data = Column(Text)
+    last_updated = Column(UtcDateTime)
+
+    __table_args__ = (
+        Index('idx_fileloc_hash', fileloc_hash, unique=False),
+    )
+
+    def __init__(self, dag):
+        from airflow.dag.serialization import Serialization
+
+        self.dag_id = dag.dag_id
+        self.fileloc = dag.full_filepath
+        self.fileloc_hash = SerializedDagModel.dag_fileloc_hash(self.fileloc)
+        self.data = Serialization.to_json(dag)
+        self.last_updated = timezone.utcnow()
+
+    @staticmethod
+    def dag_fileloc_hash(full_filepath: str) -> int:
+        """Hashing file location for indexing.
+
+        :param full_filepath: full filepath of DAG file
+        :return: hashed full_filepath
+        """
+        # Truncates hash to 4 bytes.
+        # TODO(coufon): hashing is needed because the length of fileloc is 2000 as
+        # an Airflow convention, which is over the limit of indexing. If we can
+        return int(0x & int(
+            hashlib.sha1(full_filepath.encode('utf-8')).hexdigest(), 16))
+
+    @classmethod
+    @provide_session
+    def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, session=None):
+        """Serializes a DAG and writes it into database.
+
+        :param dag: a DAG to be written into database
+        :param min_update_interval: minimal interval in seconds to update serialized DAG
+        """
+        if min_update_interval is not None:
+            result = session.query(cls.last_updated).filter(
+                cls.dag_id == dag.dag_id).first()
+            if result is not None and (
+                    timezone.utcnow() - result.last_updated).total_seconds() < min_update_interval:
+                return
+        session.merge(cls(dag))
+        session.commit()
 
 Review comment:
   Done.
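
For readers following the `dag_fileloc_hash` discussion, here is a minimal standalone sketch of the "truncate the hash to 4 bytes" idea described in the diff comment. The mask value is not legible in the archived diff, so the function name `fileloc_hash` and the `num_bytes` parameter are assumptions made for illustration, not the PR's actual code.

```python
import hashlib


def fileloc_hash(full_filepath: str, num_bytes: int = 4) -> int:
    """Return the low `num_bytes` bytes of the SHA-1 of a path as an int.

    Hypothetical helper: `num_bytes` is an illustrative parameter, chosen
    here to match the "4 bytes" mentioned in the diff comment.
    """
    # Build a mask with the lowest num_bytes * 8 bits set.
    mask = (1 << (8 * num_bytes)) - 1
    # Interpret the 160-bit SHA-1 digest as one big integer.
    digest = int(hashlib.sha1(full_filepath.encode('utf-8')).hexdigest(), 16)
    # Keep only the low-order bytes so the value stays small and indexable.
    return digest & mask
```

The point of the truncation is that a short integer can be indexed where a 2000-character `fileloc` string cannot. Whether the result fits the column depends on the database's integer type: a signed 32-bit column only holds values up to `2**31 - 1`, so an unsigned 4-byte value may not fit.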




[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903248#comment-16903248
 ] 

ASF GitHub Bot commented on AIRFLOW-5143:
-

potiuk commented on pull request #5762: [AIRFLOW-5143] Caching works for 
Checklicence images
URL: https://github.com/apache/airflow/pull/5762
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5143
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[GitHub] [airflow] potiuk opened a new pull request #5762: [AIRFLOW-5143] Caching works for Checklicence images

2019-08-08 Thread GitBox
potiuk opened a new pull request #5762: [AIRFLOW-5143] Caching works for 
Checklicence images
URL: https://github.com/apache/airflow/pull/5762
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5143
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Reopened] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reopened AIRFLOW-5143:
---

> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[GitHub] [airflow] houqp commented on issue #5731: [AIRFLOW-5117] support refreshing EKS api tokens

2019-08-08 Thread GitBox
houqp commented on issue #5731: [AIRFLOW-5117] support refreshing EKS api tokens
URL: https://github.com/apache/airflow/pull/5731#issuecomment-519648466
 
 
   It looks like the upstream k8s client is moving to a new OpenAPI code 
generator, so I have sent my codegen fix to openapi-generator as well. It will 
probably take a while for the fix to land in the upstream k8s client, given they 
are still in the process of migrating to openapi-generator.
   
   All my upstream fix activities can be tracked at 
https://github.com/kubernetes-client/python/issues/741.




[jira] [Commented] (AIRFLOW-4568) The ExternalTaskSensor should be configurable to raise an Airflow Exception in case the poked external task reaches a disallowed state, such as f.i. failed

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903235#comment-16903235
 ] 

ASF GitHub Bot commented on AIRFLOW-4568:
-

michaelmdeng commented on pull request #5755: [AIRFLOW-4568] Add 
unallowed_states to ExternalTaskSensor
URL: https://github.com/apache/airflow/pull/5755
 
 
   
 



> The ExternalTaskSensor should be configurable to raise an Airflow Exception 
> in case the poked external task reaches a disallowed state, such as f.i. 
> failed
> ---
>
> Key: AIRFLOW-4568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4568
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.10.3
>Reporter: ddluke
>Priority: Minor
>
> _As an engineer, I would like to have the behavior of the ExternalTaskSensor 
> changed_
> _So that it fails in case the poked external_task_id fails_
> *Therefore*
>  * I suggest extending the behavior of the sensor to optionally also query 
> the TaskInstance for disallowed states and raise an AirflowException if 
> found. Currently, if the poked external task reaches a failed state, the 
> sensor continues to poke and does not terminate
> *Acceptance Criteria (from my pov)*
>  * The class interface for ExternalTaskSensor is extended with an additional 
> parameter, disallowed_states, which is an Optional List of 
> airflow.utils.state.State
>  * The poke method is expanded to count the number of rows from TaskInstance 
> which met the filter criteria dag_id, task_id, disallowed_states and 
> dttm_filter if disallowed_states is not None
>  * If disallowed_states is not None and the above query returns a counter > 
> 0, an Airflow Exception is thrown





[jira] [Commented] (AIRFLOW-4568) The ExternalTaskSensor should be configurable to raise an Airflow Exception in case the poked external task reaches a disallowed state, such as f.i. failed

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903236#comment-16903236
 ] 

ASF GitHub Bot commented on AIRFLOW-4568:
-

michaelmdeng commented on pull request #5755: [AIRFLOW-4568] Add 
unallowed_states to ExternalTaskSensor
URL: https://github.com/apache/airflow/pull/5755
 
 
   
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4568
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   Also possibly dupe of [AIRFLOW-104]??
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   We often want an ExternalTaskSensor to behave not as a sensor (poking 
until a condition is satisfied) but as a mirror of a task in another 
DAG (succeed if the external task succeeds, fail if it fails).
   
   This change adds the `unallowed_states` parameter to the ExternalTaskSensor. 
If the external task/DAG is found to be in one of the `unallowed_states`, the 
sensor fails immediately, instead of the previous behavior of returning false 
and continuing to poke until timeout.
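
The fail-fast behavior described above can be sketched in plain Python, without importing Airflow. This is a simplified illustration under stated assumptions: `poke_external_states` and `ExternalTaskFailedError` are hypothetical names, states are plain strings, and the real sensor queries the TaskInstance table rather than receiving a list.

```python
from typing import Iterable, Optional, Sequence


class ExternalTaskFailedError(Exception):
    """Hypothetical stand-in for AirflowException in this sketch."""


def poke_external_states(
    external_states: Iterable[str],
    allowed_states: Sequence[str] = ("success",),
    unallowed_states: Optional[Sequence[str]] = None,
) -> bool:
    """Decide one poke from the current states of the external task(s)."""
    states = list(external_states)

    # Fail fast: any external task in an unallowed state ends the wait.
    if unallowed_states:
        bad = [s for s in states if s in unallowed_states]
        if bad:
            raise ExternalTaskFailedError(
                "external task reached unallowed state(s): {}".format(sorted(set(bad))))

    # Otherwise keep the usual sensor contract:
    # True means "condition met", False means "poke again later".
    return bool(states) and all(s in allowed_states for s in states)
```

With `unallowed_states` left as `None`, a failed external task simply yields `False` on every poke until the sensor times out, which is the previous behavior that the PR description contrasts against.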
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Additional tests in tests/sensors/test_external_task_sensor.py
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> The ExternalTaskSensor should be configurable to raise an Airflow Exception 
> in case the poked external task reaches a disallowed state, such as f.i. 
> failed
> ---
>
> Key: AIRFLOW-4568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4568
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.10.3
>Reporter: ddluke
>Priority: Minor
>
> _As an engineer, I would like to have the behavior of the ExternalTaskSensor 
> changed_
> _So that it fails in case the poked external_task_id fails_
> *Therefore*
>  * I suggest extending the behavior of the sensor to optionally also query 
> the TaskInstance for disallowed states and raise an AirflowException if 
> found. Currently, if the poked external task reaches a failed state, the 
> sensor continues to poke and does not terminate
> *Acceptance Criteria (from my pov)*
>  * The class interface for ExternalTaskSensor is extended with an additional 
> parameter, disallowed_states, which is an Optional List of 
> airflow.utils.state.State
>  * The poke method is expanded to count the number of rows from TaskInstance 
> which met the filter criteria dag_id, task_id, disallowed_states

[GitHub] [airflow] kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519647789
 
 
   > Hmm. It is still failing on rat. This is strange. I am taking a look.
   
   Thanks




[GitHub] [airflow] michaelmdeng closed pull request #5755: [AIRFLOW-4568] Add unallowed_states to ExternalTaskSensor

2019-08-08 Thread GitBox
michaelmdeng closed pull request #5755: [AIRFLOW-4568] Add unallowed_states to 
ExternalTaskSensor
URL: https://github.com/apache/airflow/pull/5755
 
 
   




[GitHub] [airflow] michaelmdeng opened a new pull request #5755: [AIRFLOW-4568] Add unallowed_states to ExternalTaskSensor

2019-08-08 Thread GitBox
michaelmdeng opened a new pull request #5755: [AIRFLOW-4568] Add 
unallowed_states to ExternalTaskSensor
URL: https://github.com/apache/airflow/pull/5755
 
 
   
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4568
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   Also possibly dupe of [AIRFLOW-104]??
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   We often want an ExternalTaskSensor to behave not as a sensor (poking 
until a condition is satisfied) but as a mirror of a task in another 
DAG (succeed if the external task succeeds, fail if it fails).
   
   This change adds the `unallowed_states` parameter to the ExternalTaskSensor. 
If the external task/DAG is found to be in one of the `unallowed_states`, the 
sensor fails immediately, instead of the previous behavior of returning false 
and continuing to poke until timeout.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Additional tests in tests/sensors/test_external_task_sensor.py
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[GitHub] [airflow] potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
potiuk commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized 
DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-519647298
 
 
   Hmm, it is still failing on RAT. This is strange. I am taking a look.
   




[GitHub] [airflow] houqp commented on issue #5664: [AIRFLOW-5140] fix all missing type annotation errors from dmypy

2019-08-08 Thread GitBox
houqp commented on issue #5664: [AIRFLOW-5140] fix all missing type annotation 
errors from dmypy
URL: https://github.com/apache/airflow/pull/5664#issuecomment-519644012
 
 
   Thanks @mik-laj for the 2nd round of review, I have made all the suggested 
changes :)




[GitHub] [airflow] kaxil commented on a change in pull request #5760: AIRFLOW-5139 Allow custom ES configs

2019-08-08 Thread GitBox
kaxil commented on a change in pull request #5760: AIRFLOW-5139 Allow custom ES 
configs
URL: https://github.com/apache/airflow/pull/5760#discussion_r312189867
 
 

 ##
 File path: airflow/utils/log/es_task_handler.py
 ##
 @@ -54,7 +55,8 @@ class ElasticsearchTaskHandler(FileTaskHandler, 
LoggingMixin):
 def __init__(self, base_log_folder, filename_template,
  log_id_template, end_of_log_mark,
  write_stdout, json_format, json_fields,
- host='localhost:9200'):
+ host='localhost:9200',
+ es_kwargs=conf.getsection("elasticsearch_configs") or {}):
 
 Review comment:
   Can we add this to 
`https://github.com/apache/airflow/blob/master/airflow/config_templates/default_airflow.cfg`
 as well, please?
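
For illustration, a config section can be turned into keyword arguments roughly as follows. This is a minimal sketch using the standard-library `configparser` rather than Airflow's own `conf` object; the helper `section_as_kwargs` and the option names `use_ssl` and `verify_certs` are assumptions made for the example, not taken from the PR.

```python
import configparser

# Hypothetical section contents; the real defaults would live in default_airflow.cfg.
CFG_TEXT = """
[elasticsearch_configs]
use_ssl = False
verify_certs = True
"""


def section_as_kwargs(cfg_text, section):
    """Return the options of `section` as a dict, or {} if the section is absent.

    Every option is assumed to be a boolean in this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        return {}
    return {key: parser.getboolean(section, key)
            for key in parser.options(section)}


es_kwargs = section_as_kwargs(CFG_TEXT, "elasticsearch_configs")
```

Documenting the section in the default config file gives users a visible list of the keys they can override.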




[GitHub] [airflow] coufon commented on a change in pull request #5701: [AIRFLOW-5088][AIP-24] Add DAG serialization using JSON

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5701: [AIRFLOW-5088][AIP-24] Add 
DAG serialization using JSON
URL: https://github.com/apache/airflow/pull/5701#discussion_r312173903
 
 

 ##
 File path: airflow/dag/serialization.py
 ##
 @@ -0,0 +1,250 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""DAG serialization with JSON."""
+
+import json
+import logging
+
+import datetime
+import dateutil.parser
+import pendulum
+
+from airflow import models
+from airflow.www.utils import get_python_source
+
+
+# JSON primitive types.
+_primitive_types = (int, bool, float, str)
+
+_datetime_types = (datetime.datetime, datetime.date, datetime.time)
+
+# Object types that are always excluded.
+# TODO(coufon): not needed if _dag_included_fields and _op_included_fields are 
customized.
+_excluded_types = (logging.Logger, models.connection.Connection, type)
+
+# Stringified DAGs and operators contain exactly these fields.
+# TODO(coufon): to customize included fields and keep only necessary fields.
+_dag_included_fields = list(vars(models.DAG(dag_id='test')).keys())
+_op_included_fields = list(vars(models.BaseOperator(task_id='test')).keys()) + 
[
+'_dag', 'ui_color', 'ui_fgcolor', 'template_fields']
+
+# Encoding constants.
+TYPE = '__type'
+CLASS = '__class'
+VAR = '__var'
+
+# Supported types. primitives and list are not encoded.
+DAG = 'dag'
+OP = 'operator'
+DATETIME = 'datetime'
+TIMEDELTA = 'timedelta'
+TIMEZONE = 'timezone'
+DICT = 'dict'
+SET = 'set'
+TUPLE = 'tuple'
+
+# Constants.
+BASE_OPERATOR_CLASS = 'BaseOperator'
+# Serialization failure returns 'failed'.
+FAILED = 'failed'
+
+
+def _is_primitive(x):
+"""Primitive types."""
+return x is None or isinstance(x, _primitive_types)
+
+
+def _is_excluded(x):
+"""Types excluded from serialization.
+
+TODO(coufon): not needed if _dag_included_fields and _op_included_fields 
are customized.
+"""
+return x is None or isinstance(x, _excluded_types)
+
+
+def _serialize_object(x, visited_dags, included_fields):
+"""Helper fn to serialize an object as a JSON dict."""
+new_x = {}
+for k in included_fields:
+# None is ignored in serialized form and is added back in 
deserialization.
+v = getattr(x, k, None)
+if not _is_excluded(v):
+new_x[k] = _serialize(v, visited_dags)
+return new_x
+
+
+def _serialize_dag(x, visited_dags):
+"""Serialize a DAG."""
+if x.dag_id in visited_dags:
+return {TYPE: DAG, VAR: str(x.dag_id)}
+
+new_x = {TYPE: DAG}
+visited_dags[x.dag_id] = new_x
+new_x[VAR] = _serialize_object(
+x, visited_dags, included_fields=_dag_included_fields)
+return new_x
+
+
+def _serialize_operator(x, visited_dags):
+"""Serialize an operator."""
+return _encode(
+_serialize_object(
+x, visited_dags, included_fields=_op_included_fields),
+type_=OP,
+class_=x.__class__.__name__
+)
+
+
+def _encode(x, type_, class_=None):
+"""Encode data by a JSON dict."""
+return ({VAR: x, TYPE: type_} if class_ is None
+else {VAR: x, TYPE: type_, CLASS: class_})
+
+
+def _serialize(x, visited_dags):  # pylint: disable=too-many-return-statements
+"""Helper function of depth first search for serialization.
+
+visited_dags stores DAGs that are being stringified or have been 
stringified,
+for:
+  (1) preventing deadlock loop caused by task.dag, task._dag, and 
dag.parent_dag;
+  (2) replacing the fields in (1) with serialized counterparts.
+
+The serialization protocol is:
+  (1) keeping JSON supported types: primitives, dict, list;
+  (2) encoding other types as {TYPE: 'foo', VAR: 'bar'}, the 
deserialization
+  step decodes VAR according to TYPE;
+  (3) Operator has a special field CLASS to record the original class
+  name for displaying in UI.
+"""
+try:
+if _is_primitive(x):
+return x
+elif isinstance(x, dict):
+return _encode({k: _serialize(v, visited_dags) for k, v in 
x.items()}, type_=DICT)
 
 Review comment:
   Use str(k) instead of k here. If users put a non-primitive key in some dict, 
e.g., user-defined-mac
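
The reason for `str(k)` can be shown with the standard `json` module alone. The sketch below is not the PR's serializer; `stringify_keys` and the sample dict are made up for illustration.

```python
import json


def stringify_keys(d):
    """Return a copy of `d` whose keys are all str, so json.dumps accepts it."""
    return {str(k): v for k, v in d.items()}


# A dict with a tuple key and an int key, as a user-defined structure might have.
data = {("a", 1): "tuple key", 2: "int key"}

try:
    json.dumps(data)
    raw_ok = True
except TypeError:
    # json.dumps rejects keys that are not str, int, float, bool or None.
    raw_ok = False

encoded = json.dumps(stringify_keys(data))
```

Note that stringifying keys is lossy: after a round trip the keys come back as strings, so the deserializer cannot recover the original key types without extra information.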

[GitHub] [airflow] coufon commented on a change in pull request #5701: [AIRFLOW-5088][AIP-24] Add DAG serialization using JSON

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5701: [AIRFLOW-5088][AIP-24] Add 
DAG serialization using JSON
URL: https://github.com/apache/airflow/pull/5701#discussion_r312172751
 
 

 ##
 File path: airflow/dag/serialization.py
 ##
 @@ -0,0 +1,250 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""DAG serialization with JSON."""
+
+import json
+import logging
+
+import datetime
+import dateutil.parser
+import pendulum
+
+from airflow import models
+from airflow.www.utils import get_python_source
+
+
+# JSON primitive types.
+_primitive_types = (int, bool, float, str)
+
+_datetime_types = (datetime.datetime, datetime.date, datetime.time)
+
+# Object types that are always excluded.
+# TODO(coufon): not needed if _dag_included_fields and _op_included_fields are 
customized.
+_excluded_types = (logging.Logger, models.connection.Connection, type)
+
+# Stringified DAGs and operators contain exactly these fields.
+# TODO(coufon): to customize included fields and keep only necessary fields.
+_dag_included_fields = list(vars(models.DAG(dag_id='test')).keys())
+_op_included_fields = list(vars(models.BaseOperator(task_id='test')).keys()) + 
[
 
 Review comment:
   Made it a property of SerializedBaseOperator in 
airflow/dag/serialization/serialized_baseoperator.py. We can then access it 
via SerializedBaseOperator._included_fields (it can be made a public property if 
it is used externally in the future).
   
   The same goes for DAG.




[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312159084
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,143 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialized DAG table in database."""
+
+import hashlib
+from typing import Any, Dict, List, Optional, TYPE_CHECKING
+from sqlalchemy import Column, Index, Integer, String, Text, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import Base, ID_LEN
+from airflow.utils import timezone
+from airflow.utils.db import provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+if TYPE_CHECKING:
+from airflow.dag.serialization.serialized_dag import SerializedDAG  # 
noqa: F401, E501; # pylint: disable=cyclic-import
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+
+
+class SerializedDagModel(Base):
+"""A database table for serialized DAGs."""
+
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000))
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer)
+data = Column(Text)
+last_updated = Column(UtcDateTime)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag):
+from airflow.dag.serialization import Serialization
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = SerializedDagModel.dag_fileloc_hash(self.fileloc)
+self.data = Serialization.to_json(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+"""Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# Truncates hash to 4 bytes.
+# TODO(coufon): hashing is needed because the length of fileloc is 
2000 as
+# an Airflow convention, which is over the limit of indexing. If we can
+return int(0x & int(
+hashlib.sha1(full_filepath.encode('utf-8')).hexdigest(), 16))
+
+@classmethod
+@provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+"""
+if min_update_interval is not None:
+result = session.query(cls.last_updated).filter(
+cls.dag_id == dag.dag_id).first()
+if result is not None and (
+timezone.utcnow() - result.last_updated).total_seconds() < 
min_update_interval:
+return
+session.merge(cls(dag))
+session.commit()
 
 Review comment:
   @coufon - let's use the `create_session()` function instead of adding a 
rollback. Maybe you already agreed to that, but just in case.
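
The pattern being suggested is a context manager that commits on success, rolls back on error, and always closes the session. The sketch below illustrates that pattern with a stand-in session class; it is an assumption about the shape of such a helper, and the actual `create_session()` in Airflow may differ in detail. `FakeSession` is hypothetical and only records calls.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session; records calls instead of touching a DB."""

    def __init__(self):
        self.calls = []

    def merge(self, obj):
        self.calls.append(("merge", obj))

    def commit(self):
        self.calls.append(("commit",))

    def rollback(self):
        self.calls.append(("rollback",))

    def close(self):
        self.calls.append(("close",))


@contextmanager
def create_session(session_factory=FakeSession):
    """Yield a session; commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
```

Callers then write `with create_session() as session: session.merge(row)` and need no explicit `commit()` or `rollback()` of their own.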




[GitHub] [airflow] coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312154855
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,143 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialized DAG table in database."""
+
+import hashlib
+from typing import Any, Dict, List, Optional, TYPE_CHECKING
+from sqlalchemy import Column, Index, Integer, String, Text, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import Base, ID_LEN
+from airflow.utils import timezone
+from airflow.utils.db import provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+if TYPE_CHECKING:
+from airflow.dag.serialization.serialized_dag import SerializedDAG  # 
noqa: F401, E501; # pylint: disable=cyclic-import
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+
+
+class SerializedDagModel(Base):
+"""A database table for serialized DAGs."""
+
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000))
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer)
+data = Column(Text)
+last_updated = Column(UtcDateTime)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag):
+from airflow.dag.serialization import Serialization
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = SerializedDagModel.dag_fileloc_hash(self.fileloc)
+self.data = Serialization.to_json(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+"""Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# Truncates hash to 4 bytes.
+# TODO(coufon): hashing is needed because the length of fileloc is 
2000 as
+# an Airflow convention, which is over the limit of indexing. If we can
+return int(0x & int(
+hashlib.sha1(full_filepath.encode('utf-8')).hexdigest(), 16))
+
+@classmethod
+@provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+"""
+if min_update_interval is not None:
+result = session.query(cls.last_updated).filter(
+cls.dag_id == dag.dag_id).first()
+if result is not None and (
+timezone.utcnow() - result.last_updated).total_seconds() < 
min_update_interval:
+return
+session.merge(cls(dag))
+session.commit()
 
 Review comment:
   Yes I think it is a best practice. I will add that.




[GitHub] [airflow] milton0825 commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
milton0825 commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312151765
 
 

 ##
 File path: airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py
 ##
 @@ -0,0 +1,50 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add serialized_dag table
+
+Revision ID: d38e04c12aa2
+Revises: 6e96a59344a4
+Create Date: 2019-08-01 14:39:35.616417
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = 'd38e04c12aa2'
+down_revision = '6e96a59344a4'
+branch_labels = None
+depends_on = None
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+op.create_table('serialized_dag',
 
 Review comment:
   Makes sense.




[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312151490
 
 

 ##
 File path: airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py
 ##
 @@ -0,0 +1,50 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add serialized_dag table
+
+Revision ID: d38e04c12aa2
+Revises: 6e96a59344a4
+Create Date: 2019-08-01 14:39:35.616417
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = 'd38e04c12aa2'
+down_revision = '6e96a59344a4'
+branch_labels = None
+depends_on = None
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+op.create_table('serialized_dag',
 
 Review comment:
   And since we want this feature to be completely optional until 2.0, it is 
better to have a new table instead.




[GitHub] [airflow] coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-08-08 Thread GitBox
coufon commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r312150236
 
 

 ##
 File path: airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py
 ##
 @@ -0,0 +1,50 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add serialized_dag table
+
+Revision ID: d38e04c12aa2
+Revises: 6e96a59344a4
+Create Date: 2019-08-01 14:39:35.616417
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = 'd38e04c12aa2'
+down_revision = '6e96a59344a4'
+branch_labels = None
+depends_on = None
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+op.create_table('serialized_dag',
 
 Review comment:
   Thanks for asking. There are a few benefits to creating a new table:
   
   - As this feature is new and may be unstable, we would like to minimize its 
influence on current Airflow code (and DB performance), especially anything on 
the critical path. Using a separate table is safer: at least we know the dag 
table's behavior is not changed.
   - Merging these two tables makes the dag table wide: it needs a hash of the 
file path, a 'DELETED' marker, and a JSON data column.
   - Logically, serialized_dag is a snapshot of the files we have in the DAG 
folder, but the dag table is not.
   
   I am open to changes. Any ideas?




[jira] [Updated] (AIRFLOW-5147) Annotations for k8s executors should support extended alphabet (like '/'))

2019-08-08 Thread Andrei Loginov (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Loginov updated AIRFLOW-5147:

Description: 
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/` in it, for example: 
{code:java}
injector.tumblr.com/request{code}
 or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.

 

I believe original solution should be completely revisited. And instead of 
using a separate *kubernetes_annotations* section there should be a key which 
will contain a set of key:value annotations in some format. E.g. json:
{code:java}

[kubernetes]
annotations = { "iam.amazonaws.com/role": 
"arn:aws:iam:::role/some-role-CKU5HL9BIPXG", "some-other-anno-key": 
"some/value" }
{code}
 

  was:
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/` in it, for example: 
{code:java}
injector.tumblr.com/request{code}
 or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.

 

I believe original solution should be completely revisited. And instead of 
using a section there should be a key which will contain a set of key:value 
annotations.


> Annotations for k8s executors should support extended alphabet (like '/')) 
> ---
>
> Key: AIRFLOW-5147
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5147
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes, executors
>Affects Versions: 1.10.3, 1.10.4
>Reporter: Andrei Loginov
>Assignee: Daniel Imberman
>Priority: Major
>
> The fix to introduce k8s annotations for executors 
> ([https://github.com/apache/airflow/pull/4589] for 
> https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
> allowed for the annotation key to [-._a-zA-Z0-9] set. However many 
> annotations contain `/` in it, for example: 
> {code:java}
> injector.tumblr.com/request{code}
>  or
> {code:java}
> iam.amazonaws.com/role{code}
> Which would not be allowed in the current solution.
>  
> I believe original solution should be completely revisited. And instead of 
> using a separate *kubernetes_annotations* section there should be a key which 
> will contain a set of key:value annotations in some format. E.g. json:
> {code:java}
> [kubernetes]
> annotations = { "iam.amazonaws.com/role": 
> "arn:aws:iam:::role/some-role-CKU5HL9BIPXG", "some-other-anno-key": 
> "some/value" }
> {code}
>  
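
A JSON-valued `annotations` key like the one proposed in the issue could be read along the following lines. This is a sketch under assumptions: it uses the standard-library `configparser` instead of Airflow's configuration object, and the helper `load_annotations` is hypothetical. The sample value is the one from the issue, written on a single line.

```python
import configparser
import json

CFG_TEXT = """
[kubernetes]
annotations = { "iam.amazonaws.com/role": "arn:aws:iam:::role/some-role-CKU5HL9BIPXG", "some-other-anno-key": "some/value" }
"""


def load_annotations(cfg_text):
    """Parse the JSON object stored under [kubernetes] annotations into a dict."""
    # Interpolation is disabled so that '%' in annotation values is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    raw = parser.get("kubernetes", "annotations", fallback="")
    if not raw:
        return {}
    annotations = json.loads(raw)
    if not isinstance(annotations, dict):
        raise ValueError("annotations must be a JSON object")
    # Kubernetes annotation keys and values are strings.
    return {str(k): str(v) for k, v in annotations.items()}


annotations = load_annotations(CFG_TEXT)
```

Because the keys live inside a JSON string rather than in config option names, characters such as `/` no longer collide with the option-name restrictions described in the issue.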



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[GitHub] [airflow] codecov-io edited a comment on issue #4294: [AIRFLOW-3451] Add UI button to refresh all dags

2019-08-08 Thread GitBox
codecov-io edited a comment on issue #4294: [AIRFLOW-3451] Add UI button to 
refresh all dags
URL: https://github.com/apache/airflow/pull/4294#issuecomment-445349042
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/4294?src=pr&el=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@3ee2dcb`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/4294/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/4294?src=pr&el=tree)
   
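   ```diff
   @@            Coverage Diff            @@
   ##             master    #4294   +/-   ##
   =========================================
     Coverage          ?   74.09%
   =========================================
     Files             ?      421
     Lines             ?    27665
     Branches          ?        0
   =========================================
     Hits              ?    20497
     Misses            ?     7168
     Partials          ?        0
   ```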
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/4294?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/4294?src=pr&el=footer). 
Last update 
[3ee2dcb...7cba6ac](https://codecov.io/gh/apache/airflow/pull/4294?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-5147) Annotations for k8s executors should support extended alphabet (like '/'))

2019-08-08 Thread Andrei Loginov (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Loginov updated AIRFLOW-5147:

Description: 
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/`, for example: 
{code:java}
injector.tumblr.com/request{code}
 or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.

 

I believe original solution should be completely revisited. And instead of 
using a section there should be a key which will contain a set of key:value 
annotations.

  was:
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/`, for example: 
{code:java}
injector.tumblr.com/request{code}
 or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.


> Annotations for k8s executors should support extended alphabet (like '/')) 
> ---
>
> Key: AIRFLOW-5147
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5147
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes, executors
>Affects Versions: 1.10.3, 1.10.4
>Reporter: Andrei Loginov
>Assignee: Daniel Imberman
>Priority: Major
>
> The fix to introduce k8s annotations for executors 
> ([https://github.com/apache/airflow/pull/4589] for 
> https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
> allowed for the annotation key to [-._a-zA-Z0-9] set. However many 
> annotations contain `/`, for example: 
> {code:java}
> injector.tumblr.com/request{code}
>  or
> {code:java}
> iam.amazonaws.com/role{code}
> Which would not be allowed in the current solution.
>  
> I believe original solution should be completely revisited. And instead of 
> using a section there should be a key which will contain a set of key:value 
> annotations.





[jira] [Updated] (AIRFLOW-5147) Annotations for k8s executors should support extended alphabet (like '/'))

2019-08-08 Thread Andrei Loginov (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Loginov updated AIRFLOW-5147:

Description: 
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/`, for example: 
{code:java}
injector.tumblr.com/request{code}
 or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.

  was:
The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/`, for example: 
{code:java}
injector.tumblr.com/request{code}
{{}} or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.


> Annotations for k8s executors should support extended alphabet (like '/')) 
> ---
>
> Key: AIRFLOW-5147
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5147
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes, executors
>Affects Versions: 1.10.3, 1.10.4
>Reporter: Andrei Loginov
>Assignee: Daniel Imberman
>Priority: Major
>
> The fix to introduce k8s annotations for executors 
> ([https://github.com/apache/airflow/pull/4589] for 
> https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
> allowed for the annotation key to [-._a-zA-Z0-9] set. However many 
> annotations contain `/`, for example: 
> {code:java}
> injector.tumblr.com/request{code}
>  or
> {code:java}
> iam.amazonaws.com/role{code}
> Which would not be allowed in the current solution.





[jira] [Created] (AIRFLOW-5147) Annotations for k8s executors should support extended alphabet (like '/'))

2019-08-08 Thread Andrei Loginov (JIRA)
Andrei Loginov created AIRFLOW-5147:
---

 Summary: Annotations for k8s executors should support extended 
alphabet (like '/')) 
 Key: AIRFLOW-5147
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5147
 Project: Apache Airflow
  Issue Type: Bug
  Components: executor-kubernetes, executors
Affects Versions: 1.10.4, 1.10.3
Reporter: Andrei Loginov
Assignee: Daniel Imberman


The fix to introduce k8s annotations for executors 
([https://github.com/apache/airflow/pull/4589] for 
https://issues.apache.org/jira/browse/AIRFLOW-3766) limited the character set 
allowed for the annotation key to [-._a-zA-Z0-9] set. However many annotations 
contain `/`, for example: 
{code:java}
injector.tumblr.com/request{code}
{{}} or
{code:java}
iam.amazonaws.com/role{code}
Which would not be allowed in the current solution.
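
To make the rejection concrete, here is a small sketch that tests the keys from the issue against the character set quoted in the description. The second pattern simply adds `/` to that set for illustration; it is an assumption for this example, not the validation Airflow or Kubernetes actually applies.

```python
import re

# Character set quoted in the issue description.
OLD_KEY_PATTERN = re.compile(r"[-._a-zA-Z0-9]+")
# Illustrative only: the same set plus '/'. Not the full Kubernetes rule.
EXTENDED_KEY_PATTERN = re.compile(r"[-._a-zA-Z0-9/]+")


def accepted(pattern: re.Pattern, key: str) -> bool:
    """Return True when the whole key consists of allowed characters."""
    return pattern.fullmatch(key) is not None


keys = ["injector.tumblr.com/request", "iam.amazonaws.com/role", "plain-key"]
old_results = {key: accepted(OLD_KEY_PATTERN, key) for key in keys}
new_results = {key: accepted(EXTENDED_KEY_PATTERN, key) for key in keys}

for key in keys:
    print(f"{key}: old={old_results[key]} extended={new_results[key]}")
```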





[GitHub] [airflow] OmerJog commented on issue #4294: [AIRFLOW-3451] Add UI button to refresh all dags

2019-08-08 Thread GitBox
OmerJog commented on issue #4294: [AIRFLOW-3451] Add UI button to refresh all 
dags
URL: https://github.com/apache/airflow/pull/4294#issuecomment-519602187
 
 
   @idavison ping




[jira] [Updated] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5143:
--
Fix Version/s: (was: 2.0.0)
   1.10.5

> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903137#comment-16903137
 ] 

ASF subversion and git services commented on AIRFLOW-5143:
--

Commit 0ee7967c9f0cce5ded00e3258c1ae1e6ffe23087 in airflow's branch 
refs/heads/v1-10-test from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=0ee7967 ]

[AIRFLOW-5143] Fix for potentially corrupted .jar (#5759)

(cherry picked from commit 8288cf1c12720ed64e8337b7a34493495b13931f)


> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.5
>
>






[GitHub] [airflow] potiuk commented on issue #5760: AIRFLOW-5139 Allow custom ES configs

2019-08-08 Thread GitBox
potiuk commented on issue #5760: AIRFLOW-5139 Allow custom ES configs
URL: https://github.com/apache/airflow/pull/5760#issuecomment-519597832
 
 
   Hey @dimberman -> it should all be fixed. The image is rebuilt in Dockerhub 
and all the subsequent builds will use the "proper" rat from the Docker image :)




[jira] [Comment Edited] (AIRFLOW-351) Failed to clear downstream tasks

2019-08-08 Thread Oleg Khavronin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903067#comment-16903067
 ] 

Oleg Khavronin edited comment on AIRFLOW-351 at 8/8/19 3:22 PM:


[~jackjack10]j, our Airflow instances run inside docker container (either 
local, or in GKE). So, operating system is Linux.


was (Author: okhavronin):
[~jackjack10]j, our Airflow instances run inside docker container (either 
local, or in GKE). So, OS is Linux.

> Failed to clear downstream tasks
> 
>
> Key: AIRFLOW-351
> URL: https://issues.apache.org/jira/browse/AIRFLOW-351
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: models, operators, webserver
>Affects Versions: 1.7.1.3
>Reporter: Adinata
>Priority: Major
>  Labels: subdag
> Attachments: dag_error.py, error.log, error_on_clear_dag.txt, 
> ubuntu-14-packages.log, ubuntu-16-oops.log, ubuntu-16-packages.log
>
>
> {code}
>   / (  ()   )  \___
>  /( (  (  )   _))  )   )\
>(( (   )()  )   (   )  )
>  ((/  ( _(   )   (   _) ) (  () )  )
> ( (  ( (_)   (((   )  .((_ ) .  )_
>( (  )(  (  ))   ) . ) (   )
>   (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
>   ( (  (   ) (  )   (  )) ) _)(   )  )  )
>  ( (  ( \ ) ((_  ( ) ( )  )   ) )  )) ( )
>   (  (   (  (   (_ ( ) ( _)  ) (  )  )   )
>  ( (  ( (  (  ) (_  )  ) )  _)   ) _( ( )
>   ((  (   )(( _)   _) _(_ (  (_ )
>(_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
>((__)\\||lll|l||///  \_))
> (   /(/ (  )  ) )\   )
>   (( ( ( | | ) ) )\   )
>(   /(| / ( )) ) ) )) )
>  ( ( _(|)_) )
>   (  ||\(|(|)|/|| )
> (|(||(||))
>   ( //|/l|||)|\\ \ )
> (/ / //  /|//\\  \ \  \ _)
> ---
> Node: 9889a7c79e9b
> ---
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in 
> wsgi_app
> response = self.full_dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in 
> full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in 
> handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in 
> full_dispatch_request
> rv = self.dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in 
> dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 68, 
> in inner
> return self._run_view(f, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 
> 367, in _run_view
> return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_login.py", line 755, in 
> decorated_view
> return func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 118, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 167, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/views.py", line 
> 1017, in clear
> include_upstream=upstream)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2870, 
> in sub_dag
> dag = copy.deepcopy(self)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2856, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/python2.7/copy.py", line 163, in deepcopy
> y = copier(x, memo)
>   File "/usr/lib/python2.7/copy.py", line 257, in _deepcopy_dict
> y[deepcopy(key, memo)] = deepcopy(value, memo)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1974, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/py

[jira] [Commented] (AIRFLOW-351) Failed to clear downstream tasks

2019-08-08 Thread Oleg Khavronin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903067#comment-16903067
 ] 

Oleg Khavronin commented on AIRFLOW-351:


[~jackjack10]j, our Airflow instances run inside docker container (either 
local, or in GKE). So, OS is Linux.

> Failed to clear downstream tasks
> 
>
> Key: AIRFLOW-351
> URL: https://issues.apache.org/jira/browse/AIRFLOW-351
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: models, operators, webserver
>Affects Versions: 1.7.1.3
>Reporter: Adinata
>Priority: Major
>  Labels: subdag
> Attachments: dag_error.py, error.log, error_on_clear_dag.txt, 
> ubuntu-14-packages.log, ubuntu-16-oops.log, ubuntu-16-packages.log
>
>
> {code}
>   / (  ()   )  \___
>  /( (  (  )   _))  )   )\
>(( (   )()  )   (   )  )
>  ((/  ( _(   )   (   _) ) (  () )  )
> ( (  ( (_)   (((   )  .((_ ) .  )_
>( (  )(  (  ))   ) . ) (   )
>   (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
>   ( (  (   ) (  )   (  )) ) _)(   )  )  )
>  ( (  ( \ ) ((_  ( ) ( )  )   ) )  )) ( )
>   (  (   (  (   (_ ( ) ( _)  ) (  )  )   )
>  ( (  ( (  (  ) (_  )  ) )  _)   ) _( ( )
>   ((  (   )(( _)   _) _(_ (  (_ )
>(_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
>((__)\\||lll|l||///  \_))
> (   /(/ (  )  ) )\   )
>   (( ( ( | | ) ) )\   )
>(   /(| / ( )) ) ) )) )
>  ( ( _(|)_) )
>   (  ||\(|(|)|/|| )
> (|(||(||))
>   ( //|/l|||)|\\ \ )
> (/ / //  /|//\\  \ \  \ _)
> ---
> Node: 9889a7c79e9b
> ---
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in 
> wsgi_app
> response = self.full_dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in 
> full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in 
> handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in 
> full_dispatch_request
> rv = self.dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in 
> dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 68, 
> in inner
> return self._run_view(f, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 
> 367, in _run_view
> return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_login.py", line 755, in 
> decorated_view
> return func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 118, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 167, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/views.py", line 
> 1017, in clear
> include_upstream=upstream)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2870, 
> in sub_dag
> dag = copy.deepcopy(self)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2856, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/python2.7/copy.py", line 163, in deepcopy
> y = copier(x, memo)
>   File "/usr/lib/python2.7/copy.py", line 257, in _deepcopy_dict
> y[deepcopy(key, memo)] = deepcopy(value, memo)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1974, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/python2.7/copy.py", line 190, in deepcopy
> y = _reconstruct(x, rv, 1, memo)
>   File "/usr/lib/python2.7/copy.py", line 334, in _reconstruct
> state = deepcopy(state, memo)
>   File "/usr/lib/p

[jira] [Updated] (AIRFLOW-5146) Add ability to monitor queueing duration of DAG runs

2019-08-08 Thread Christian (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian updated AIRFLOW-5146:
---
Description: 
As a system administrator I would like to monitor the system load based on the 
duration tasks reside in the {{QUEUED}} state. With this information it is 
possible to see potential bottlenecks which may be caused by, e.g., 
{{schedule_intervals}} that fall together.

Right now this information cannot be retrieved easily from the database without 
writing additional lines of code, to extract that information.

  was:
As a system administrator I would like to monitor the system load based on the 
duration tasks reside in the `QUEUED` state. With this information it is 
possible to see potential bottlenecks which may be caused by, e.g., 
`schedule_intervals` that fall together.

Right now this information cannot be retrieved easily from the database without 
writing additional lines of code, to extract that information.


> Add ability to monitor queueing duration of DAG runs
> 
>
> Key: AIRFLOW-5146
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5146
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: DagRun, database
>Affects Versions: 2.0.0, 1.10.5
>Reporter: Christian
>Priority: Minor
>
> As a system administrator I would like to monitor the system load based on 
> the duration tasks reside in the {{QUEUED}} state. With this information it 
> is possible to see potential bottlenecks which may be caused by, e.g., 
> {{schedule_intervals}} that fall together.
> Right now this information cannot be retrieved easily from the database 
> without writing additional lines of code, to extract that information.





[jira] [Created] (AIRFLOW-5146) Add ability to monitor queueing duration of DAG runs

2019-08-08 Thread Christian (JIRA)
Christian created AIRFLOW-5146:
--

 Summary: Add ability to monitor queueing duration of DAG runs
 Key: AIRFLOW-5146
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5146
 Project: Apache Airflow
  Issue Type: Wish
  Components: DagRun, database
Affects Versions: 2.0.0, 1.10.5
Reporter: Christian


As a system administrator I would like to monitor the system load based on the 
duration tasks reside in the `QUEUED` state. With this information it is 
possible to see potential bottlenecks which may be caused by, e.g., 
`schedule_intervals` that fall together.

Right now this information cannot be retrieved easily from the database without 
writing additional lines of code, to extract that information.
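
As a rough illustration of the extra code the request refers to, the sketch below derives a per-task queueing duration from two timestamps. The row layout `(task_id, queued_at, started_at)` and the sample values are hypothetical and are not taken from Airflow's schema.

```python
from datetime import datetime, timedelta

# Hypothetical rows: (task_id, queued_at, started_at); started_at is None
# while the task is still waiting in the queue.
rows = [
    ("extract", datetime(2019, 8, 8, 12, 0, 0), datetime(2019, 8, 8, 12, 0, 5)),
    ("transform", datetime(2019, 8, 8, 12, 0, 0), datetime(2019, 8, 8, 12, 2, 0)),
    ("load", datetime(2019, 8, 8, 12, 0, 0), None),
]


def queue_durations(rows):
    """Map task_id to time spent queued, skipping tasks that have not started."""
    return {
        task_id: started_at - queued_at
        for task_id, queued_at, started_at in rows
        if started_at is not None
    }


durations = queue_durations(rows)
# The task that waited longest hints at a scheduling bottleneck.
slowest = max(durations, key=durations.get)
print(slowest, durations[slowest])
```

Tasks that share a schedule interval would show up here as a cluster of long waits with the same queueing time.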





[jira] [Commented] (AIRFLOW-5145) rbac ui presents false choice to encrypt or not encrypt variable values

2019-08-08 Thread Jon Stern (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903055#comment-16903055
 ] 

Jon Stern commented on AIRFLOW-5145:


Created [https://github.com/apache/airflow/pull/5761] to address.

> rbac ui presents false choice to encrypt or not encrypt variable values
> ---
>
> Key: AIRFLOW-5145
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5145
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.3, 1.10.4
>Reporter: Jon Stern
>Priority: Minor
>
> With webserver rbac=True and core fernet_key set, variable values are always 
> encrypted, but they are only marked as is_encrypted = true in the database if 
> the user explicitly checks the *Is Encrypted* checkbox on the variable create 
> screen. If a variable is set up this way, then it is not correctly displayed 
> in the UI and calls to Variable.get return the cipher text instead of the 
> decrypted value.
> Workarounds are:
>  * Explicitly check the box in the UI if you know you've got a fernet key set 
> up.
>  * Edit the variable after creation and re-enter the value. On the edit 
> screen there is no checkbox, and is_encrypted is properly set to true on 
> saving.
> Since the only right answer is is_encrypted = true, it seems like the 
> checkbox should not be displayed (as in the edit screen and as in the 
> non-rbac create screen)





[GitHub] [airflow] jstern opened a new pull request #5761: take false is_encrypted choice away from user

2019-08-08 Thread GitBox
jstern opened a new pull request #5761: take false is_encrypted choice away 
from user
URL: https://github.com/apache/airflow/pull/5761
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5145
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Don't display the *Is Encrypted* checkbox in the Variable add UI; the user's 
decision (explicit or implicit) here can incorrectly override the correct 
setting of this value when the model is saved.
   
   Current `/variable/add` in my local 1.10.3 instance:
   
   https://user-images.githubusercontent.com/617540/62714062-71c4e080-b9c3-11e9-8562-84c20742c10a.png
   
   After change in that instance:
   
   https://user-images.githubusercontent.com/617540/62714074-7b4e4880-b9c3-11e9-9319-deeabf640ede.png
   
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   * No tests added yet as this is a minor config change ... but I'm new to 
this, so I'm willing to stand corrected :)
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what they do
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[jira] [Resolved] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5143.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903049#comment-16903049
 ] 

ASF subversion and git services commented on AIRFLOW-5143:
--

Commit 8288cf1c12720ed64e8337b7a34493495b13931f in airflow's branch 
refs/heads/master from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8288cf1 ]

[AIRFLOW-5143] Fix for potentially corrupted .jar (#5759)



> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[GitHub] [airflow] potiuk merged pull request #5759: [AIRFLOW-5143] Fix for potentially corrupted .jar

2019-08-08 Thread GitBox
potiuk merged pull request #5759: [AIRFLOW-5143] Fix for potentially corrupted 
.jar
URL: https://github.com/apache/airflow/pull/5759
 
 
   




[jira] [Commented] (AIRFLOW-5143) Corrupted rat.jar became part of the Docker image

2019-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903047#comment-16903047
 ] 

ASF GitHub Bot commented on AIRFLOW-5143:
-

potiuk commented on pull request #5759: [AIRFLOW-5143] Fix for potentially 
corrupted .jar
URL: https://github.com/apache/airflow/pull/5759
 
 
   
 



> Corrupted rat.jar became part of the Docker image
> -
>
> Key: AIRFLOW-5143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-5145) rbac ui presents false choice to encrypt or not encrypt variable values

2019-08-08 Thread Jon Stern (JIRA)
Jon Stern created AIRFLOW-5145:
--

 Summary: rbac ui presents false choice to encrypt or not encrypt 
variable values
 Key: AIRFLOW-5145
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5145
 Project: Apache Airflow
  Issue Type: Bug
  Components: ui
Affects Versions: 1.10.4, 1.10.3
Reporter: Jon Stern


With webserver rbac=True and core fernet_key set, variable values are always 
encrypted, but they are only marked as is_encrypted = true in the database if 
the user explicitly checks the *Is Encrypted* checkbox on the variable create 
screen. If a variable is set up this way, then it is not correctly displayed in 
the UI and calls to Variable.get return the cipher text instead of the 
decrypted value.

Workarounds are:
 * Explicitly check the box in the UI if you know you've got a fernet key set 
up.
 * Edit the variable after creation and re-enter the value. On the edit screen 
there is no checkbox, and is_encrypted is properly set to true on saving.

Since the only right answer is is_encrypted = true, it seems like the checkbox 
should not be displayed (as in the edit screen and as in the non-rbac create 
screen)
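
The mismatch described above can be modelled with a toy example. This is a sketch under stated assumptions: base64 stands in for Fernet encryption, and `VariableRow`, `create_variable`, and `get_variable` are hypothetical names rather than Airflow's model code.

```python
import base64
from dataclasses import dataclass


def toy_encrypt(plain: str) -> str:
    """Stand-in for Fernet encryption (base64 only, NOT secure)."""
    return base64.urlsafe_b64encode(plain.encode()).decode()


def toy_decrypt(token: str) -> str:
    """Inverse of toy_encrypt."""
    return base64.urlsafe_b64decode(token.encode()).decode()


@dataclass
class VariableRow:
    key: str
    stored_value: str
    is_encrypted: bool


def create_variable(key: str, value: str, box_checked: bool) -> VariableRow:
    # With a key configured the value is always encrypted, but in the
    # reported behaviour the flag comes from the checkbox alone.
    return VariableRow(key, toy_encrypt(value), is_encrypted=box_checked)


def get_variable(row: VariableRow) -> str:
    # The reader trusts the flag, so a false flag yields cipher text.
    return toy_decrypt(row.stored_value) if row.is_encrypted else row.stored_value


unchecked = create_variable("db_password", "s3cret", box_checked=False)
checked = create_variable("db_password", "s3cret", box_checked=True)
print(get_variable(checked))    # decrypted value
print(get_variable(unchecked))  # cipher text leaks through
```

Removing the checkbox amounts to always setting the flag whenever the value is encrypted, which is the only consistent state.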




