[GitHub] [airflow] dstandish edited a comment on pull request #15100: Add support for arbitrary json in conn uri format

2021-03-31 Thread GitBox


dstandish edited a comment on pull request #15100:
URL: https://github.com/apache/airflow/pull/15100#issuecomment-811659712


   OK, so to be clear @mik-laj: are you in support of supporting arbitrary dicts 
under `extra`, just not primitives or lists?
   
   OR, do you _not_ support this PR at all, and think `extra` should just be a 
dict of strings, i.e. `'key1':'val1'` and so on, and _not_ 
`'key1':{'key3':'key4'}`?
   
   If you support this, I'll remove the tests for primitives and lists, so that 
the codebase does not explicitly endorse them.
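
   For readers skimming the thread, the two `extra` shapes under debate could be 
sketched like this (illustrative keys only, not Airflow code):

   ```python
   import json

   # Flat dict of strings: the only shape allowed if the PR is rejected.
   flat_extra = {"key1": "val1", "key2": "val2"}

   # Arbitrary nested JSON: the additional shape this PR would allow.
   nested_extra = {"key1": {"key3": "key4"}}

   # Either way, what ends up stored on the connection is a JSON string.
   flat_json = json.dumps(flat_extra)
   nested_json = json.dumps(nested_extra)
   print(flat_json)
   print(nested_json)
   ```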


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on pull request #15124: Minor Helm Chart doc enhancements

2021-03-31 Thread GitBox


ephraimbuddy commented on pull request #15124:
URL: https://github.com/apache/airflow/pull/15124#issuecomment-811660724


   There's a little mistake in 
https://github.com/apache/airflow/blob/master/docs/helm-chart/quick-start.rst 
for installing a release:
   `helm install airflow --n airflow .` should be 
   `helm install airflow -n airflow .`
   
   Wondering if we can add the fix to this PR?
   
   






[GitHub] [airflow] dstandish commented on a change in pull request #15013: Enable Connection creation from Vault parameters

2021-03-31 Thread GitBox


dstandish commented on a change in pull request #15013:
URL: https://github.com/apache/airflow/pull/15013#discussion_r605388747



##
File path: tests/providers/hashicorp/secrets/test_vault.py
##
@@ -59,6 +59,48 @@ def test_get_conn_uri(self, mock_hvac):
         returned_uri = test_client.get_conn_uri(conn_id="test_postgres")
         assert 'postgresql://airflow:airflow@host:5432/airflow' == returned_uri
 
+    @mock.patch("airflow.providers.hashicorp._internal_client.vault_client.hvac")
+    def test_get_connection(self, mock_hvac):
+        mock_client = mock.MagicMock()
+        mock_hvac.Client.return_value = mock_client
+        mock_client.secrets.kv.v2.read_secret_version.return_value = {
+            'request_id': '94011e25-f8dc-ec29-221b-1f9c1d9ad2ae',
+            'lease_id': '',
+            'renewable': False,
+            'lease_duration': 0,
+            'data': {
+                'data': {
+                    'conn_type': 'postgresql',
+                    'login': 'airflow',
+                    'password': 'airflow',
+                    'host': 'host',
+                    'port': '5432',
+                    'schema': 'airflow',
+                },
Review comment:
   It probably should include `extra`, since that's a case that should be 
supported.
   
   One thought I had: when storing creds in the "object" style, should we 
make it so that you can add values here at the top level and they'll 
automatically get added to `extra`?
   
   So, e.g., we could support this:
   ```
   'data': {
       'conn_type': 'postgresql',
       'login': 'airflow',
       'password': 'airflow',
       'host': 'host',
       'port': '5432',
       'myfield1': 'hello',
       'myfield2': 'hello again',
       'schema': 'airflow',
   },
   ```
   
   and then, when building the connection object, add these to the extra dict.
   
   It's maybe a little wacky, but it would allow for less nesting. And then, 
when defining the values in Vault, you wouldn't have to dump to JSON and store 
it under `extra` as a JSON string. Just a thought.
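
   That folding idea could be sketched roughly like this. Both `KNOWN_FIELDS` 
and `split_extra` are hypothetical names illustrating the suggestion, not 
actual Airflow code (the real Connection also stores `extra` as a JSON string, 
which a dict stands in for here):

   ```python
   # Known top-level Connection fields; everything else gets folded into `extra`.
   KNOWN_FIELDS = {"conn_type", "login", "password", "host", "port", "schema", "extra"}

   def split_extra(secret):
       """Move unknown top-level keys of a Vault secret under 'extra'."""
       conn_kwargs = {k: v for k, v in secret.items() if k in KNOWN_FIELDS}
       leftover = {k: v for k, v in secret.items() if k not in KNOWN_FIELDS}
       if leftover:
           extra = dict(conn_kwargs.get("extra") or {})
           extra.update(leftover)
           conn_kwargs["extra"] = extra
       return conn_kwargs

   secret = {
       "conn_type": "postgresql",
       "login": "airflow",
       "myfield1": "hello",
       "myfield2": "hello again",
   }
   print(split_extra(secret))
   ```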
   
   








[GitHub] [airflow] dstandish opened a new pull request #15125: Add latest only decorator

2021-03-31 Thread GitBox


dstandish opened a new pull request #15125:
URL: https://github.com/apache/airflow/pull/15125


   This decorator lets us implement latest-only behavior without adding an 
extraneous task.
   
   You can make any operator a latest-only operator by applying this decorator 
to `execute`.
   
   I thought for a minute about pulling the `is_latest_execution` logic out 
into a function shared by both the latest-only decorator and the latest-only 
operator, but the logic is just a few lines, so I decided it wasn't worth the 
added obfuscation and object proliferation.
   
   Please let me know what you think.
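
   The shape of such a decorator could be sketched like this. It is not the 
PR's actual code: the `is_latest` flag on the context is a stand-in for the 
real execution-date comparison, and `HelloOperator` is a toy class:

   ```python
   import functools

   def latest_only(execute):
       """Skip the wrapped execute() unless this run is the latest one."""
       @functools.wraps(execute)
       def wrapper(self, context):
           if not context.get("is_latest", True):
               return None  # the real decorator would mark the task as skipped
           return execute(self, context)
       return wrapper

   class HelloOperator:
       """Toy operator; any operator's execute() could be wrapped this way."""
       @latest_only
       def execute(self, context):
           return "ran"

   op = HelloOperator()
   print(op.execute({"is_latest": True}))
   ```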






[GitHub] [airflow] github-actions[bot] commented on pull request #15124: Minor Helm Chart doc enhancements

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15124:
URL: https://github.com/apache/airflow/pull/15124#issuecomment-811613157


   The PR is likely ready to be merged. No tests are needed, as no important 
environment files or python files were modified by it. However, committers 
might decide that the full test matrix is needed and add the 'full tests needed' 
label. Then you should rebase it to the latest master or amend the last commit 
of the PR, and push it with --force-with-lease.






[GitHub] [airflow] wolfier commented on issue #9975: max_active_runs = 1 can still create multiple active execution runs

2021-03-31 Thread GitBox


wolfier commented on issue #9975:
URL: https://github.com/apache/airflow/issues/9975#issuecomment-811586928


   I couldn't find any mention of this 
[AIRFLOW-2535](https://issues.apache.org/jira/browse/AIRFLOW-2535). 
   
   This was an issue for `TriggerDagRunOperator` as well in 1.10.10, where 
`max_active_runs` was not respected.
   
   PR [#13803](https://github.com/apache/airflow/pull/13803) does fix the issue 
in 1.10.15! 






[GitHub] [airflow] kaxil opened a new pull request #15124: Minor Helm Chart doc enhancements

2021-03-31 Thread GitBox


kaxil opened a new pull request #15124:
URL: https://github.com/apache/airflow/pull/15124


   - Better formatting for `tip`
   - Correct listing
   
   **Before**:
   
   
![image](https://user-images.githubusercontent.com/8811558/113229569-25f93600-928f-11eb-8ab8-668efcd63f4c.png)
   
   
![image](https://user-images.githubusercontent.com/8811558/113229595-33aebb80-928f-11eb-8137-e73fc7320c30.png)
   
   
   **After**:
   
   
![image](https://user-images.githubusercontent.com/8811558/113229584-2d204400-928f-11eb-853f-14a5e63bcc84.png)
   
   
![image](https://user-images.githubusercontent.com/8811558/113229612-3ad5c980-928f-11eb-850b-3827d3dbca56.png)
   
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   






[GitHub] [airflow] ShakaibKhan commented on issue #15004: Running upgrade_check command doesn't process DAGs located in zip files

2021-03-31 Thread GitBox


ShakaibKhan commented on issue #15004:
URL: https://github.com/apache/airflow/issues/15004#issuecomment-811558458


   Hey,
   I am trying to patch this issue and have a PR up: 
https://github.com/apache/airflow/pull/15123. I was wondering if I could get 
some guidance on testing my change.






[GitHub] [airflow] boring-cyborg[bot] commented on pull request #15123: check for zipped files in import rules for upgrade check

2021-03-31 Thread GitBox


boring-cyborg[bot] commented on pull request #15123:
URL: https://github.com/apache/airflow/pull/15123#issuecomment-811557830


   Congratulations on your first Pull Request, and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/master/docs/apache-airflow/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using the [Breeze 
environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for 
testing locally; it's a heavy Docker setup, but it ships with a working Airflow 
and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   






[GitHub] [airflow] ShakaibKhan opened a new pull request #15123: check for zipped files in import rules for upgrade check

2021-03-31 Thread GitBox


ShakaibKhan opened a new pull request #15123:
URL: https://github.com/apache/airflow/pull/15123


   This addresses #15004: it adds a zip file check to the import rules of the 
upgrade check.
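   
   The gist of looking inside zipped DAG files could be sketched like this. It 
is a rough illustration of the idea, not the PR's actual implementation, and 
`iter_dag_sources` is a hypothetical helper name:
   
   ```python
   import zipfile

   def iter_dag_sources(path):
       """Yield (name, source) pairs for python files at `path`.

       A .zip is treated as a bundle of .py members, so upgrade-check
       import rules can scan zipped DAGs as well as plain .py files.
       """
       if zipfile.is_zipfile(path):
           with zipfile.ZipFile(path) as zf:
               for name in zf.namelist():
                   if name.endswith(".py"):
                       yield name, zf.read(name).decode("utf-8")
       elif path.endswith(".py"):
           with open(path, encoding="utf-8") as f:
               yield path, f.read()
   ```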






[GitHub] [airflow] github-actions[bot] closed pull request #12617: Moved Contributors' Guide to CONTRIBUTING.rst file

2021-03-31 Thread GitBox


github-actions[bot] closed pull request #12617:
URL: https://github.com/apache/airflow/pull/12617


   






[GitHub] [airflow] github-actions[bot] commented on pull request #13894: critical class definition issue

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #13894:
URL: https://github.com/apache/airflow/pull/13894#issuecomment-811546288


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.






[GitHub] [airflow] Dr-Denzy commented on pull request #15122: Skip `DatabaseVersionCheckRule` check if invalid version is detected

2021-03-31 Thread GitBox


Dr-Denzy commented on pull request #15122:
URL: https://github.com/apache/airflow/pull/15122#issuecomment-811522320


   > When the version is something like `12.2 (Debian 12.2-2.pgdg100+1)`, the 
upgrade check errors with the following:
   > 
   > ```
   > Traceback (most recent call last):
   >   File "", line 1, in 
   >   File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", line 130, 
in __main__
   > run(args)
   >   File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", line 118, 
in run
   > all_problems = check_upgrade(formatter, rules)
   >   File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", line 38, 
in check_upgrade
   > rule_status = RuleStatus.from_rule(rule)
   >   File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/problem.py", line 44, 
in from_rule
   > result = rule.check()
   >   File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 
74, in wrapper
   > return func(*args, **kwargs)
   >   File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/rules/postgres_mysql_sqlite_version_upgrade_check.py",
 line 52, in check
   > installed_postgres_version = Version(session.execute('SHOW 
server_version;').scalar())
   >   File "/usr/local/lib/python3.7/site-packages/packaging/version.py", line 
298, in __init__
   > raise InvalidVersion("Invalid version: '{0}'".format(version))
   > packaging.version.InvalidVersion: Invalid version: '12.2 (Debian 
12.2-2.pgdg100+1)'
   > ```
   > 
   > This commit will SKIP the check during such occasions.
   > 
   > cc @Dr-Denzy
   > 
   
   Nice catch!






[GitHub] [airflow] ephraimbuddy commented on issue #14012: Changing state of task_instance to None

2021-03-31 Thread GitBox


ephraimbuddy commented on issue #14012:
URL: https://github.com/apache/airflow/issues/14012#issuecomment-811520833


   It's the state of the pod that's been set to None after execution, not the 
state of the task instance.






[GitHub] [airflow] jhtimmins edited a comment on pull request #14946: Standardize default fab perms

2021-03-31 Thread GitBox


jhtimmins edited a comment on pull request #14946:
URL: https://github.com/apache/airflow/pull/14946#issuecomment-811519584


   @ashb This updates the permission model, so #14840 will need to integrate 
these changes, which I'm happy to help with.
   Can you and @kaxil take a look?






[GitHub] [airflow] kaxil opened a new pull request #15122: Skip `DatabaseVersionCheckRule` check if invalid version is detected

2021-03-31 Thread GitBox


kaxil opened a new pull request #15122:
URL: https://github.com/apache/airflow/pull/15122


   When the version is something like `12.2 (Debian 12.2-2.pgdg100+1)`, the 
upgrade check errors with the following:
   
   ```
   Traceback (most recent call last):
 File "", line 1, in 
 File "/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", 
line 130, in __main__
   run(args)
 File "/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", 
line 118, in run
   all_problems = check_upgrade(formatter, rules)
 File "/usr/local/lib/python3.7/site-packages/airflow/upgrade/checker.py", 
line 38, in check_upgrade
   rule_status = RuleStatus.from_rule(rule)
 File "/usr/local/lib/python3.7/site-packages/airflow/upgrade/problem.py", 
line 44, in from_rule
   result = rule.check()
 File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 
74, in wrapper
   return func(*args, **kwargs)
 File 
"/usr/local/lib/python3.7/site-packages/airflow/upgrade/rules/postgres_mysql_sqlite_version_upgrade_check.py",
 line 52, in check
   installed_postgres_version = Version(session.execute('SHOW 
server_version;').scalar())
 File "/usr/local/lib/python3.7/site-packages/packaging/version.py", line 
298, in __init__
   raise InvalidVersion("Invalid version: '{0}'".format(version))
   packaging.version.InvalidVersion: Invalid version: '12.2 (Debian 
12.2-2.pgdg100+1)'
   ```
   
   This commit will SKIP the check during such occasions.
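   
   Besides skipping outright, another option would be to salvage the leading 
numeric token before handing the string to `packaging.version.Version`. A rough 
sketch (the function name is hypothetical, not part of this PR):
   
   ```python
   import re

   def extract_numeric_version(raw):
       """Pull the leading numeric version out of a SHOW server_version string.

       '12.2 (Debian 12.2-2.pgdg100+1)' -> '12.2'. Returns None when no
       numeric prefix exists, which a caller could treat as "skip the check".
       """
       match = re.match(r"\s*(\d+(?:\.\d+)*)", raw)
       return match.group(1) if match else None

   print(extract_numeric_version("12.2 (Debian 12.2-2.pgdg100+1)"))
   ```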
   
   cc @Dr-Denzy 
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   






[GitHub] [airflow] nkumar15 commented on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-03-31 Thread GitBox


nkumar15 commented on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-811507955


   Thanks @zdenulo! After reading your article, I finally realized that the 
awslogs_stream_prefix configuration should not have a leading / in it.
   
   So instead of `/ecs/hello-world` it should be `ecs/hello-world`.
   
   It worked. 
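   
   The fix boils down to a one-line guard before passing the value as 
`awslogs_stream_prefix` (the helper name is hypothetical, illustrating the 
comment above):
   
   ```python
   def normalize_stream_prefix(prefix):
       """Strip the leading '/' that breaks CloudWatch log stream discovery.

       '/ecs/hello-world' becomes 'ecs/hello-world'; already-valid
       prefixes pass through unchanged.
       """
       return prefix.lstrip("/")

   print(normalize_stream_prefix("/ecs/hello-world"))
   ```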






[airflow] branch master updated (b0e68eb -> 116a8a0)

2021-03-31 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from b0e68eb  Avoid scheduler/parser manager deadlock by using non-blocking 
IO (#15112)
 add 116a8a0  Remove duplicate call to sync_metadata inside 
DagFileProcessorManager (#15121)

No new revisions were added by this update.

Summary of changes:
 airflow/utils/dag_processing.py | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)


[GitHub] [airflow] ashb merged pull request #15121: Remove duplicate call to sync_metadata inside DagFileProcessorManager

2021-03-31 Thread GitBox


ashb merged pull request #15121:
URL: https://github.com/apache/airflow/pull/15121


   






[GitHub] [airflow] ashb merged pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb merged pull request #15112:
URL: https://github.com/apache/airflow/pull/15112


   






[GitHub] [airflow] ashb closed issue #7935: scheduler gets stuck without a trace

2021-03-31 Thread GitBox


ashb closed issue #7935:
URL: https://github.com/apache/airflow/issues/7935


   






[airflow] branch master updated (7417f81 -> b0e68eb)

2021-03-31 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 7417f81  Run UI tests selectively when UI files have changed (#15009)
 add b0e68eb  Avoid scheduler/parser manager deadlock by using non-blocking 
IO (#15112)

No new revisions were added by this update.

Summary of changes:
 airflow/utils/dag_processing.py| 32 +++
 pylintrc-tests |  4 +-
 tests/utils/test_dag_processing.py | 80 +-
 3 files changed, 106 insertions(+), 10 deletions(-)


[GitHub] [airflow] thekyz commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


thekyz commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811500469


   Thanks for the links, definitely appreciated!
   
   I managed to make it work by depending directly on the providers plus 
apache-airflow; that way the dependency is not expressed through the extras. 
   
   For reference, if somebody encounters the same issue, use this:
   ```
   apache-airflow-providers-cncf-kubernetes==1.0.2
   apache-airflow-providers-postgres==1.0.1
   apache-airflow[statsd]==2.0.1
   ```
   
   instead of this:
   ```
   apache-airflow[statsd,postgres,kubernetes]==2.0.1
   ```
   
   Thanks again @potiuk for the support!






[airflow] branch master updated: Run UI tests selectively when UI files have changed (#15009)

2021-03-31 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 7417f81  Run UI tests selectively when UI files have changed (#15009)
7417f81 is described below

commit 7417f81d36ad02c2a9d7feb9b9f881610f50ceba
Author: Ash Berlin-Taylor 
AuthorDate: Wed Mar 31 23:10:00 2021 +0100

Run UI tests selectively when UI files have changed (#15009)

If we only change the new React UI files, we don't want to have to run
the whole python test suite!
---
 .github/workflows/ci.yml   | 37 ++
 .pre-commit-config.yaml|  7 ++
 BREEZE.rst |  2 +-
 STATIC_CODE_CHECKS.rst |  2 ++
 breeze-complete|  1 +
 scripts/ci/selective_ci_checks.sh  | 46 +++---
 scripts/ci/static_checks/eslint.sh | 31 +
 7 files changed, 122 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 49fb2e7..3f90b43 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -147,6 +147,7 @@ jobs:
   sqliteExclude: ${{ steps.selective-checks.outputs.sqlite-exclude }}
   kubernetesExclude: ${{ steps.selective-checks.outputs.kubernetes-exclude 
}}
   run-tests: ${{ steps.selective-checks.outputs.run-tests }}
+  run-ui-tests: ${{ steps.selective-checks.outputs.run-ui-tests }}
   run-kubernetes-tests: ${{ 
steps.selective-checks.outputs.run-kubernetes-tests }}
   basic-checks-only: ${{ steps.selective-checks.outputs.basic-checks-only 
}}
   image-build: ${{ steps.selective-checks.outputs.image-build }}
@@ -344,6 +345,12 @@ 
pre-commit-local-installation-${{steps.host-python-version.outputs.host-python-v
   key: 
"pre-commit-no-pylint-${{steps.host-python-version.outputs.host-python-version}}-\
 ${{ hashFiles('.pre-commit-config.yaml') }}"
   restore-keys: 
pre-commit-no-pylint-${{steps.host-python-version.outputs.host-python-version}}
+
+  - name: "Cache eslint"
+uses: actions/cache@v2
+with:
+  path: 'airflow/ui/node_modules'
+  key: ${{ runner.os }}-ui-node-modules-${{ 
hashFiles('airflow/ui/**/yarn.lock') }}
   - name: "Static checks: except pylint"
 run: ./scripts/ci/static_checks/run_static_checks.sh
 env:
@@ -1264,3 +1271,33 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
   tags: true
   force: true
   branch: master
+
+  tests-ui:
+timeout-minutes: 10
+name: React UI tests
+runs-on: ${{ fromJson(needs.build-info.outputs.runsOn) }}
+needs: [build-info, ci-images]
+env:
+  GITHUB_REGISTRY: ${{ needs.ci-images.outputs.githubRegistry }}
+if: needs.build-info.outputs.run-ui-tests == 'true'
+steps:
+  - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
+uses: actions/checkout@v2
+with:
+  persist-credentials: false
+  submodules: recursive
+  - name: "Setup node"
+uses: actions/setup-node@v2
+with:
+  node-version: 14
+  - name: "Free space"
+run: ./scripts/ci/tools/ci_free_space_on_ci.sh
+  - name: "Prepare CI image ${{env.PYTHON_MAJOR_MINOR_VERSION}}:${{ 
env.GITHUB_REGISTRY_PULL_IMAGE_TAG }}"
+run: ./scripts/ci/images/ci_prepare_ci_image_on_ci.sh
+  - name: "Cache eslint"
+uses: actions/cache@v2
+with:
+  path: 'airflow/ui/node_modules'
+  key: ${{ runner.os }}-ui-node-modules-${{ 
hashFiles('airflow/ui/**/yarn.lock') }}
+  - run: yarn --cwd airflow/ui/ install --frozen-lockfile --non-interactive
+  - run: yarn --cwd airflow/ui/ run test
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index e3cc75a..50e18ed 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -620,6 +620,13 @@ repos:
 entry: "./scripts/ci/pre_commit/pre_commit_flake8.sh"
 files: \.py$
 pass_filenames: true
+  - id: ui-lint
+name: ESLint against airflow/ui
+language: node
+'types_or': [javascript, tsx, ts]
+files: ^airflow/ui/
+entry: "scripts/ci/static_checks/eslint.sh"
+pass_filenames: false
   - id: bats-in-container-tests
 name: Run in container bats tests
 language: system
diff --git a/BREEZE.rst b/BREEZE.rst
index 55cc46a..2d0a346 100644
--- a/BREEZE.rst
+++ b/BREEZE.rst
@@ -2298,7 +2298,7 @@ This is the current syntax for  `./breeze <./breeze>`_:
                 pre-commit-hook-names provide-create-sessions providers-init-file provider-yamls
                 pydevd pydocstyle pylint pylint-tests python-no-log-warn pyupgrade
                 restrict-start_date rst-backticks setup-order setup-extra-packages shellcheck
- 

[GitHub] [airflow] ashb merged pull request #15009: Run UI tests selectively

2021-03-31 Thread GitBox


ashb merged pull request #15009:
URL: https://github.com/apache/airflow/pull/15009


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb closed issue #14957: Run selective CI pipeline for UI-only PRs

2021-03-31 Thread GitBox


ashb closed issue #14957:
URL: https://github.com/apache/airflow/issues/14957


   






[GitHub] [airflow] github-actions[bot] commented on pull request #15009: Run UI tests selectively

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15009:
URL: https://github.com/apache/airflow/pull/15009#issuecomment-811498731


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] github-actions[bot] commented on pull request #14840: Add CUD REST API endpoints for Roles

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #14840:
URL: https://github.com/apache/airflow/pull/14840#issuecomment-811493957


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[airflow] branch master updated (0e024c4 -> a2d9929)

2021-03-31 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 0e024c4  Fixed all remaining code usage of old task import (#15118)
 add a2d9929  Add REST API query sort and order to some endpoints (#14895)

No new revisions were added by this update.

Summary of changes:
 .../api_connexion/endpoints/connection_endpoint.py | 10 ++-
 .../api_connexion/endpoints/dag_run_endpoint.py| 26 +-
 .../api_connexion/endpoints/event_log_endpoint.py  | 19 -
 .../endpoints/import_error_endpoint.py | 16 ++--
 airflow/api_connexion/endpoints/pool_endpoint.py   | 10 ++-
 .../endpoints/role_and_permission_endpoint.py  | 10 ++-
 airflow/api_connexion/endpoints/task_endpoint.py   | 11 ++-
 airflow/api_connexion/endpoints/user_endpoint.py   | 19 -
 .../api_connexion/endpoints/variable_endpoint.py   | 17 ++--
 airflow/api_connexion/openapi/v1.yaml  | 28 ++-
 airflow/api_connexion/parameters.py| 19 +
 airflow/api_connexion/schemas/dag_run_schema.py|  1 +
 .../endpoints/test_connection_endpoint.py  | 49 
 .../endpoints/test_dag_run_endpoint.py | 93 ++
 .../endpoints/test_event_log_endpoint.py   | 65 ++-
 .../endpoints/test_import_error_endpoint.py| 58 ++
 .../api_connexion/endpoints/test_pool_endpoint.py  | 43 ++
 .../endpoints/test_role_and_permission_endpoint.py |  8 ++
 .../api_connexion/endpoints/test_task_endpoint.py  | 75 +++--
 .../api_connexion/endpoints/test_user_endpoint.py  |  9 +++
 .../endpoints/test_variable_endpoint.py| 10 +++
 21 files changed, 551 insertions(+), 45 deletions(-)


[GitHub] [airflow] ashb merged pull request #14895: Add REST API query sort and order to some endpoints

2021-03-31 Thread GitBox


ashb merged pull request #14895:
URL: https://github.com/apache/airflow/pull/14895


   






[airflow] branch master updated: Fixed all remaining code usage of old task import (#15118)

2021-03-31 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 0e024c4  Fixed all remaining code usage of old task import (#15118)
0e024c4 is described below

commit 0e024c4a79ed5843d7eb6d53fc5f6e4950d2f42a
Author: Andrew Godwin 
AuthorDate: Wed Mar 31 15:23:46 2021 -0600

Fixed all remaining code usage of old task import (#15118)

This finishes the job of removing all imports of task from
`operators.python` and replaces them with the new import from
`decorators`.
---
 tests/models/test_dag.py   | 2 +-
 tests/models/test_dagparam.py  | 2 +-
 tests/utils/test_task_group.py | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/models/test_dag.py b/tests/models/test_dag.py
index d2d0f9a..7f00134 100644
--- a/tests/models/test_dag.py
+++ b/tests/models/test_dag.py
@@ -38,6 +38,7 @@ from parameterized import parameterized
 
 from airflow import models, settings
 from airflow.configuration import conf
+from airflow.decorators import task as task_decorator
 from airflow.exceptions import AirflowException, DuplicateTaskIdFound
 from airflow.models import DAG, DagModel, DagRun, DagTag, TaskFail, 
TaskInstance as TI
 from airflow.models.baseoperator import BaseOperator
@@ -45,7 +46,6 @@ from airflow.models.dag import dag as dag_decorator
 from airflow.models.dagparam import DagParam
 from airflow.operators.bash import BashOperator
 from airflow.operators.dummy import DummyOperator
-from airflow.operators.python import task as task_decorator
 from airflow.operators.subdag import SubDagOperator
 from airflow.security import permissions
 from airflow.utils import timezone
diff --git a/tests/models/test_dagparam.py b/tests/models/test_dagparam.py
index 1eb28cb..618eb88 100644
--- a/tests/models/test_dagparam.py
+++ b/tests/models/test_dagparam.py
@@ -19,8 +19,8 @@
 import unittest
 from datetime import timedelta
 
+from airflow.decorators import task
 from airflow.models.dag import DAG
-from airflow.operators.python import task
 from airflow.utils import timezone
 from airflow.utils.state import State
 from airflow.utils.types import DagRunType
diff --git a/tests/utils/test_task_group.py b/tests/utils/test_task_group.py
index 0bc80e3..de227e4 100644
--- a/tests/utils/test_task_group.py
+++ b/tests/utils/test_task_group.py
@@ -269,7 +269,7 @@ def test_build_task_group_with_task_decorator():
 """
 Test that TaskGroup can be used with the @task decorator.
 """
-from airflow.operators.python import task
+from airflow.decorators import task
 
 @task
 def task_1():


[GitHub] [airflow] ashb merged pull request #15118: Fixed all remaining usage of the old task import

2021-03-31 Thread GitBox


ashb merged pull request #15118:
URL: https://github.com/apache/airflow/pull/15118


   






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811474838


   And here is explanation about airflow's extras 
http://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html - 
it's rather straightforward and helpful I think






[GitHub] [airflow] potiuk edited a comment on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk edited a comment on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811473508


   https://stackoverflow.com/q/52474931/516701 gives a brief explanation about 
extras. I think it makes sense to at least take a brief look if you are going to 
use them to understand what they are 






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811473508


   https://stackoverflow.com/q/52474931/516701 gives a brief explanation about 
extras. I think it makes sense to at least take a brief look if you are going to 
use them to understand what they are 






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811472295


   And answering your question - yes. There is no point in installing a 
provider without Airflow, so it makes perfect sense to have this dependency. 
It also prevents installing providers with Airflow 1.10 (as there is a 2.0 
apache-airflow dependency).






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811470959


   There is no need to install both provider and extra. Extra is just a legacy 
way to install providers, but you can simply install 'apache-airflow' (no 
extra) and 'apache-airflow-providers-cncf-kubernetes' as two separate packages. 
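
   For illustration, the two routes described above look like this as pip 
commands (a sketch only — pin versions and use constraints as appropriate for 
your environment):

   ```shell
   # Legacy route: the extra pulls the provider in as a dependency
   pip install "apache-airflow[cncf.kubernetes]"

   # Equivalent route without the extra: two separate packages
   pip install apache-airflow apache-airflow-providers-cncf-kubernetes
   ```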






[GitHub] [airflow] ashb opened a new pull request #15121: Remove duplicate call to sync_metadata inside DagFileProcessorManager

2021-03-31 Thread GitBox


ashb opened a new pull request #15121:
URL: https://github.com/apache/airflow/pull/15121


   `_process_message` already calls sync_metadata if it's a DagParsingStat,
   so there's no need to call it again.  This isn't expensive, but there's no
   point in doing it.
   
   I've also added a runtime check to ensure this function is only used in
   sync mode, which makes the purpose/logic used in this function clearer.






[GitHub] [airflow] thekyz commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


thekyz commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811463347


   Well I'm using pip through rules_python (a set of bazel rules to use pip 
libraries; it actually uses pip wheel to get the modules).
   
   The download works fine, but then apache-airflow-providers-cncf-kubernetes 
declares a dependency on apache-airflow, which itself declares a dependency on 
apache-airflow-providers-cncf-kubernetes when installed with the extra, thus 
making bazel complain because of the circular dependency.
   
   Since the issue had been addressed above for the other (albeit preinstalled) 
providers, I thought this would be a similar issue. I'm sorry I'm not very 
knowledgeable on pip so I'm not aware of the underlying difference between the 
preinstalled providers and the extras.
   
   Is that dependency from apache-airflow-providers-cncf-kubernetes towards 
apache-airflow required for regular pip install?






[GitHub] [airflow] kaxil commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


kaxil commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605216893



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   lol yeah, good catch @XD-DENG `self. _sync_metadata` is redundant here








[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605215817



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   Yeah, that looks like a bug. PR comin.








[GitHub] [airflow] github-actions[bot] commented on pull request #14640: Allow ExternalTaskSensor to wait for taskgroup

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #14640:
URL: https://github.com/apache/airflow/pull/14640#issuecomment-811459419


   [The Workflow run](https://github.com/apache/airflow/actions/runs/706274159) 
is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static 
checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm 
tests$,^Test OpenAPI*.






[GitHub] [airflow] github-actions[bot] commented on pull request #14640: Allow ExternalTaskSensor to wait for taskgroup

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #14640:
URL: https://github.com/apache/airflow/pull/14640#issuecomment-811459103


   [The Workflow run](https://github.com/apache/airflow/actions/runs/706272731) 
is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static 
checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm 
tests$,^Test OpenAPI*.






[GitHub] [airflow] ashb commented on pull request #14946: Standardize default fab perms

2021-03-31 Thread GitBox


ashb commented on pull request #14946:
URL: https://github.com/apache/airflow/pull/14946#issuecomment-811457323


   FYI: This PR and https://github.com/apache/airflow/pull/14840 "fight"






[GitHub] [airflow] XD-DENG commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


XD-DENG commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605210067



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   > Checking `self._sync_metadata` also leads me to something else:
   > 
   > 
https://github.com/apache/airflow/blob/1bec3b21266764f367431ab5d9a5b75f52b9b6d2/airflow/utils/dag_processing.py#L323-L326
   > 
   > is doing exactly the same thing in `self._process_message()`, which has 
been invoked just before L323-326.
   > A clean-up refactor to do? Or I misunderstood something?
   
   Thanks @ashb . How about this question?
   
   We can address it separately of course, but want to confirm if I overlooked 
anything regarding this.








[GitHub] [airflow] ashb commented on issue #15116: Logging to S3 fails using eventlet workers (maximum recursion depth exceeded)

2021-03-31 Thread GitBox


ashb commented on issue #15116:
URL: https://github.com/apache/airflow/issues/15116#issuecomment-811450727


   Long term the "fix" for us is probably to wait for Flask to support asyncio 
and then to drop support for gevent/eventlet.






[GitHub] [airflow] ashb commented on issue #15097: Errors when launching many pods simultaneously on GKE

2021-03-31 Thread GitBox


ashb commented on issue #15097:
URL: https://github.com/apache/airflow/issues/15097#issuecomment-811450057


   I agree with you that cases like this where Airflow was never even able to 
_start_ the task don't feel like they should "consume" a retry attempt.






[GitHub] [airflow] ashb commented on issue #15097: Errors when launching many pods simultaneously on GKE

2021-03-31 Thread GitBox


ashb commented on issue #15097:
URL: https://github.com/apache/airflow/issues/15097#issuecomment-811449743


   @SamWheating Assigned, thanks






[GitHub] [airflow] ashb commented on issue #15119: Flag on CLI airflow tasks test to attach debugger

2021-03-31 Thread GitBox


ashb commented on issue #15119:
URL: https://github.com/apache/airflow/issues/15119#issuecomment-811448411


   Are you aware of the `dag.cli()` function? That might help you.
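
   For reference, a DAG file using `dag.cli()` might look like this (a sketch 
that assumes a working Airflow 2 install; the DAG and task names are made up — 
the point is only that the file can then be run directly, e.g. under an IDE 
debugger, exposing a CLI scoped to this one DAG):

   ```python
   # debug_example_dag.py -- hypothetical example, not from the thread
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   from airflow.utils.dates import days_ago

   with DAG("debug_example", start_date=days_ago(1), schedule_interval=None) as dag:
       BashOperator(task_id="hello", bash_command="echo hello")

   if __name__ == "__main__":
       # Running this file directly drives the CLI against this specific DAG,
       # so breakpoints set in the IDE are hit without extra artifacts.
       dag.cli()
   ```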






[GitHub] [airflow] dstandish opened a new pull request #15120: Do not log info when ssm secret not found

2021-03-31 Thread GitBox


dstandish opened a new pull request #15120:
URL: https://github.com/apache/airflow/pull/15120


   Not finding a connection in a particular backend is not an event that 
merits logging.  That's why we have 3 backends in the search path.
   
   It's only when the connection is not found in any of the 3 backends that 
something exceptional is happening.
   
   Log level should be DEBUG.
   
   Also, use of the word 'error' is misleading, because not finding a parameter 
is normal.
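
   The search-path behaviour described above can be sketched like this (a toy 
stand-in with hypothetical names, not Airflow's actual code — it only 
illustrates why a per-backend miss belongs at DEBUG level):

   ```python
   import logging

   logging.basicConfig(level=logging.DEBUG)
   log = logging.getLogger("secrets")


   class DictBackend:
       """Toy secrets backend: returns None when a conn_id is not found."""

       def __init__(self, name, data):
           self.name = name
           self.data = data

       def get_conn_uri(self, conn_id):
           return self.data.get(conn_id)


   def find_connection(conn_id, backends):
       for backend in backends:
           uri = backend.get_conn_uri(conn_id)
           if uri is not None:
               return uri
           # A miss in one backend is routine -> DEBUG, not ERROR
           log.debug("Connection %s not found in backend %s", conn_id, backend.name)
       # Only a miss in *every* backend is worth surfacing loudly
       log.warning("Connection %s not found in any backend", conn_id)
       return None


   backends = [
       DictBackend("ssm", {}),
       DictBackend("env", {}),
       DictBackend("metastore", {"my_db": "postgres://example/db"}),
   ]
   print(find_connection("my_db", backends))  # prints postgres://example/db
   ```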






[GitHub] [airflow] GuillaumeDesforges opened a new issue #15119: Flag on CLI airflow tasks test to attach debugger

2021-03-31 Thread GitBox


GuillaumeDesforges opened a new issue #15119:
URL: https://github.com/apache/airflow/issues/15119


   **Description**
   
   Any way to allow to attach a debugger when using `airflow tasks test`
   
   **Use case / motivation**
   
   Following [discussion](https://github.com/apache/airflow/discussions/14621):
   
   I generate DAGs dynamically using a database.
   I want to debug tasks.
   
   The doc ([Debug 
Executor](https://airflow.apache.org/docs/apache-airflow/stable/executor/debug.html))
 suggests to add
   
   ```python
   if __name__ == '__main__':
 from airflow.utils.state import State
 dag.clear(dag_run_state=State.NONE)
 dag.run()
   ```
   and run from IDE.
   
   The above code snippet is something I'd rather avoid:
   * it is hard to select the right dag and tasks when they are generated 
automatically
   * it's a code artifact I'd rather avoid having, because every time I test 
another dag, task or execution time, it creates git diffs
   
   **Are you willing to submit a PR?**
   
   No
   
   **Related Issues**
   
   Haven't seen any (?)






[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605198494



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   Yes, done, was only used in one test, and that was strictly speaking not 
testing the right thing anymore, just a likely side-effect.
   
   ```diff
   -assert result.num_file_paths == 3
   +assert sum(stat.run_count for stat in manager._file_stats.values()) 
== 3
   ```
   
   See 4d8da57eb for full diff.
   
   Good call @XD-DENG 
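
   The non-blocking IO pattern named in the PR title can be sketched with the 
standard library alone (a minimal stand-in, not Airflow's actual code; 
`DagParsingStat` here only mirrors the small-message idea from the diff — send 
a count over the pipe rather than the full path list):

   ```python
   import multiprocessing as mp
   from typing import NamedTuple


   class DagParsingStat(NamedTuple):
       # Stand-in for the real class: carry a count instead of the full
       # list of file paths, keeping the pickled message small
       num_file_paths: int
       done: bool


   def parsing_manager(conn):
       # The "manager" side: report progress, then exit
       conn.send(DagParsingStat(num_file_paths=3, done=True))
       conn.close()


   if __name__ == "__main__":
       parent_conn, child_conn = mp.Pipe()
       proc = mp.Process(target=parsing_manager, args=(child_conn,))
       proc.start()

       stats = []
       # Non-blocking read loop: poll() with a timeout instead of a bare
       # recv(), so the "scheduler" side never blocks indefinitely
       while not stats:
           if parent_conn.poll(timeout=0.1):
               stats.append(parent_conn.recv())

       proc.join()
       print(stats[0].num_file_paths)  # prints 3
   ```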








[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605194036



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   Fair point yes, this is only used for tests and could possibly be 
removed. Let me take a look.








[GitHub] [airflow] ashb commented on pull request #15009: WIP: run UI tests

2021-03-31 Thread GitBox


ashb commented on pull request #15009:
URL: https://github.com/apache/airflow/pull/15009#issuecomment-811431004


   Note that _this_ PR will build everything as the CI scripts have changed.






[GitHub] [airflow] XD-DENG commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


XD-DENG commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605181000



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   let's further challenge this part: why do we need `num_file_paths` (or 
the earlier `file_paths`) at all?
   
   Where we really need this `DagParsingStat` is `self._sync_metadata`, in 
which we don't use `num_file_paths` (or the earlier `file_paths`)








[GitHub] [airflow] XD-DENG commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


XD-DENG commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605184402



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   Checking `self._sync_metadata` also leads me to something else:
   
   
https://github.com/apache/airflow/blob/1bec3b21266764f367431ab5d9a5b75f52b9b6d2/airflow/utils/dag_processing.py#L323-L326
 is doing exactly the same thing in `self._process_message()`, which has been 
invoked just before L323-326. 
   
   A clean-up refactor to do? Or I misunderstood something?








[GitHub] [airflow] XD-DENG commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


XD-DENG commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605181000



##
File path: airflow/utils/dag_processing.py
##
@@ -141,7 +141,7 @@ def waitable_handle(self):
 class DagParsingStat(NamedTuple):
 """Information on processing progress"""
 
-file_paths: List[str]
+num_file_paths: int

Review comment:
   let's further challenge this part: why do we need `num_file_paths` or 
earlier `file_paths`?
   
   Where we really need this `DagParsingStat` is `self._sync_metadata`, in 
which we don't use `num_file_paths` or earlier `file_paths`








[GitHub] [airflow] github-actions[bot] commented on pull request #15086: Mark the test_scheduler_task_start_date as quarantined

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15086:
URL: https://github.com/apache/airflow/pull/15086#issuecomment-811399680


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[jira] [Comment Edited] (AIRFLOW-2535) TriggerDagRunOperator ignores max_active_runs, leading to scheduler breakdown

2021-03-31 Thread Chris (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312673#comment-17312673
 ] 

Chris edited comment on AIRFLOW-2535 at 3/31/21, 7:43 PM:
--

This is making me avoid the TriggerDagRunOperator like the plague. I think I'll 
stick with Sensors until this gets sorted out. I'm on Airflow 1.10.10. Will 
this happen in v2.0+?


was (Author: cdabel):
This is making me avoid the TriggerDagRunOperator like the plague. I think I'll 
stick with Sensors until this gets sorted out.

> TriggerDagRunOperator ignores max_active_runs, leading to scheduler breakdown
> -
>
> Key: AIRFLOW-2535
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2535
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators, scheduler
>Affects Versions: 1.9.0
> Environment: CeleryExec, Rabbitmq, mysql
>Reporter: Tylar Murray
>Priority: Minor
>
> `TriggerDagRunOperator` does not respect the `max_active_runs` setting. This 
> in itself is not a huge issue, but the scheduler breaks down if 
> `max_active_runs` is exceeded. Task scheduling becomes absurdly slow, quickly 
> leading to perpetual DAG buildup. 
> `TriggerDagRunOperator` could throw exception if `max_active_runs` will be 
> exceeded, or maybe dags could be put into a `queued` state rather than 
> directly into `running`?
> Alternatively: the scheduler probably shouldn't slow task scheduling when 
> `max_active_runs` is exceeded.
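The mitigation proposed above (putting excess runs into a `queued` state instead of directly into `running`) can be sketched in isolation. This is a hypothetical illustration of the ticket's idea, not Airflow's actual scheduler code; the names `admit_runs` and the run lists are mine:

```python
def admit_runs(pending, running, max_active_runs):
    """Hypothetical sketch of the ticket's proposal: start triggered runs
    only while the DAG is under max_active_runs; keep the rest queued."""
    free_slots = max(max_active_runs - len(running), 0)
    started = pending[:free_slots]       # promoted to "running"
    still_queued = pending[free_slots:]  # wait for capacity instead of piling up
    return running + started, still_queued

# Two runs already active with max_active_runs=3: only one more is admitted.
running, queued = admit_runs(["run3", "run4"], ["run1", "run2"], 3)
print(running, queued)  # ['run1', 'run2', 'run3'] ['run4']
```

Under this scheme a `TriggerDagRunOperator` call never pushes the DAG past its limit, so the scheduler slowdown described above would not be triggered.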



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] potiuk edited a comment on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk edited a comment on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811389466


   I don't believe it's an issue at all. Airflow is only officially installable 
with `pip`, and it works nicely when you run the installation including 
constraints in the way described here: 
http://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools.
 
   
   This is the only installation method that brings repeatable results. Other 
methods might fail, but we are not supporting the full range of tools that behave 
differently than pip. If you are using poetry, pipenv or other tools like that, 
you are pretty much on your own. If this is not a "serious issue" but "some way 
of using some tool", it's likely to be closed as invalid. So starting a 
discussion about it might be a better way to go. 
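The constraint files referenced in that link follow a predictable URL scheme. A small sketch of the pattern; the helper name is mine, and the version strings are illustrative:

```python
def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the constraints-file URL used with `pip install apache-airflow`,
    mirroring the pattern shown in the official installation docs."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )

# Used roughly as:
#   pip install "apache-airflow==2.0.1" --constraint "<url below>"
print(constraints_url("2.0.1", "3.8"))
```

Pinning via the constraint file is what makes the install repeatable: every transitive dependency resolves to the exact version Airflow was tested against.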






[GitHub] [airflow] potiuk edited a comment on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk edited a comment on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811389466


   I don't believe it's an issue at all. Airflow is only officially installable 
with `pip`, and it works nicely when you run the installation including 
constraints in the way described here: 
http://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools.
 
   
   This is the only installation method that brings repeatable results. Other 
methods might fail, but we are not supporting the full range of tools that behave 
differently than pip. If you are using poetry, pipenv or other tools like that, 
you are pretty much on your own. If this is not a "serious issue" but "some way 
of using some tool", it's likely to be closed as invalid. So starting a 
discussion about it might be a better way to go. 






[GitHub] [airflow] github-actions[bot] commented on pull request #15109: Store inventories in GitHub Action cache

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15109:
URL: https://github.com/apache/airflow/pull/15109#issuecomment-811390591


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811389466


   I don't believe it's an issue at all. Airflow is only officially installable 
with `pip`, and it works nicely when you run the installation including 
constraints in the way described here: 
http://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools.
 
   
   This is the only installation method that brings repeatable results. Other 
methods might fail, but we are not supporting the full range of tools that behave 
differently than pip. If you are using poetry, pipenv or other tools like that, 
you are pretty much on your own. If this is not a "serious issue" but "some way 
of using some tool", it's likely to be closed as invalid. So starting a 
discussion about it might be a better way to go. 






[jira] [Commented] (AIRFLOW-2535) TriggerDagRunOperator ignores max_active_runs, leading to scheduler breakdown

2021-03-31 Thread Chris (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312673#comment-17312673
 ] 

Chris commented on AIRFLOW-2535:


This is making me avoid the TriggerDagRunOperator like the plague. I think I'll 
stick with Sensors until this gets sorted out.

> TriggerDagRunOperator ignores max_active_runs, leading to scheduler breakdown
> -
>
> Key: AIRFLOW-2535
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2535
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators, scheduler
>Affects Versions: 1.9.0
> Environment: CeleryExec, Rabbitmq, mysql
>Reporter: Tylar Murray
>Priority: Minor
>
> `TriggerDagRunOperator` does not respect the `max_active_runs` setting. This 
> in itself is not a huge issue, but the scheduler breaks down if 
> `max_active_runs` is exceeded. Task scheduling becomes absurdly slow, quickly 
> leading to perpetual DAG buildup. 
> `TriggerDagRunOperator` could throw exception if `max_active_runs` will be 
> exceeded, or maybe dags could be put into a `queued` state rather than 
> directly into `running`?
> Alternatively: the scheduler probably shouldn't slow task scheduling when 
> `max_active_runs` is exceeded.





[GitHub] [airflow] XD-DENG commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


XD-DENG commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605163605



##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +524,90 @@ def test_dag_with_system_exit(self):
 
 manager._run_parsing_loop()
 
+result = None
 while parent_pipe.poll(timeout=None):
 result = parent_pipe.recv()
 if isinstance(result, DagParsingStat) and result.done:
 break
 
 # Three files in folder should be processed
-assert len(result.file_paths) == 3
+assert result.num_file_paths == 3
 
 with create_session() as session:
 assert session.query(DagModel).get(dag_id) is not None
 
+@conf_vars({('core', 'load_examples'): 'False'})
+@pytest.mark.backend("mysql", "postgres")
+@pytest.mark.timeout(60)
+def test_pipe_full_deadlock(self):
+dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+child_pipe, parent_pipe = multiprocessing.Pipe()
+
+# Shrink the buffers to exacerbate the problem!
+for fd in (parent_pipe.fileno(),):
+sock = socket.socket(fileno=fd)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+sock.detach()
+
+exit_event = threading.Event()
+
+# To test this behaviour we need something that continually fills the
+# parent pipe's bufffer (and keeps it full).

Review comment:
   ```suggestion
   # parent pipe's buffer (and keeps it full).
   ```
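The failure mode this test exercises, and the fix named in the PR title, can be reproduced standalone: with a tiny buffer, a blocking write hangs as soon as the peer stops reading, while a non-blocking write raises `BlockingIOError` and lets the loop carry on. A minimal sketch, assuming a POSIX `socketpair` stands in for the `multiprocessing.Pipe`:

```python
import os
import socket

# A connected socket pair stands in for the scheduler/parser-manager pipe.
writer, reader = socket.socketpair()
writer.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)  # shrink the buffer

# With a blocking socket this loop would hang once the buffer filled, because
# `reader` never drains it -- the deadlock described in the PR.
os.set_blocking(writer.fileno(), False)

sent = 0
try:
    while True:
        writer.send(b"x" * 512)
        sent += 1
except BlockingIOError:
    pass  # buffer full: fail fast instead of deadlocking

print(sent > 0)  # True
writer.close()
reader.close()
```

The test above takes the opposite tack (it keeps the pipe full from a thread to prove the manager no longer hangs), but the buffer-shrinking trick is the same.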








[GitHub] [airflow] thekyz commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


thekyz commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811315008


   Sorry, I jumped to conclusions here. I get exactly the same issue as 
described above when installing apache-airflow[kubernetes], so I thought it 
was the same root cause. Should I create a new ticket, then?






[GitHub] [airflow] github-actions[bot] commented on pull request #15117: Remove user_id from API schema

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15117:
URL: https://github.com/apache/airflow/pull/15117#issuecomment-811314368


   The PR is likely OK to be merged with just subset of tests for default 
Python and Database versions without running the full matrix of tests, because 
it does not modify the core of Airflow. If the committers decide that the full 
tests matrix is needed, they will add the label 'full tests needed'. Then you 
should rebase to the latest master or amend the last commit of the PR, and push 
it with --force-with-lease.






[GitHub] [airflow] Davide95 commented on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


Davide95 commented on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811302486


   > Yes, as the implementation currently stands.
   
   Ok, thanks for the clarification.
   
   > It is definitely a bit inconvenient, but why would you need to mix both 
kinds of cursors in the first place? It’s a nice feature to have, but 
_particularly tricky_ wouldn’t be my description of the situation.
   
   I'm inheriting a large and old codebase, and migrating everything could be an 
issue. Integrating different DAGs with different settings would be a nice 
feature to have.
   
   






[GitHub] [airflow] uranusjr commented on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


uranusjr commented on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811300433


   > Does this mean that if you want to use different cursor types you have to 
create different connections?
   
   Yes, as the implementation currently stands.
   
   > Isn't this behavior particularly tricky?
   
   It is definitely a bit inconvenient, but why would you need to mix both 
kinds of cursors in the first place? It’s a nice feature to have, but 
*particularly tricky* wouldn’t be my description of the situation.
   
   Nonetheless, it would still be a good feature request and contribution if 
the interface can be designed reasonably.






[GitHub] [airflow] Davide95 commented on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


Davide95 commented on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811295403


   Hi @uranusjr 
   > I believe the documentation is saying you need to set extra on your 
Connection, not the hook.
   
   Does this mean that if you want to use different cursor types you have to 
create different connections? Isn't this behavior particularly tricky?






[GitHub] [airflow] uranusjr commented on issue #8665: subdag is not able to correctly render {{ next_ds }} jinja template when start_date uses timezone info.

2021-03-31 Thread GitBox


uranusjr commented on issue #8665:
URL: https://github.com/apache/airflow/issues/8665#issuecomment-811290623


   The displayed time here is actually pretty weird. `Asia/Calcutta` is +05:30 
(`pendulum` seems to implement this correctly), so I'd expect the output to be 
`2020-04-29T05:30:00+00:00` (the same time as supplied by the user, converted 
to UTC). I'm less sure the two issues are related (still could be).
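For reference, the +05:30 arithmetic with the stdlib instead of `pendulum` (using `Asia/Kolkata`, the current IANA key for `Asia/Calcutta`; the wall-clock time below is illustrative, not taken from the issue):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# 11:00 local time in India (+05:30) corresponds to 05:30 UTC,
# matching the expected value quoted above.
local = datetime(2020, 4, 29, 11, 0, tzinfo=ZoneInfo("Asia/Kolkata"))
print(local.astimezone(ZoneInfo("UTC")).isoformat())  # 2020-04-29T05:30:00+00:00
```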






[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605095651



##
File path: tests/utils/test_dag_processing.py
##
@@ -17,10 +17,13 @@
 # under the License.
 
 import multiprocessing
+import logging

Review comment:
   ```suggestion
   import logging
   import multiprocessing
   ```
   








[airflow] branch master updated (6e99ae0 -> 1bec3b2)

2021-03-31 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 6e99ae0  Allow pathlib.Path in DagBag and various util fns (#15110)
 add 1bec3b2  Fix doc link permission name (#14972)

No new revisions were added by this update.

Summary of changes:
 airflow/security/permissions.py |  2 +-
 airflow/www/extensions/init_appbuilder_links.py | 10 +-
 airflow/www/security.py |  2 +-
 tests/www/test_security.py  |  2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)


[GitHub] [airflow] kaxil merged pull request #14972: Fix doc link permission name

2021-03-31 Thread GitBox


kaxil merged pull request #14972:
URL: https://github.com/apache/airflow/pull/14972


   






[GitHub] [airflow] github-actions[bot] commented on pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#issuecomment-811278484


   [The Workflow run](https://github.com/apache/airflow/actions/runs/705808662) 
is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static 
checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm 
tests$,^Test OpenAPI*.






[GitHub] [airflow] kaxil commented on a change in pull request #15109: Store inventories in GitHub Action cache

2021-03-31 Thread GitBox


kaxil commented on a change in pull request #15109:
URL: https://github.com/apache/airflow/pull/15109#discussion_r605092312



##
File path: scripts/ci/docs/ci_docs.sh
##
@@ -30,8 +30,7 @@ export PYTHONPATH=${AIRFLOW_SOURCES}
 
 pip install --upgrade pip==20.2.4
 
-pip install .[doc] --upgrade --constraint \
-
"https://raw.githubusercontent.com/apache/airflow/constraints-${DEFAULT_BRANCH}/constraints-${PYTHON_MAJOR_MINOR_VERSION}.txt;
+pip install .[doc] --upgrade --constraint "${AIRFLOW_SOURCES}/constraints.txt"

Review comment:
   Wait is this change required?








[GitHub] [airflow] potiuk commented on pull request #14176: Simplify CONTRIBUTING.rst

2021-03-31 Thread GitBox


potiuk commented on pull request #14176:
URL: https://github.com/apache/airflow/pull/14176#issuecomment-811276689


   Cool!






[GitHub] [airflow] SamWheating commented on issue #15097: Errors when launching many pods simultaneously on GKE

2021-03-31 Thread GitBox


SamWheating commented on issue #15097:
URL: https://github.com/apache/airflow/issues/15097#issuecomment-811274036


   Can you assign this issue to me and I'll open a PR to add some backoff 
logic to the operator?






[GitHub] [airflow] uranusjr commented on pull request #15109: Store inventories in GitHub Action cache

2021-03-31 Thread GitBox


uranusjr commented on pull request #15109:
URL: https://github.com/apache/airflow/pull/15109#issuecomment-811272928


   Cool!






[GitHub] [airflow] github-actions[bot] commented on pull request #15034: Taskgroup decorator

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15034:
URL: https://github.com/apache/airflow/pull/15034#issuecomment-811272342


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] potiuk commented on pull request #15109: Store inventories in GitHub Action cache

2021-03-31 Thread GitBox


potiuk commented on pull request #15109:
URL: https://github.com/apache/airflow/pull/15109#issuecomment-811272172


   > I'm not yet sure how best to implement the part pulling versions from 
constraints to fetch the correct inventory versions. I'll probably start by 
checking each listed documentation set and figuring out whether it has a 
versioned inventory (not so sure for e.g. boto).
   
   Actually, I think that's it. We do not need all those version changes when 
you look at it now. This will work beautifully as-is without any further 
changes.






[GitHub] [airflow] potiuk commented on pull request #15109: Store inventories in GitHub Action cache

2021-03-31 Thread GitBox


potiuk commented on pull request #15109:
URL: https://github.com/apache/airflow/pull/15109#issuecomment-811270930


   Nice! Way simpler than the initial idea :). Would you mind running `git 
commit --amend` and `git push --force` to re-push it (and changing it to 
non-draft)? This way we will see if the cache works (it seems to have been 
pushed successfully).






[GitHub] [airflow] uranusjr edited a comment on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


uranusjr edited a comment on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811268206


   That said, it is definitely a bit confusing that the hook just throws away 
unused arguments like this. Maybe it should raise an exception instead?






[GitHub] [airflow] uranusjr commented on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


uranusjr commented on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811268206


   That said, it is definitely a bit confusing that the hook just throws away 
unused arguments like this. Maybe it should throw an error instead?






[GitHub] [airflow] uranusjr commented on issue #15108: PostgresHook do not use the specified cursor

2021-03-31 Thread GitBox


uranusjr commented on issue #15108:
URL: https://github.com/apache/airflow/issues/15108#issuecomment-811267577


   I believe the documentation is saying you need to set `extra` on your 
*Connection*, not the hook.
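A sketch of what "set `extra` on the Connection" looks like in practice: the hook picks the cursor from the `cursor` key in the connection's extra JSON, so wanting two cursor types currently means defining two connections. The lookup helper below is illustrative, not the hook's actual code:

```python
import json

# Two hypothetical connections that differ only in their `extra` field.
conn_extras = {
    "postgres_default": json.dumps({}),                          # default cursor
    "postgres_dicts": json.dumps({"cursor": "realdictcursor"}),  # dict-style rows
}

def cursor_from_extra(extra: str, default: str = "standard") -> str:
    """Illustrative stand-in for the hook's lookup of extra_dejson['cursor']."""
    return json.loads(extra).get("cursor", default)

print(cursor_from_extra(conn_extras["postgres_dicts"]))    # realdictcursor
print(cursor_from_extra(conn_extras["postgres_default"]))  # standard
```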






[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605082015



##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
 manager._run_parsing_loop()
 
+result = None
 while parent_pipe.poll(timeout=None):
 result = parent_pipe.recv()
 if isinstance(result, DagParsingStat) and result.done:
 break
 
 # Three files in folder should be processed
-assert len(result.file_paths) == 3
+assert result.num_file_paths == 3
 
 with create_session() as session:
 assert session.query(DagModel).get(dag_id) is not None
 
+@conf_vars({('core', 'load_examples'): 'False'})
+@pytest.mark.backend("mysql", "postgres")
+def test_pipe_full_deadlock(self):
+dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+child_pipe, parent_pipe = multiprocessing.Pipe()
+
+# Shrink the buffers to exacerbate the problem!
+for fd in (parent_pipe.fileno(),):
+sock = socket.socket(fileno=fd)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+sock.detach()
+
+exit_event = threading.Event()
+
+# To test this behaviour we need something that continually fills the
+# parent pipe's bufffer (and keeps it full).
+def keep_pipe_full(pipe, exit_event):
+import logging
+
+n = 0
+while True:
+if exit_event.is_set():
+break
+
+req = CallbackRequest(str(dag_filepath))
+try:
+logging.debug("Sending CallbackRequests %d", n + 1)
+pipe.send(req)
+except TypeError:
+# This is actually the error you get when the parent pipe
+# is closed! Nicely handled, eh?
+break
+except OSError:
+break
+n += 1
+logging.debug("   Sent %d CallbackRequests", n)
+
+thread = threading.Thread(target=keep_pipe_full, args=(parent_pipe, 
exit_event))
+
+fake_processors = []
+
+def fake_processor_factory(*args, **kwargs):
+nonlocal fake_processors
+processor = 
FakeDagFileProcessorRunner._fake_dag_processor_factory(*args, **kwargs)
+fake_processors.append(processor)
+return processor
+
+manager = DagFileProcessorManager(
+dag_directory=dag_filepath,
+dag_ids=[],
+# A reasonable large number to ensure that we trigger the deadlock
+max_runs=100,
+processor_factory=fake_processor_factory,
+processor_timeout=timedelta(seconds=5),
+signal_conn=child_pipe,
+pickle_dags=False,
+async_mode=True,
+)
+
+try:
+thread.start()
+
+# If this completes without hanging, then the test is good!
+manager._run_parsing_loop()
+exit_event.set()
+finally:
+import logging

Review comment:
   
https://github.com/apache/airflow/pull/15112/commits/5e7405f5574b36e8bc9cf8df0d0d788ba7238ad4








[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605081881



##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
 manager._run_parsing_loop()
 
+result = None
 while parent_pipe.poll(timeout=None):
 result = parent_pipe.recv()
 if isinstance(result, DagParsingStat) and result.done:
 break
 
 # Three files in folder should be processed
-assert len(result.file_paths) == 3
+assert result.num_file_paths == 3
 
 with create_session() as session:
 assert session.query(DagModel).get(dag_id) is not None
 
+@conf_vars({('core', 'load_examples'): 'False'})
+@pytest.mark.backend("mysql", "postgres")
+def test_pipe_full_deadlock(self):
+dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+child_pipe, parent_pipe = multiprocessing.Pipe()
+
+# Shrink the buffers to exacerbate the problem!
+for fd in (parent_pipe.fileno(),):
+sock = socket.socket(fileno=fd)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+sock.detach()
+
+exit_event = threading.Event()
+
+# To test this behaviour we need something that continually fills the
+# parent pipe's bufffer (and keeps it full).
+def keep_pipe_full(pipe, exit_event):
+import logging

Review comment:
   
https://github.com/apache/airflow/pull/15112/commits/5e7405f5574b36e8bc9cf8df0d0d788ba7238ad4








[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605081483



##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
 manager._run_parsing_loop()
 
+result = None
 while parent_pipe.poll(timeout=None):
 result = parent_pipe.recv()
 if isinstance(result, DagParsingStat) and result.done:
 break
 
 # Three files in folder should be processed
-assert len(result.file_paths) == 3
+assert result.num_file_paths == 3
 
 with create_session() as session:
 assert session.query(DagModel).get(dag_id) is not None
 
+@conf_vars({('core', 'load_examples'): 'False'})
+@pytest.mark.backend("mysql", "postgres")
+def test_pipe_full_deadlock(self):
+dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+child_pipe, parent_pipe = multiprocessing.Pipe()
+
+# Shrink the buffers to exacerbate the problem!
+for fd in (parent_pipe.fileno(),):
+sock = socket.socket(fileno=fd)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+sock.detach()
+
+exit_event = threading.Event()
+
+# To test this behaviour we need something that continually fills the
+# parent pipe's bufffer (and keeps it full).
+def keep_pipe_full(pipe, exit_event):
+import logging
+
+n = 0
+while True:
+if exit_event.is_set():
+break
+
+req = CallbackRequest(str(dag_filepath))
+try:
+logging.debug("Sending CallbackRequests %d", n + 1)
+pipe.send(req)
+except TypeError:
+# This is actually the error you get when the parent pipe
+# is closed! Nicely handled, eh?
+break
+except OSError:
+break
+n += 1
+logging.debug("   Sent %d CallbackRequests", n)
+
+thread = threading.Thread(target=keep_pipe_full, args=(parent_pipe, 
exit_event))
+
+fake_processors = []
+
+def fake_processor_factory(*args, **kwargs):
+nonlocal fake_processors
+processor = 
FakeDagFileProcessorRunner._fake_dag_processor_factory(*args, **kwargs)
+fake_processors.append(processor)
+return processor
+
+manager = DagFileProcessorManager(
+dag_directory=dag_filepath,
+dag_ids=[],
+# A reasonable large number to ensure that we trigger the deadlock
+max_runs=100,
+processor_factory=fake_processor_factory,
+processor_timeout=timedelta(seconds=5),
+signal_conn=child_pipe,
+pickle_dags=False,
+async_mode=True,
+)
+
+try:
+thread.start()
+
+# If this completes without hanging, then the test is good!
+manager._run_parsing_loop()

Review comment:
   
https://github.com/apache/airflow/pull/15112/commits/5e7405f5574b36e8bc9cf8df0d0d788ba7238ad4








[GitHub] [airflow] potiuk edited a comment on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk edited a comment on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811266078


   > @potiuk any clue when airflow will get a new release with this issue fixed?
   
   It's already fixed in the released 2.0.1: 
https://github.com/apache/airflow/pull/13314
   
   > also looks like the issue is not fixed at least for 
apache-airflow-providers-cncf-kubernetes
   
   It's never been an issue for cncf-kubernetes. This was only an issue for the 
4 providers that are preinstalled: 'ftp', 'http', 'imap', 'sqlite'. Airflow has 
only an "extra" dependency on cncf.kubernetes, but extras do not count as 
"install-requires", so this is not a problem as far as I know (unless you 
specifically have some problem that you would like to describe). 
   
   
   






[GitHub] [airflow] potiuk commented on issue #14128: Circular package imports in Airflow 2.0

2021-03-31 Thread GitBox


potiuk commented on issue #14128:
URL: https://github.com/apache/airflow/issues/14128#issuecomment-811266078


   > @potiuk any clue when airflow will get a new release with this issue fixed?
   
   It's already fixed in the released 2.0.1: 
https://github.com/apache/airflow/pull/13314
   
   > also looks like the issue is not fixed at least for 
apache-airflow-providers-cncf-kubernetes
   
   It has never been an issue for cncf-kubernetes. This was only an issue for the 4 
providers that are preinstalled: 'ftp', 'http', 'imap', 'sqlite'. Airflow has 
only an "extra" dependency on cncf.kubernetes, and extras do not count as 
"install_requires", so this is not a problem as far as I know (unless you 
specifically have some problem that you would like to describe). 
   
   
   






[GitHub] [airflow] github-actions[bot] commented on pull request #15114: Pin pandas-gbq to <0.15.0

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15114:
URL: https://github.com/apache/airflow/pull/15114#issuecomment-811266066


   [The Workflow run](https://github.com/apache/airflow/actions/runs/705783193) 
is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static 
checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm 
tests$,^Test OpenAPI*.






[GitHub] [airflow] github-actions[bot] commented on pull request #15118: Fixed all remaining usage of the old task import

2021-03-31 Thread GitBox


github-actions[bot] commented on pull request #15118:
URL: https://github.com/apache/airflow/pull/15118#issuecomment-811264197


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605077882



##
File path: tests/utils/test_dag_processing.py
##
@@ -20,7 +20,9 @@
 import os

Review comment:
   ```suggestion
   import logging
   import os
   ```
   

##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
         manager._run_parsing_loop()
 
+        result = None
         while parent_pipe.poll(timeout=None):
             result = parent_pipe.recv()
             if isinstance(result, DagParsingStat) and result.done:
                 break
 
         # Three files in folder should be processed
-        assert len(result.file_paths) == 3
+        assert result.num_file_paths == 3
 
         with create_session() as session:
             assert session.query(DagModel).get(dag_id) is not None
 
+    @conf_vars({('core', 'load_examples'): 'False'})
+    @pytest.mark.backend("mysql", "postgres")
+    def test_pipe_full_deadlock(self):
+        dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+        child_pipe, parent_pipe = multiprocessing.Pipe()
+
+        # Shrink the buffers to exacerbate the problem!
+        for fd in (parent_pipe.fileno(),):
+            sock = socket.socket(fileno=fd)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+            sock.detach()
+
+        exit_event = threading.Event()
+
+        # To test this behaviour we need something that continually fills the
+        # parent pipe's buffer (and keeps it full).
+        def keep_pipe_full(pipe, exit_event):
+            import logging
+

Review comment:
   ```suggestion
   ```
   

##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
         manager._run_parsing_loop()
 
+        result = None
         while parent_pipe.poll(timeout=None):
            result = parent_pipe.recv()
            if isinstance(result, DagParsingStat) and result.done:
                break
 
         # Three files in folder should be processed
-        assert len(result.file_paths) == 3
+        assert result.num_file_paths == 3
 
         with create_session() as session:
             assert session.query(DagModel).get(dag_id) is not None
 
+    @conf_vars({('core', 'load_examples'): 'False'})
+    @pytest.mark.backend("mysql", "postgres")
+    def test_pipe_full_deadlock(self):
+        dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+        child_pipe, parent_pipe = multiprocessing.Pipe()
+
+        # Shrink the buffers to exacerbate the problem!
+        for fd in (parent_pipe.fileno(),):
+            sock = socket.socket(fileno=fd)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+            sock.detach()
+
+        exit_event = threading.Event()
+
+        # To test this behaviour we need something that continually fills the
+        # parent pipe's buffer (and keeps it full).
+        def keep_pipe_full(pipe, exit_event):
+            import logging
+
+            n = 0
+            while True:
+                if exit_event.is_set():
+                    break
+
+                req = CallbackRequest(str(dag_filepath))
+                try:
+                    logging.debug("Sending CallbackRequests %d", n + 1)
+                    pipe.send(req)
+                except TypeError:
+                    # This is actually the error you get when the parent pipe
+                    # is closed! Nicely handled, eh?
+                    break
+                except OSError:
+                    break
+                n += 1
+            logging.debug("   Sent %d CallbackRequests", n)
+
+        thread = threading.Thread(target=keep_pipe_full, args=(parent_pipe, exit_event))
+
+        fake_processors = []
+
+        def fake_processor_factory(*args, **kwargs):
+            nonlocal fake_processors
+            processor = FakeDagFileProcessorRunner._fake_dag_processor_factory(*args, **kwargs)
+            fake_processors.append(processor)
+            return processor
+
+        manager = DagFileProcessorManager(
+            dag_directory=dag_filepath,
+            dag_ids=[],
+            # A reasonably large number to ensure that we trigger the deadlock
+            max_runs=100,
+            processor_factory=fake_processor_factory,
+            processor_timeout=timedelta(seconds=5),
+            signal_conn=child_pipe,
+            pickle_dags=False,
+            async_mode=True,
+        )
+
+        try:
+            thread.start()
+
+            # If this completes without hanging, then the test is good!
+            manager._run_parsing_loop()
+            exit_event.set()
+        finally:
+            import 

[GitHub] [airflow] ashb commented on a change in pull request #15112: Avoid scheduler/parser manager deadlock by using non-blocking IO

2021-03-31 Thread GitBox


ashb commented on a change in pull request #15112:
URL: https://github.com/apache/airflow/pull/15112#discussion_r605076144



##
File path: tests/utils/test_dag_processing.py
##
@@ -521,17 +523,93 @@ def test_dag_with_system_exit(self):
 
         manager._run_parsing_loop()
 
+        result = None
         while parent_pipe.poll(timeout=None):
            result = parent_pipe.recv()
            if isinstance(result, DagParsingStat) and result.done:
                break
 
         # Three files in folder should be processed
-        assert len(result.file_paths) == 3
+        assert result.num_file_paths == 3
 
         with create_session() as session:
             assert session.query(DagModel).get(dag_id) is not None
 
+    @conf_vars({('core', 'load_examples'): 'False'})
+    @pytest.mark.backend("mysql", "postgres")
+    def test_pipe_full_deadlock(self):
+        dag_filepath = TEST_DAG_FOLDER / "test_scheduler_dags.py"
+
+        child_pipe, parent_pipe = multiprocessing.Pipe()
+
+        # Shrink the buffers to exacerbate the problem!
+        for fd in (parent_pipe.fileno(),):
+            sock = socket.socket(fileno=fd)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
+            sock.detach()
+
+        exit_event = threading.Event()
+
+        # To test this behaviour we need something that continually fills the
+        # parent pipe's buffer (and keeps it full).
+        def keep_pipe_full(pipe, exit_event):
+            import logging
+
+            n = 0
+            while True:
+                if exit_event.is_set():
+                    break
+
+                req = CallbackRequest(str(dag_filepath))
+                try:
+                    logging.debug("Sending CallbackRequests %d", n + 1)
+                    pipe.send(req)
+                except TypeError:
+                    # This is actually the error you get when the parent pipe
+                    # is closed! Nicely handled, eh?
+                    break
+                except OSError:
+                    break
+                n += 1
+            logging.debug("   Sent %d CallbackRequests", n)
+
+        thread = threading.Thread(target=keep_pipe_full, args=(parent_pipe, exit_event))
+
+        fake_processors = []
+
+        def fake_processor_factory(*args, **kwargs):
+            nonlocal fake_processors
+            processor = FakeDagFileProcessorRunner._fake_dag_processor_factory(*args, **kwargs)
+            fake_processors.append(processor)
+            return processor
+
+        manager = DagFileProcessorManager(
+            dag_directory=dag_filepath,
+            dag_ids=[],
+            # A reasonably large number to ensure that we trigger the deadlock
+            max_runs=100,
+            processor_factory=fake_processor_factory,
+            processor_timeout=timedelta(seconds=5),
+            signal_conn=child_pipe,
+            pickle_dags=False,
+            async_mode=True,
+        )
+
+        try:
+            thread.start()
+
+            # If this completes without hanging, then the test is good!
+            manager._run_parsing_loop()

Review comment:
   Oh in tests, yes we probably should. 
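   As an aside, the buffer-shrinking trick quoted in the diff above generalizes: with non-blocking sends, a full buffer surfaces as a `BlockingIOError` the caller can handle, instead of a silent hang. A minimal stdlib sketch of that behaviour (the names here are illustrative, not taken from the PR; exact buffer sizes are OS-dependent since kernels round `SO_SNDBUF` values):
   
   ```python
   import socket
   
   # Connected socket pair with shrunken buffers, mirroring the test's
   # trick for making the "pipe full" condition easy to reach.
   a, b = socket.socketpair()
   a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024)
   b.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
   a.setblocking(False)  # non-blocking IO: a full buffer raises instead of hanging
   
   sent = 0
   try:
       while True:  # keep writing while nothing drains the other end
           sent += a.send(b"x" * 1024)
   except BlockingIOError:
       # A blocking socket would deadlock here forever, since nothing
       # ever reads from `b`. Non-blocking IO lets the caller back off.
       pass
   
   a.close()
   b.close()
   ```
   
   This is the same failure mode the test provokes between scheduler and parser manager, reduced to a socket pair.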








[GitHub] [airflow] uranusjr commented on issue #15116: Logging to S3 fails using eventlet workers (maximum recursion depth exceeded)

2021-03-31 Thread GitBox


uranusjr commented on issue #15116:
URL: https://github.com/apache/airflow/issues/15116#issuecomment-811258624


   As a hack, maybe it would be possible to call something on the hook before 
gevent patches? The connection object (creation of which triggers `SSLContext` 
modifications) is cached, so once it’s created, subsequent S3 calls won’t 
modify `SSLContext` (I think).
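   
   The caching point is the crux of the hack: once the connection object exists, the `SSLContext`-touching setup never re-runs. A toy stdlib sketch of that pattern (`get_conn` and the context below are illustrative stand-ins, not boto3 or Airflow internals):
   
   ```python
   import functools
   import ssl
   
   @functools.lru_cache(maxsize=None)
   def get_conn():
       # Stand-in for the expensive, SSLContext-touching connection setup.
       # lru_cache ensures this body runs exactly once, so any context
       # mutation happens before the result is ever reused.
       return ssl.create_default_context()
   
   first = get_conn()
   second = get_conn()
   assert first is second  # later calls reuse the cached object untouched
   ```
   
   If the first call happens before the monkeypatch, every later call is a pure cache hit.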






[GitHub] [airflow] uranusjr commented on issue #15116: Logging to S3 fails using eventlet workers (maximum recursion depth exceeded)

2021-03-31 Thread GitBox


uranusjr commented on issue #15116:
URL: https://github.com/apache/airflow/issues/15116#issuecomment-811254578


   Seems like this is a known gevent issue: gevent/gevent#1777 and 
gevent/gevent#1531. Note that both issues have been closed without a fix.
   
   Reading the issues, it seems like this happens when the monkeypatch happens 
before an option update, so a fix in Airflow would need to delay the 
patch until after all possible context modifications. But since those modifications 
come from providers, they can literally happen anywhere, so unfortunately I don't 
see a good way to fix this in Airflow. Hopefully someone can provide a workable 
solution.





