[airflow] branch master updated (54019e8 -> 8505d2f)

2021-05-31 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 54019e8  Update the Python client version (#16191)
 add 8505d2f  Fix typo. (#16192)

No new revisions were added by this update.

Summary of changes:
 dev/airflow-github | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[GitHub] [airflow] potiuk merged pull request #16192: Fix typo.

2021-05-31 Thread GitBox


potiuk merged pull request #16192:
URL: https://github.com/apache/airflow/pull/16192


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk opened a new issue #16193: Building Prod image Python 3.6 on CI

2021-05-31 Thread GitBox


potiuk opened a new issue #16193:
URL: https://github.com/apache/airflow/issues/16193


   It seems we started having a problem with building the PROD image for Python 3.6 - it consistently fails.
   
   Needs investigation and a fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jhtimmins opened a new pull request #16192: Fix typo.

2021-05-31 Thread GitBox


jhtimmins opened a new pull request #16192:
URL: https://github.com/apache/airflow/pull/16192


   Remove unnecessary apostrophe


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #16189: Fix S3 Select payload join

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #16189:
URL: https://github.com/apache/airflow/pull/16189#issuecomment-851867267


   The PR is likely OK to be merged with just a subset of tests for the default Python and database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (3f0d4e8 -> 54019e8)

2021-05-31 Thread msumit
This is an automated email from the ASF dual-hosted git repository.

msumit pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 3f0d4e8  Fix docs for ``dag_concurrency`` (#16177)
 add 54019e8  Update the Python client version (#16191)

No new revisions were added by this update.

Summary of changes:
 clients/gen/python.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[GitHub] [airflow] msumit merged pull request #16191: Update the Python client version

2021-05-31 Thread GitBox


msumit merged pull request #16191:
URL: https://github.com/apache/airflow/pull/16191


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #16169: Mark `test_send_tasks_to_celery_hang` as quarantined

2021-05-31 Thread GitBox


potiuk commented on pull request #16169:
URL: https://github.com/apache/airflow/pull/16169#issuecomment-851854371


   I assigned both of us to that for now. I will take a look and will keep you 
posted on it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #16169: Mark `test_send_tasks_to_celery_hang` as quarantined

2021-05-31 Thread GitBox


potiuk commented on pull request #16169:
URL: https://github.com/apache/airflow/pull/16169#issuecomment-851853976


   I think we need to stabilize the CI self-hosted runners a bit - we need to switch (very soon) to GCP-hosted ones, and there are currently some instabilities due to the way auto-scaling is run, which we will fix. Not sure if this is related, but I think I only saw it in a few cases. The quarantined tests will run with every master merge, so we can take a look if the situation re-occurs (and we can move the test out of quarantine then).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #16191: Update the Python client version

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #16191:
URL: https://github.com/apache/airflow/pull/16191#issuecomment-851852415


   The PR is likely ready to be merged. No tests are needed, as no important environment files or Python files were modified by it. However, committers might decide that the full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642810142



##
File path: docs/apache-airflow/start/docker-compose.yaml
##
@@ -23,16 +23,25 @@
 # This configuration supports basic configuration using environment variables or an .env file
 # The following variables are supported:
 #
-# AIRFLOW_IMAGE_NAME         - Docker image name used to run Airflow.
-#                              Default: apache/airflow:|version|
-# AIRFLOW_UID                - User ID in Airflow containers
-#                              Default: 5
-# AIRFLOW_GID                - Group ID in Airflow containers
-#                              Default: 5
-# _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account.
-#                              Default: airflow
-# _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account.
-#                              Default: airflow
+# AIRFLOW_IMAGE_NAME         - Docker image name used to run Airflow.
+#                              Default: apache/airflow:|version|
+# AIRFLOW_UID                - User ID in Airflow containers
+#                              Default: 5
+# AIRFLOW_GID                - Group ID in Airflow containers
+#                              Default: 5
+#
+# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
+#
+# _AIRFLOW_WWW_USER_CREATE   - Whether to create administrator account.
+#                              Default: true
+# _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested).
+#                              Default: airflow
+# _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested).
+#                              Default: airflow
+# _AIRFLOW_DB_UPGRADE        - Whether to perform DB upgrade in the init container

Review comment:
   Yep. Removed it (also the "create user" one), which was the same.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#issuecomment-851851457


   > I wonder if we should add new sections to the Helm Chart documentation to better promote this feature. What do you think about creating a new page, e.g. an FAQ, and adding the answer to the question "How to install extra pip packages?"? I mainly think of users migrating from alternative Helm Charts, as [`airflow-helm/airflow`](https://artifacthub.io/packages/helm/airflow-helm/airflow#how-to-install-extra-pip-packages) and [`bitnami/airflow`](https://artifacthub.io/packages/helm/bitnami/airflow#install-extra-python-packages) have such a section.
   
   Good point @mik-laj.
   
   But rather than creating a new section, I extended two sections in the helm chart docs:
   
   * production deployment - I mentioned typical scenarios where you need a custom image and referred to "Build Images" for details
   
   * quick-start with kind - I copied a few typical examples (adding apt/PyPI packages) next to "Adding DAGs", with step-by-step instructions on how to build the images. This should be a gentle introduction to image building for users who do not know how to do it, or do not even know that they could easily build and use their own image (plus a see-more reference to "Build Images").
   
   I think that hits the sweet spot between copy/pasting some parts of the documentation where users might need them and having a common source of examples.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] yuqian90 commented on pull request #16169: Mark `test_send_tasks_to_celery_hang` as quarantined

2021-05-31 Thread GitBox


yuqian90 commented on pull request #16169:
URL: https://github.com/apache/airflow/pull/16169#issuecomment-851849148


   > I added the link and copied the stack-trace: @yuqian90 in #16168. We can observe if it is happening in master builds (they always run on self-hosted infra) in the Quarantined job.
   
   Hi @potiuk, thanks for the info. Unfortunately I don't have a convenient way to reproduce the environment on my end. I tried running `test_send_tasks_to_celery_hang` thousands of times on the Debian, Ubuntu and macOS machines I have access to; it never hung. The test goes through the same code path that celery_executor would use if it were running in production. I'm wondering if celery_executor hangs in this environment too, with or without the fix in #15989.
   
   Worst case, is there a way to make `test_send_tasks_to_celery_hang` run only when it's not on the self-hosted runner?
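   
   For what it's worth, one possible shape for that worst-case option - a minimal sketch assuming the runner type is exposed to tests via an environment variable (the variable name here is made up, not something the CI necessarily sets):
   ```
   import os
   
   import pytest
   
   # Hypothetical: assumes CI exports something like CI_RUNNER_TYPE=self-hosted.
   ON_SELF_HOSTED = os.environ.get("CI_RUNNER_TYPE") == "self-hosted"
   
   
   @pytest.mark.skipif(ON_SELF_HOSTED, reason="hangs on self-hosted runners, see #16168")
   def test_send_tasks_to_celery_hang():
       ...
   ```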


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] pateash commented on pull request #15574: #12401 - Duplicating connections from UI

2021-05-31 Thread GitBox


pateash commented on pull request #15574:
URL: https://github.com/apache/airflow/pull/15574#issuecomment-851846914


   > As others have said, we should have it as `_copy` not `_Copy`.
   > 
   > Additionally, it should be `copy1` not `copy(1)`, so it can be set via env vars if it _has_ to be. (You can't put `()` in env var names, so `AIRFLOW_CONN_FOO_COPY(1)` would not be possible.)
   
   done
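   
   For reference, a small sketch of the agreed naming scheme (a hypothetical helper, not the PR's actual code) - `foo_copy1`-style suffixes keep the generated id usable as an `AIRFLOW_CONN_*` environment variable, which `copy(1)` would not:
   ```
   def next_copy_id(existing_ids, conn_id):
       # Pick the first free <conn_id>_copy<n>; only [A-Za-z0-9_] characters,
       # so AIRFLOW_CONN_<ID.upper()> remains a valid env var name.
       n = 1
       while f"{conn_id}_copy{n}" in existing_ids:
           n += 1
       return f"{conn_id}_copy{n}"
   
   
   assert next_copy_id({"foo", "foo_copy1"}, "foo") == "foo_copy2"
   # -> settable as AIRFLOW_CONN_FOO_COPY2=postgres://... ; no "()" involved.
   ```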


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] msumit opened a new pull request #16191: Update the Python client version

2021-05-31 Thread GitBox


msumit opened a new pull request #16191:
URL: https://github.com/apache/airflow/pull/16191


   We've released version 2.1.0 of the client, so updating it here. 
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


potiuk commented on issue #16185:
URL: https://github.com/apache/airflow/issues/16185#issuecomment-851827998


   Take a look at ``CeleryKubernetesExecutor``. This one would require a little more intrinsic knowledge of Airflow - like integration with the Helm Chart etc. - so maybe marking it as a ``good first issue`` was not the best idea :) . Just take a look at the C-K executor and see if this is something you would like to take on.
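   
   For orientation, the core idea of ``CeleryKubernetesExecutor`` is queue-based delegation between two wrapped executors; here is a framework-free sketch of that idea (simplified names, not actual Airflow classes or interfaces):
   ```
   class LocalKubernetesExecutor:
       """Toy sketch: route each task to one of two executors by queue name."""
   
       KUBERNETES_QUEUE = "kubernetes"
   
       def __init__(self, local_executor, kubernetes_executor):
           self.local = local_executor
           self.kubernetes = kubernetes_executor
   
       def _router(self, queue):
           # Tasks explicitly bound to the "kubernetes" queue go to the
           # KubernetesExecutor; everything else runs via the LocalExecutor.
           return self.kubernetes if queue == self.KUBERNETES_QUEUE else self.local
   
       def queue_command(self, task_instance, command, queue=None):
           self._router(queue).queue_command(task_instance, command, queue=queue)
   ```
   The real work in such an issue is everything around this routing: executor lifecycle (start/sync/end), configuration, and the Helm Chart integration mentioned above.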


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] hafid-d commented on issue #16107: Airflow backfilling can't be disabled

2021-05-31 Thread GitBox


hafid-d commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-851827881


   @motherhubbard I tried using 2.1.0 but I still have the issue :-/ 
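   
   For reference, the usual knob for this is DAG-level catchup - a minimal sketch (works on 2.1.0 as far as I know; whether it covers this particular report is exactly what the issue is about):
   ```
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.dummy import DummyOperator
   
   # catchup=False asks the scheduler to create runs only from "now" on,
   # instead of backfilling every interval since start_date.
   with DAG(
       dag_id="no_backfill_example",
       start_date=datetime(2021, 1, 1),
       schedule_interval="@daily",
       catchup=False,
   ) as dag:
       DummyOperator(task_id="noop")
   ```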


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jhtimmins commented on pull request #16190: call resource based fab methods.

2021-05-31 Thread GitBox


jhtimmins commented on pull request #16190:
URL: https://github.com/apache/airflow/pull/16190#issuecomment-851800644


   @ashb @jedcunningham @kaxil If one of y'all can take a look at this, that would be great. It can be merged whenever it passes.
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jhtimmins opened a new pull request #16190: call resource based fab methods.

2021-05-31 Thread GitBox


jhtimmins opened a new pull request #16190:
URL: https://github.com/apache/airflow/pull/16190


   The previous PR (#16077) added wrappers around the default FAB methods that use the updated resource/action names. This PR updates `security.py` to use those new methods.
   
   Successor to #16077 and the next step in #15398.
   
   This PR can be merged whenever it passes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] sawaca96 commented on issue #15572: import error

2021-05-31 Thread GitBox


sawaca96 commented on issue #15572:
URL: https://github.com/apache/airflow/issues/15572#issuecomment-851786756


   @JavierLopezT 
   I'm not sure this is working:
   ```
   version: "3"
   services:
     airflow:
       image: apache/airflow
       environment:
         PYTHONPATH: /opt/airflow/dags/repo
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] TAKEDA-Takashi opened a new pull request #16189: Fix S3 Select payload join

2021-05-31 Thread GitBox


TAKEDA-Takashi opened a new pull request #16189:
URL: https://github.com/apache/airflow/pull/16189


   The original code has a potential bug where decoding fails if the payload is split in the middle of a multibyte character.
   This can be avoided by joining the payload first and then decoding.
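   
   A minimal illustration of the failure mode (a standalone sketch, not the provider's actual code):
   ```
   payload = "héllo".encode("utf-8")   # b'h\xc3\xa9llo'
   chunks = [payload[:2], payload[2:]]  # split inside the 2-byte "é"
   
   # Joining the raw bytes first, then decoding once, always works:
   assert b"".join(chunks).decode("utf-8") == "héllo"
   
   # Decoding chunk by chunk fails when the split lands mid-character:
   try:
       "".join(c.decode("utf-8") for c in chunks)
   except UnicodeDecodeError as err:
       print("per-chunk decode failed:", err)
   ```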
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] hbrls commented on a change in pull request #14535: Fix: Mysql 5.7 id utf8mb3

2021-05-31 Thread GitBox


hbrls commented on a change in pull request #14535:
URL: https://github.com/apache/airflow/pull/14535#discussion_r642737509



##
File path: airflow/migrations/versions/e3a246e0dc1_current_schema.py
##
@@ -38,6 +38,8 @@
 depends_on = None
 
 
+print(COLLATION_ARGS)

Review comment:
   Yes, that's for debugging. I will revert the line.

##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -110,7 +110,8 @@ def upgrade():
 bop.drop_index('idx_xcom_dag_task_date')
 # mssql doesn't allow primary keys with nullable columns
 if conn.dialect.name != 'mssql':
-bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 'key', 
'execution_date'])
+#bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 
'key', 'execution_date'])
+bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 'key'])

Review comment:
   My MySQL throws an error when creating a PK longer than 700 bytes. I have no idea how to fix it properly.
   
   But since it's not related to utf8mb3, I will revert this line.
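   
   For context, a rough back-of-the-envelope for why the composite PK can blow past MySQL's index-key limits (assuming utf8mb3 at 3 bytes per character and the classic 767-byte InnoDB limit for COMPACT/REDUNDANT row formats; column sizes are illustrative of the xcom schema, dag_id/task_id VARCHAR(250) and key VARCHAR(512)):
   ```
   BYTES_PER_CHAR = 3               # utf8mb3 stores up to 3 bytes per character
   varchar_chars = 250 + 250 + 512  # dag_id + task_id + key
   key_bytes = varchar_chars * BYTES_PER_CHAR  # 3036, before the timestamp
   print(key_bytes)  # far beyond 767, and near even the 3072-byte DYNAMIC cap
   ```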




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (eb85c9d -> 3f0d4e8)

2021-05-31 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from eb85c9d  Replace deprecated ``dag.sub_dag`` with 
``dag.partial_subset`` (#16179)
 add 3f0d4e8  Fix docs for ``dag_concurrency`` (#16177)

No new revisions were added by this update.

Summary of changes:
 airflow/config_templates/config.yml  | 6 ++
 airflow/config_templates/default_airflow.cfg | 6 ++
 airflow/models/dag.py| 2 +-
 3 files changed, 5 insertions(+), 9 deletions(-)


[GitHub] [airflow] kaxil merged pull request #16177: Fix docs for ``dag_concurrency``

2021-05-31 Thread GitBox


kaxil merged pull request #16177:
URL: https://github.com/apache/airflow/pull/16177


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #16177: Fix docs for ``dag_concurrency``

2021-05-31 Thread GitBox


kaxil commented on a change in pull request #16177:
URL: https://github.com/apache/airflow/pull/16177#discussion_r642734741



##
File path: airflow/config_templates/config.yml
##
@@ -167,10 +167,8 @@
       default: "32"
     - name: dag_concurrency
       description: |
-        The maximum number of task instances allowed to run concurrently in each DAG. To calculate
-        the number of tasks that is running concurrently for a DAG, add up the number of running
-        tasks for all DAG runs of the DAG. This is configurable at the DAG level with ``concurrency``,
-        which is defaulted as ``dag_concurrency``.
+        The maximum number of task instances allowed to run concurrently in each DAG Run for that

Review comment:
   That line comes after the `task_concurrency` check - are you sure you haven't set `task_concurrency`? I am on leave on June 1 and will be back on June 2 -- we can take a look at your DAG together:
   
   
https://github.com/apache/airflow/blob/eb85c9d191e452aab596333b9e82d3a9e6428542/airflow/jobs/scheduler_job.py#L1047-L1058




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (352fefa -> eb85c9d)

2021-05-31 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 352fefa  Uses bind volume instead of docker volume for MSSQL docker in 
tmpfs (#16159)
 add eb85c9d  Replace deprecated ``dag.sub_dag`` with 
``dag.partial_subset`` (#16179)

No new revisions were added by this update.

Summary of changes:
 airflow/models/dag.py  |  4 ++--
 airflow/www/views.py   | 16 
 tests/jobs/test_backfill_job.py|  6 --
 tests/models/test_dag.py   |  2 +-
 tests/sensors/test_external_task_sensor.py |  2 +-
 tests/utils/test_task_group.py |  4 ++--
 6 files changed, 18 insertions(+), 16 deletions(-)


[GitHub] [airflow] kaxil merged pull request #16179: Replace deprecated ``dag.sub_dag`` with ``dag.partial_subset``

2021-05-31 Thread GitBox


kaxil merged pull request #16179:
URL: https://github.com/apache/airflow/pull/16179


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] acmh commented on issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


acmh commented on issue #16185:
URL: https://github.com/apache/airflow/issues/16185#issuecomment-851749078


   @potiuk Can I grab this issue? If yes, where can I start looking?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


mik-laj commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642714658



##
File path: docs/apache-airflow/start/docker-compose.yaml
##
@@ -23,16 +23,25 @@
 # This configuration supports basic configuration using environment variables or an .env file
 # The following variables are supported:
 #
-# AIRFLOW_IMAGE_NAME         - Docker image name used to run Airflow.
-#                              Default: apache/airflow:|version|
-# AIRFLOW_UID                - User ID in Airflow containers
-#                              Default: 5
-# AIRFLOW_GID                - Group ID in Airflow containers
-#                              Default: 5
-# _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account.
-#                              Default: airflow
-# _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account.
-#                              Default: airflow
+# AIRFLOW_IMAGE_NAME         - Docker image name used to run Airflow.
+#                              Default: apache/airflow:|version|
+# AIRFLOW_UID                - User ID in Airflow containers
+#                              Default: 5
+# AIRFLOW_GID                - Group ID in Airflow containers
+#                              Default: 5
+#
+# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
+#
+# _AIRFLOW_WWW_USER_CREATE   - Whether to create administrator account.
+#                              Default: true
+# _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested).
+#                              Default: airflow
+# _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested).
+#                              Default: airflow
+# _AIRFLOW_DB_UPGRADE        - Whether to perform DB upgrade in the init container

Review comment:
   This configuration is not customizable by a docker-compose environment variable. It is a docker image variable, and it is hardcoded to `true`. See: https://github.com/apache/airflow/blob/fb982e809aefde68de8aaae4a6d69edd960f/docs/apache-airflow/start/docker-compose.yaml#L152




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


mik-laj commented on pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#issuecomment-851732846


   I wonder if we should add new sections to the Helm Chart documentation to better promote this feature. What do you think about creating a new page, e.g. an FAQ, and adding the answer to the question "How to install extra pip packages?"? I mainly think of users migrating from alternative Helm Charts, as [`airflow-helm/airflow`](https://artifacthub.io/packages/helm/airflow-helm/airflow#how-to-install-extra-pip-packages) and [`bitnami/airflow`](https://artifacthub.io/packages/helm/bitnami/airflow#install-extra-python-packages) have such a section.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] closed issue #12959: Skipped tasks deadlock when cleared

2021-05-31 Thread GitBox


github-actions[bot] closed issue #12959:
URL: https://github.com/apache/airflow/issues/12959


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #12822: Removing redundant calls to session.commit()

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #12822:
URL: https://github.com/apache/airflow/pull/12822#issuecomment-851727555


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] closed issue #12545: Python3.8 and botocore install issue for docutils

2021-05-31 Thread GitBox


github-actions[bot] closed issue #12545:
URL: https://github.com/apache/airflow/issues/12545


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on issue #12545: Python3.8 and botocore install issue for docutils

2021-05-31 Thread GitBox


github-actions[bot] commented on issue #12545:
URL: https://github.com/apache/airflow/issues/12545#issuecomment-851727560


   This issue has been closed because it has not received a response from the issue author.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #13405: KubernetesPodOperator Guide

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #13405:
URL: https://github.com/apache/airflow/pull/13405#issuecomment-851727531


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on issue #12959: Skipped tasks deadlock when cleared

2021-05-31 Thread GitBox


github-actions[bot] commented on issue #12959:
URL: https://github.com/apache/airflow/issues/12959#issuecomment-851727541


   This issue has been closed because it has not received a response from the issue author.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on a change in pull request #16182: Do not queue tasks when the DAG file goes missing

2021-05-31 Thread GitBox


ephraimbuddy commented on a change in pull request #16182:
URL: https://github.com/apache/airflow/pull/16182#discussion_r642684378



##
File path: airflow/jobs/scheduler_job.py
##
@@ -901,6 +901,32 @@ def __get_concurrency_maps(
             task_map[(dag_id, task_id)] = count
         return dag_map, task_map
 
+    # pylint: disable=too-many-nested-blocks
+    def _remove_tis_with_missing_dag(self, task_instances, session=None):
+        """
+        Fail task instances and the corresponding DagRun if the dag can't be found in
+        the dags folder but exists in SerializedDag table.
+        Return task instances that exists in SerializedDag table as well as dags folder.
+        If the dag can't be found in DagBag, just return the task instance. This is common in
+        unittest where subdir is os.devnull
+        """
+        tis = []
+        for ti in task_instances:
+            try:
+                dag = self.dagbag.get_dag(ti.dag_id, session=session)
+                if os.path.exists(dag.fileloc):
+                    tis.append(ti)
+                else:
+                    dagrun = dag.get_dagrun(execution_date=ti.execution_date, session=session)
+                    if ti.state not in State.finished:
+                        ti.set_state(State.FAILED, session=session)
+                        self.log.error("Failing task: %s because DAG: %s is missing", ti.task_id, ti.dag_id)
+                    if dagrun.state not in State.finished:
+                        dagrun.set_state(State.FAILED)
+            except SerializedDagNotFound:
+                tis.append(ti)
+        return tis

Review comment:
   @uranusjr Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on a change in pull request #16182: Do not queue tasks when the DAG file goes missing

2021-05-31 Thread GitBox


ephraimbuddy commented on a change in pull request #16182:
URL: https://github.com/apache/airflow/pull/16182#discussion_r642681719



##
File path: airflow/jobs/scheduler_job.py
##
@@ -901,6 +901,32 @@ def __get_concurrency_maps(
             task_map[(dag_id, task_id)] = count
         return dag_map, task_map
 
+    # pylint: disable=too-many-nested-blocks
+    def _remove_tis_with_missing_dag(self, task_instances, session=None):
+        """
+        Fail task instances and the corresponding DagRun if the dag can't be found in
+        the dags folder but exists in SerializedDag table.
+        Return task instances that exists in SerializedDag table as well as dags folder.
+        If the dag can't be found in DagBag, just return the task instance. This is common in
+        unittest where subdir is os.devnull
+        """
+        tis = []
+        for ti in task_instances:
+            try:
+                dag = self.dagbag.get_dag(ti.dag_id, session=session)
+                if os.path.exists(dag.fileloc):
+                    tis.append(ti)
+                else:
+                    dagrun = dag.get_dagrun(execution_date=ti.execution_date, session=session)
+                    if ti.state not in State.finished:
+                        ti.set_state(State.FAILED, session=session)
+                        self.log.error("Failing task: %s because DAG: %s is missing", ti.task_id, ti.dag_id)
+                    if dagrun.state not in State.finished:
+                        dagrun.set_state(State.FAILED)
+            except SerializedDagNotFound:
+                tis.append(ti)
+        return tis

Review comment:
   Ok. I will try that.
   The `SerializedDagNotFound` is raised at `dag = self.dagbag.get_dag(ti.dag_id, session=session)`.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on issue #16138: doc_md code block collapsing lines

2021-05-31 Thread GitBox


uranusjr commented on issue #16138:
URL: https://github.com/apache/airflow/issues/16138#issuecomment-851688673


   The cause is that [the Markdown-HTML rendering function](https://github.com/apache/airflow/blob/9c8391a13f6ba29749675cf23f2f874f96b0cc8c/airflow/www/utils.py#L343-L350) uses `lstrip()`, so all leading indentation is simply gone. It should be modified to use a more sophisticated method, such as `textwrap.dedent()` (I didn’t think through entirely whether switching to that function would just work).
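   
   A standalone illustration of the difference (a sketch of the described behaviour, not the rendering code itself):
   ```
   import textwrap
   
   doc_md = """\
       Some text
   
           indented markdown code block
   """
   
   # Stripping every line kills the extra indentation that marks the
   # code block, so Markdown renders it as ordinary collapsed text:
   print("\n".join(line.lstrip() for line in doc_md.splitlines()))
   
   # textwrap.dedent() removes only the indentation common to all lines,
   # keeping the 4 extra spaces of the code block intact:
   print(textwrap.dedent(doc_md))
   ```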


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #16182: Do not queue tasks when the DAG file goes missing

2021-05-31 Thread GitBox


uranusjr commented on a change in pull request #16182:
URL: https://github.com/apache/airflow/pull/16182#discussion_r642676994



##
File path: airflow/jobs/scheduler_job.py
##
@@ -901,6 +901,32 @@ def __get_concurrency_maps(
             task_map[(dag_id, task_id)] = count
         return dag_map, task_map
 
+    # pylint: disable=too-many-nested-blocks
+    def _remove_tis_with_missing_dag(self, task_instances, session=None):
+        """
+        Fail task instances and the corresponding DagRun if the dag can't be found in
+        the dags folder but exists in SerializedDag table.
+        Return task instances that exists in SerializedDag table as well as dags folder.
+        If the dag can't be found in DagBag, just return the task instance. This is common in
+        unittest where subdir is os.devnull
+        """
+        tis = []
+        for ti in task_instances:
+            try:
+                dag = self.dagbag.get_dag(ti.dag_id, session=session)
+                if os.path.exists(dag.fileloc):
+                    tis.append(ti)
+                else:
+                    dagrun = dag.get_dagrun(execution_date=ti.execution_date, session=session)
+                    if ti.state not in State.finished:
+                        ti.set_state(State.FAILED, session=session)
+                        self.log.error("Failing task: %s because DAG: %s is missing", ti.task_id, ti.dag_id)
+                    if dagrun.state not in State.finished:
+                        dagrun.set_state(State.FAILED)
+            except SerializedDagNotFound:
+                tis.append(ti)
+        return tis

Review comment:
   Pylint is sort of right here; the function could benefit from some refactoring with `continue` and a more fine-grained try-except (when is `SerializedDagNotFound` raised?).
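   
   Something along these lines, perhaps - a hedged sketch of that shape (narrow try-except around the one call that raises, early `continue` instead of nesting; it assumes the surrounding scheduler_job context for `os`, `State` and `SerializedDagNotFound`):
   ```
       def _remove_tis_with_missing_dag(self, task_instances, session=None):
           tis = []
           for ti in task_instances:
               try:
                   dag = self.dagbag.get_dag(ti.dag_id, session=session)
               except SerializedDagNotFound:
                   # Common in unit tests where subdir is os.devnull.
                   tis.append(ti)
                   continue
               if os.path.exists(dag.fileloc):
                   tis.append(ti)
                   continue
               # DAG file is gone: fail the TI and its DagRun instead of queueing.
               if ti.state not in State.finished:
                   ti.set_state(State.FAILED, session=session)
                   self.log.error("Failing task: %s because DAG: %s is missing", ti.task_id, ti.dag_id)
               dagrun = dag.get_dagrun(execution_date=ti.execution_date, session=session)
               if dagrun.state not in State.finished:
                   dagrun.set_state(State.FAILED)
           return tis
   ```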




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] stephsamson closed issue #16176: Quickstart Helm Chart fails post-install

2021-05-31 Thread GitBox


stephsamson closed issue #16176:
URL: https://github.com/apache/airflow/issues/16176


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] stephsamson commented on issue #16176: Quickstart Helm Chart fails post-install

2021-05-31 Thread GitBox


stephsamson commented on issue #16176:
URL: https://github.com/apache/airflow/issues/16176#issuecomment-851676236


   @ephraimbuddy thanks that worked!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #15965: sqlite3.OperationalError: no such table: dag

2021-05-31 Thread GitBox


mik-laj commented on issue #15965:
URL: https://github.com/apache/airflow/issues/15965#issuecomment-851673507


   We had a race condition, but I fixed the problem, so we are safe now: https://github.com/apache/airflow/pull/16180


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on issue #16176: Quickstart Helm Chart fails post-install

2021-05-31 Thread GitBox


ephraimbuddy commented on issue #16176:
URL: https://github.com/apache/airflow/issues/16176#issuecomment-851668403


   @stephsamson can you delete the namespace and recreate it, then run `helm repo update` before installing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] stephsamson commented on issue #16176: Quickstart Helm Chart fails post-install

2021-05-31 Thread GitBox


stephsamson commented on issue #16176:
URL: https://github.com/apache/airflow/issues/16176#issuecomment-851650204


   > Can you provide logs of `airflow-run-airflow-migrations` job please
   
   ```
   ❯ kubectl logs -n airflow airflow-run-airflow-migrations-hw9lz
   BACKEND=postgresql
   DB_HOST=airflow-postgresql.airflow
   DB_PORT=5432
   
   DB: postgresql://postgres:***@airflow-postgresql.airflow:5432/postgres?sslmode=disable
   [2021-05-31 19:39:05,756] {db.py:684} INFO - Creating tables
   INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
   INFO  [alembic.runtime.migration] Will assume transactional DDL.
   WARNI [airflow.providers_manager] Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
   WARNI [airflow.providers_manager] Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/base.py", line 171, in _catch_revision_errors
       yield
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/base.py", line 365, in _upgrade_revs
       revs = list(revs)
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 904, in _iterate_revisions
       requested_lowers = self.get_revisions(lower)
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 455, in get_revisions
       return sum([self.get_revisions(id_elem) for id_elem in id_], ())
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 455, in <listcomp>
       return sum([self.get_revisions(id_elem) for id_elem in id_], ())
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 460, in get_revisions
       for rev_id in resolved_id
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 460, in <genexpr>
       for rev_id in resolved_id
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/revision.py", line 536, in _revision_for_ident
       resolved_id,
   alembic.script.revision.ResolutionError: No such revision or branch 'a13f7613ad25'
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/home/airflow/.local/bin/airflow", line 8, in <module>
       sys.exit(main())
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__main__.py", line 40, in main
       args.func(args)
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 89, in wrapper
       return f(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/db_command.py", line 48, in upgradedb
       db.upgradedb()
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/db.py", line 694, in upgradedb
       command.upgrade(config, 'heads')
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/command.py", line 294, in upgrade
       script.run_env()
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/base.py", line 490, in run_env
       util.load_python_file(self.dir, "env.py")
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 97, in load_python_file
       module = load_module_py(module_id, path)
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/util/compat.py", line 182, in load_module_py
       spec.loader.exec_module(module)
     File "<frozen importlib._bootstrap_external>", line 678, in exec_module
     File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/migrations/env.py", line 108, in <module>
       run_migrations_online()
     File "/home/airflow/.local/lib/python3.6/site-packages/airflow/migrations/env.py", line 102, in run_migrations_online
       context.run_migrations()
     File "<string>", line 8, in run_migrations
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/runtime/environment.py", line 813, in run_migrations
       self.get_context().run_migrations(**kw)
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/runtime/migration.py", line 548, in run_migrations
       for step in self._migrations_fn(heads, self):
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/command.py", line 283, in upgrade
       return script._upgrade_revs(revision, rev)
     File "/home/airflow/.local/lib/python3.6/site-packages/alembic/script/base.py", li

[GitHub] [airflow] ephraimbuddy commented on a change in pull request #16177: Fix docs for ``dag_concurrency``

2021-05-31 Thread GitBox


ephraimbuddy commented on a change in pull request #16177:
URL: https://github.com/apache/airflow/pull/16177#discussion_r642650643



##
File path: airflow/config_templates/config.yml
##
@@ -167,10 +167,8 @@
       default: "32"
     - name: dag_concurrency
       description: |
-        The maximum number of task instances allowed to run concurrently in each DAG. To calculate
-        the number of tasks that is running concurrently for a DAG, add up the number of running
-        tasks for all DAG runs of the DAG. This is configurable at the DAG level with ``concurrency``,
-        which is defaulted as ``dag_concurrency``.
+        The maximum number of task instances allowed to run concurrently in each DAG Run for that

Review comment:
   ```suggestion
        The maximum number of task instances allowed to run concurrently across the DAG Runs for that
   ```
   It appears the behaviour is across the DagRuns for that DAG. Wrong implementation? I checked by setting dag_concurrency to 5 and triggering the dag multiple times. Only 5 tasks run across the DagRuns at a time, and the log message says DAG's `task concurrency` (a bit confusing):
   ```
   [2021-05-30 06:28:06,727] {scheduler_job.py:1025} INFO - DAG example_bash_operator has 5/5 running and queued tasks
   [2021-05-30 06:28:06,727] {scheduler_job.py:1033} INFO - Not executing  since the number of tasks running or queued from DAG example_bash_operator is >= to the DAG's task concurrency limit of 5
   ```
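   
   For anyone reproducing this, a minimal sketch of the experiment (a hypothetical DAG using the DAG-level ``concurrency`` argument, which defaults to ``dag_concurrency``):
   ```
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   
   # With concurrency=5, at most 5 of these task instances run at once,
   # summed over *all* active DagRuns of this DAG - matching the log above.
   with DAG(
       dag_id="concurrency_probe",
       start_date=datetime(2021, 1, 1),
       schedule_interval=None,
       concurrency=5,
   ) as dag:
       for i in range(16):
           BashOperator(task_id=f"sleeper_{i}", bash_command="sleep 60")
   ```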
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Alxander64 commented on issue #16125: KubernetesPodOperator env_vars Jinja Templates Not Used

2021-05-31 Thread GitBox


Alxander64 commented on issue #16125:
URL: https://github.com/apache/airflow/issues/16125#issuecomment-851635188


   @jedcunningham Thanks, I will be sure to try this and report back!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil closed pull request #16179: Replace deprecated ``dag.sub_dag`` with ``dag.partial_subset``

2021-05-31 Thread GitBox


kaxil closed pull request #16179:
URL: https://github.com/apache/airflow/pull/16179


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil closed pull request #16177: Fix docs for ``dag_concurrency``

2021-05-31 Thread GitBox


kaxil closed pull request #16177:
URL: https://github.com/apache/airflow/pull/16177


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642623460



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with the following command:
 The commands above perform initialization of the SQLite database, create admin user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port and finally start the webserver.
 
-Waits for celery broker connection
-----------------------------------
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, ``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB connection is available.
-
-The script detects backend type depending on the URL schema and assigns default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps :envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``    - default port 6379
-* ``postgres://`` - default port 5432
-* ``mysql://``    - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables :envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If :envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The :envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..................................
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying ``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed additionally when entering
+the containers. Note that this option slows down starting of Airflow as every time any container starts
+it must install new packages. Therefore this option should only be used for testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
-------------------
+Not all dependencies can be installed this way. Dependencies that require compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies that require compilation. 

Review comment:
   Indeed, thanks for pointing that out. I misunderstood the last point, where they were talking about ``docker clusters`` when they really meant ``docker registries``. In that case it is not as useful as I assumed for the build scenario, so I removed it entirely. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642623460



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with 
the following command:
 The commands above perform initialization of the SQLite database, create admin 
user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port 
and finally start the webserver.
 
-Waits for celery broker connection
---
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, 
``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB 
connection is available.
-
-The script detects backend type depending on the URL schema and assigns 
default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be 
established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps 
:envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``   - default port 6379
-* ``postgres://``- default port 5432
-* ``mysql://``   - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables 
:envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result 
of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying 
``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed 
additionally when entering
+the containers. Note that this option slows down starting of Airflow as every 
time any container starts
+it must install new packages. Therefore this option should only be used for 
testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
---
+Not all dependencies can be installed this way. Dependencies that require 
compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation 
problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies 
that require compilation. 

Review comment:
   Indeed, thanks for pointing out. I misunderstood the last point where 
they were talking about ``docker clusters`` where they really meant ``docker 
registries``. In such case it is not as useful as I assumed for the build 
scenario, so I removed it entirely. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Uses bind volume instead of docker volume for MSSQL docker in tmpfs (#16159)

2021-05-31 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 352fefa  Uses bind volume instead of docker volume for MSSQL docker in 
tmpfs (#16159)
352fefa is described below

commit 352fefaef1712bf5e60e3e79a86214279be69e16
Author: Jarek Potiuk 
AuthorDate: Mon May 31 19:46:59 2021 +0200

Uses bind volume instead of docker volume for MSSQL docker in tmpfs (#16159)

Seems that MSSQL is not able to use data volume when it is mounted
from tmpfs filesystem. See 
https://github.com/microsoft/mssql-docker/issues/13

In such case, instead of mounting docker-created volume we mount
a volume mounted from home directory of the user which is unlikely
to be a tmpfs volume.
---
 breeze | 23 ++
 ...end-mssql.yml => backend-mssql-bind-volume.yml} | 28 ++
 ...d-mssql.yml => backend-mssql-docker-volume.yml} | 20 
 scripts/ci/docker-compose/backend-mssql.yml|  2 --
 scripts/ci/libraries/_initialization.sh| 14 ++-
 .../ci_run_single_airflow_test_in_docker.sh| 26 +++-
 6 files changed, 68 insertions(+), 45 deletions(-)

diff --git a/breeze b/breeze
index a257ee7..792e7cf 100755
--- a/breeze
+++ b/breeze
@@ -673,6 +673,22 @@ function breeze::prepare_command_files() {
 local 
remove_sources_docker_compose_file=${SCRIPTS_CI_DIR}/docker-compose/remove-sources.yml
 local 
forward_credentials_docker_compose_file=${SCRIPTS_CI_DIR}/docker-compose/forward-credentials.yml
 
+if [[ ${BACKEND} == "mssql" ]]; then
+local docker_filesystem
+docker_filesystem=$(stat "-f" "-c" "%T" /var/lib/docker || echo 
"unknown")
+if [[ ${docker_filesystem} == "tmpfs" ]]; then
+# In case of tmpfs backend for docker, mssql fails because TMPFS 
does not support
+# O_DIRECT parameter for direct writing to the filesystem
+# https://github.com/microsoft/mssql-docker/issues/13
+# so we need to mount an external volume for its db location
+# specified by MSSQL_DATA_VOLUME
+
backend_docker_compose_file="${backend_docker_compose_file}:${SCRIPTS_CI_DIR}/docker-compose/backend-mssql-bind-volume.yml"
+else
+
backend_docker_compose_file="${backend_docker_compose_file}:${SCRIPTS_CI_DIR}/docker-compose/backend-mssql-docker-volume.yml"
+fi
+fi
+
+
 local 
compose_ci_file=${main_ci_docker_compose_file}:${backend_docker_compose_file}:${files_docker_compose_file}
 local 
compose_prod_file=${main_prod_docker_compose_file}:${backend_docker_compose_file}:${files_docker_compose_file}
 
@@ -1422,6 +1438,13 @@ function breeze::parse_arguments() {
 INTEGRATIONS+=("${INTEGRATION}")
 fi
 done
+# In case of tmpfs backend for docker, mssql fails because TMPFS 
does not support
+# O_DIRECT parameter for direct writing to the filesystem
+# https://github.com/microsoft/mssql-docker/issues/13
+# so we need to mount an external volume for its db location
+# the external db must allow for parallel testing so external 
volume is mapped
+# to the data volume. Stop should also clean the volume
+rm -rf "${MSSQL_DATA_VOLUME:?"MSSQL_DATA_VOLUME should never be 
empty!"}"/*
 shift
 ;;
 restart)
diff --git a/scripts/ci/docker-compose/backend-mssql.yml 
b/scripts/ci/docker-compose/backend-mssql-bind-volume.yml
similarity index 50%
copy from scripts/ci/docker-compose/backend-mssql.yml
copy to scripts/ci/docker-compose/backend-mssql-bind-volume.yml
index b4574ef..7c827a4 100644
--- a/scripts/ci/docker-compose/backend-mssql.yml
+++ b/scripts/ci/docker-compose/backend-mssql-bind-volume.yml
@@ -17,26 +17,12 @@
 ---
 version: "2.2"
 services:
-  airflow:
-environment:
-  - BACKEND=mssql
-  - 
AIRFLOW__CORE__SQL_ALCHEMY_CONN=mssql+pyodbc://sa:Airflow123@mssql:1433/master?driver=ODBC+Driver+17+for+SQL+Server
-  - 
AIRFLOW__CELERY__RESULT_BACKEND=db+mssql+pyodbc://sa:Airflow123@mssql:1433/master?driver=ODBC+Driver+17+for+SQL+Server
-  - AIRFLOW__CORE__EXECUTOR=LocalExecutor
-depends_on:
-  mssql:
-condition: service_healthy
   mssql:
-image: mcr.microsoft.com/mssql/server:${MSSQL_VERSION}
-environment:
-  - ACCEPT_EULA=Y
-  - SA_PASSWORD=Airflow123
 volumes:
-  - mssql-db-volume:/var/opt/mssql
-healthcheck:
-  test: ["CMD", "/opt/mssql-tools/bin/sqlcmd", "-S", "localhost",
- "-U", "sa", "-P", "Airflow123", "-Q", "SELECT 1"]
-  interval: 10s
-  timeout: 10s
-  retries: 10
-restart: always
+  # In case of tmpfs backend for docker, mssql fail

[GitHub] [airflow] potiuk merged pull request #16159: Uses bind volume instead of docker volume for MSSQL docker in tmpfs

2021-05-31 Thread GitBox


potiuk merged pull request #16159:
URL: https://github.com/apache/airflow/pull/16159


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#issuecomment-851612630


   I also renamed some chapters and copied the "Embedding DAG" section as yet 
another often-used simple case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642612961



##
File path: scripts/in_container/prod/entrypoint_prod.sh
##
@@ -311,6 +311,10 @@ if [[ -n "${_AIRFLOW_WWW_USER_CREATE=}" ]] ; then
 create_www_user
 fi
 
+if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
+pip install --no-cache-dir --user "${_PIP_ADDITIONAL_REQUIREMENTS=}"
+fi
+

Review comment:
   I think it was lost during rebase :(




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] MarkusTeufelberger commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


MarkusTeufelberger commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642612873



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with 
the following command:
 The commands above perform initialization of the SQLite database, create admin 
user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port 
and finally start the webserver.
 
-Waits for celery broker connection
---
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, 
``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB 
connection is available.
-
-The script detects backend type depending on the URL schema and assigns 
default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be 
established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps 
:envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``   - default port 6379
-* ``postgres://``- default port 5432
-* ``mysql://``   - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables 
:envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result 
of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying 
``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed 
additionally when entering
+the containers. Note that this option slows down starting of Airflow as every 
time any container starts
+it must install new packages. Therefore this option should only be used for 
testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
---
+Not all dependencies can be installed this way. Dependencies that require 
compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation 
problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies 
that require compilation. 

Review comment:
   Maybe you misunderstood the Talos case? They explicitly tell you to run 
the `registry:2` container which (unsurprisingly) _is_ a docker registry. This 
is NOT the local docker cache, it is a fully fledged implementation of a docker 
registry. All you do then is to push your images to this local registry and 
tell your cluster to pull from it in case the one in the big bad internet is 
not available (as would be the case for air-gapped systems). From the PoV of the 
cluster it is accessing an external registry, from the PoV of your machine it 
runs a container with a docker registry in it.
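
   A rough sketch of that flow (the image name and port are assumptions):

   ```bash
   # Run a fully fledged local registry, then push the locally built image to it.
   docker run -d -p 5000:5000 --name registry registry:2
   docker tag my-airflow:0.0.1 localhost:5000/my-airflow:0.0.1
   docker push localhost:5000/my-airflow:0.0.1
   # The cluster is then configured to pull localhost:5000/my-airflow:0.0.1
   # instead of pulling from a registry on the internet.
   ```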




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642611713



##
File path: scripts/in_container/prod/entrypoint_prod.sh
##
@@ -311,6 +311,10 @@ if [[ -n "${_AIRFLOW_WWW_USER_CREATE=}" ]] ; then
 create_www_user
 fi
 
+if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
+pip install --no-cache-dir --user "${_PIP_ADDITIONAL_REQUIREMENTS=}"
+fi
+

Review comment:
   Added! Thanks for reminding me @mik-laj !




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642609947



##
File path: scripts/in_container/prod/entrypoint_prod.sh
##
@@ -311,6 +311,10 @@ if [[ -n "${_AIRFLOW_WWW_USER_CREATE=}" ]] ; then
 create_www_user
 fi
 
+if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
+pip install --no-cache-dir --user "${_PIP_ADDITIONAL_REQUIREMENTS=}"
+fi
+

Review comment:
   Ah. I forgot about it.. Sorry. Adding it now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642609374



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with 
the following command:
 The commands above perform initialization of the SQLite database, create admin 
user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port 
and finally start the webserver.
 
-Waits for celery broker connection
---
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, 
``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB 
connection is available.
-
-The script detects backend type depending on the URL schema and assigns 
default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be 
established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps 
:envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``   - default port 6379
-* ``postgres://``- default port 5432
-* ``mysql://``   - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables 
:envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result 
of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying 
``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed 
additionally when entering
+the containers. Note that this option slows down starting of Airflow as every 
time any container starts
+it must install new packages. Therefore this option should only be used for 
testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
---
+Not all dependencies can be installed this way. Dependencies that require 
compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation 
problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies 
that require compilation. 

Review comment:
   I think this is a very useful case to mention - for `minikube` and 
`kind` users. The users need to be aware of the options they have in different 
situations. We have to remember that this documentation is for different kinds 
of users (for example in the same PR we added _PIP_ADDITIONAL_REQUIREMENTS for 
those kinds of users - which should never be considered production use).
   
   If we add this, there is no reason we should not add the other. 
   
   I modified it a bit and spelled out minikube and kind explicitly. I also 
thought a bit and added the Talos case as another registry-less way. It is 
really interesting how they implemented pass-through to the local docker 
cluster cache and I like it a lot - it's better than the `load` methods of kind 
and minikube - still providing registry-less usage of locally built images. It 
allows even faster iterations (not to mention the air-gapped use, which is super 
important for some of our users, as we've learned). It was cool to learn 
that.
   
   So finally we have four methods - each for different purpose and with 
different requirements/dependencies.
   
   ```
  * For ``docker-compose`` deployment, that's all you need. The image is 
stored in docker engine cache
and docker compose will use it from there.
   
  * For some - development targeted clusters - Kubernetes deployments you 
can load the images directly to
Kubernetes clusters. Clusters such as `kind` or `minikube` have 
dedicated ``load`` method to load the
images to the cluster.
   
 * In some cases (for example in `Talos 
`_)
   you can configure Kubernetes cluster to also use the local docker cache 
rather than remote registry - this is
   very similar to the Docker-Compose case and it is often used in air-gapped 
systems to provide
   Kubernetes cluster access to container images.
   
 * Last but not least - you can push your image to a remote reg
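
   As an illustration of the ``load`` methods mentioned above, a sketch for kind 
and minikube (the image name is an assumption; ``minikube image load`` requires 
a reasonably recent minikube):

   ```bash
   # Build locally, then load the image straight into a dev cluster,
   # skipping any registry.
   docker build . -t my-airflow:0.0.1
   kind load docker-image my-airflow:0.0.1    # kind clusters
   minikube image load my-airflow:0.0.1       # minikube clusters
   ```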

[GitHub] [airflow] mik-laj commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


mik-laj commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642607987



##
File path: scripts/in_container/prod/entrypoint_prod.sh
##
@@ -311,6 +311,10 @@ if [[ -n "${_AIRFLOW_WWW_USER_CREATE=}" ]] ; then
 create_www_user
 fi
 
+if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
+pip install --no-cache-dir --user "${_PIP_ADDITIONAL_REQUIREMENTS=}"
+fi
+

Review comment:
   What is the status of the warning? Should we add it or not?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] sglickman commented on issue #15572: import error

2021-05-31 Thread GitBox


sglickman commented on issue #15572:
URL: https://github.com/apache/airflow/issues/15572#issuecomment-851594955


   Not sure if this is exactly your issue, but I was seeing something like this 
on Kubernetes, where I got UI errors about modules not found which I knew were 
present and in $PYTHONPATH (I confirmed this by running `kubectl exec $POD_NAME 
-it sh` and opening a Python interactive session on the scheduler container).
   
   What ended up doing the trick for me was ensuring that the git-sync completed 
before the scheduler was brought up - that is, creating a git-sync 
init container with the `GIT_SYNC_ONE_TIME` flag. Hope this helps!
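
   A sketch of such an init container (image tag, repository URL and volume 
name are assumptions):

   ```bash
   # Fragment of a scheduler pod spec: git-sync runs once to completion
   # before the scheduler container starts, so DAGs are already in place.
   cat > git-sync-init.yaml <<'EOF'
   initContainers:
     - name: git-sync-init
       image: k8s.gcr.io/git-sync/git-sync:v3.3.4
       env:
         - name: GIT_SYNC_REPO
           value: "https://github.com/example/dags.git"
         - name: GIT_SYNC_ONE_TIME
           value: "true"
       volumeMounts:
         - name: dags
           mountPath: /tmp/git
   EOF
   ```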


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] MarkusTeufelberger commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


MarkusTeufelberger commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642596616



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with 
the following command:
 The commands above perform initialization of the SQLite database, create admin 
user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port 
and finally start the webserver.
 
-Waits for celery broker connection
---
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, 
``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB 
connection is available.
-
-The script detects backend type depending on the URL schema and assigns 
default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be 
established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps 
:envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``   - default port 6379
-* ``postgres://``- default port 5432
-* ``mysql://``   - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables 
:envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result 
of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying 
``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed 
additionally when entering
+the containers. Note that this option slows down starting of Airflow as every 
time any container starts
+it must install new packages. Therefore this option should only be used for 
testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
---
+Not all dependencies can be installed this way. Dependencies that require 
compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation 
problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies 
that require compilation. 

Review comment:
   "Loading directly to a cluster" is not a very typical feature of 
production clusters - as you saw in the Talos case it is more likely that there 
is a registry somewhere (typically Artifactory or similar) instead. I would 
remove that sentence and only mention registries.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #16170: Adding extra requirements for build and runtime of the PROD image.

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642579464



##
File path: docs/docker-stack/entrypoint.rst
##
@@ -185,66 +259,28 @@ database and creating an ``admin/admin`` Admin user with 
the following command:
 The commands above perform initialization of the SQLite database, create admin 
user with admin password
 and Admin role. They also forward local port ``8080`` to the webserver port 
and finally start the webserver.
 
-Waits for celery broker connection
---
-
-In case Postgres or MySQL DB is used, and one of the ``scheduler``, 
``celery``, ``worker``, or ``flower``
-commands are used the entrypoint will wait until the celery broker DB 
connection is available.
-
-The script detects backend type depending on the URL schema and assigns 
default port numbers if not specified
-in the URL. Then it loops until connection to the host/port specified can be 
established
-It tries :envvar:`CONNECTION_CHECK_MAX_COUNT` times and sleeps 
:envvar:`CONNECTION_CHECK_SLEEP_TIME` between checks.
-To disable check, set ``CONNECTION_CHECK_MAX_COUNT=0``.
-
-Supported schemes:
-
-* ``amqp(s)://``  (rabbitmq) - default port 5672
-* ``redis://``   - default port 6379
-* ``postgres://``- default port 5432
-* ``mysql://``   - default port 3306
-
-Waiting for connection involves checking if a matching port is open.
-The host information is derived from the variables 
:envvar:`AIRFLOW__CELERY__BROKER_URL` and
-:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD`. If 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-is passed to the container, it is evaluated as a command to execute and result 
of this evaluation is used
-as :envvar:`AIRFLOW__CELERY__BROKER_URL`. The 
:envvar:`AIRFLOW__CELERY__BROKER_URL_CMD` variable
-takes precedence over the :envvar:`AIRFLOW__CELERY__BROKER_URL` variable.
+Installing additional requirements
+..
 
-.. _entrypoint:commands:
+Installing additional requirements can be done by specifying 
``_PIP_ADDITIONAL_REQUIREMENTS`` variable.
+The variable should contain a list of requirements that should be installed 
additionally when entering
+the containers. Note that this option slows down starting of Airflow as every 
time any container starts
+it must install new packages. Therefore this option should only be used for 
testing. When testing is
+finished, you should create your custom image with dependencies baked in.
 
-Executing commands
---
+Not all dependencies can be installed this way. Dependencies that require 
compiling cannot be installed
+because they need ``build-essentials`` installed. In case you get compilation 
problem, you should revert
+to ``customizing image`` - this is the only good way to install dependencies 
that require compilation. 

Review comment:
   I've added both - cross-references as well as more context on why you 
want to build your images together with very short guide on how to do it and 
exposing two most common cases for building the image. I also mentioned 
`kaniko` and `podman` as alternatives to docker and explained the `load` method 
available to load the image. Based on the discussion in 
https://github.com/airflow-helm/charts/issues/211#issuecomment-851421093




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #16179: Replace deprecated ``dag.sub_dag`` with ``dag.partial_subset``

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #16179:
URL: https://github.com/apache/airflow/pull/16179#issuecomment-851570707


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #16177: Fix docs for ``dag_concurrency``

2021-05-31 Thread GitBox


github-actions[bot] commented on pull request #16177:
URL: https://github.com/apache/airflow/pull/16177#issuecomment-851570513


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest master at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #14535: Fix: Mysql 5.7 id utf8mb3

2021-05-31 Thread GitBox


potiuk commented on a change in pull request #14535:
URL: https://github.com/apache/airflow/pull/14535#discussion_r642560346



##
File path: airflow/migrations/versions/e3a246e0dc1_current_schema.py
##
@@ -38,6 +38,8 @@
 depends_on = None
 
 
+print(COLLATION_ARGS)

Review comment:
   I guess we should remove it?

##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -110,7 +110,8 @@ def upgrade():
 bop.drop_index('idx_xcom_dag_task_date')
 # mssql doesn't allow primary keys with nullable columns
 if conn.dialect.name != 'mssql':
-bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 'key', 
'execution_date'])
+#bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 
'key', 'execution_date'])
+bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 'key'])

Review comment:
   Why this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #16002: fix remote logging when blob already exists

2021-05-31 Thread GitBox


potiuk commented on pull request #16002:
URL: https://github.com/apache/airflow/pull/16002#issuecomment-851556006


   Can you please rebase so that we can run the tests?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #16148: Downloading files from S3 broken in 2.1.0

2021-05-31 Thread GitBox


potiuk commented on issue #16148:
URL: https://github.com/apache/airflow/issues/16148#issuecomment-851547784


   Maybe dump all the environment variables both in the worker-run Python 
operator and when you "exec" into the same container. This is the only way I 
can see those could differ. Also - just to check again - make sure you are 
using the same user/group as the workers.
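
   A rough sketch of that comparison (the service name is an assumption):

   ```bash
   # Environment as seen from an interactive shell in the worker container.
   docker-compose exec airflow-worker bash -c 'env | sort' > exec_env.txt
   # In a throwaway DAG, capture the same thing from a running task, e.g. a
   # BashOperator with: bash_command='env | sort > /tmp/task_env.txt'
   docker-compose exec airflow-worker cat /tmp/task_env.txt > task_env.txt
   diff exec_env.txt task_env.txt
   ```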


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ConstantinoSchillebeeckx edited a comment on issue #16148: Downloading files from S3 broken in 2.1.0

2021-05-31 Thread GitBox


ConstantinoSchillebeeckx edited a comment on issue #16148:
URL: https://github.com/apache/airflow/issues/16148#issuecomment-851543703


   > Are you running the manual part in the same instances/containers as 
Airflow ? 
   
   I am indeed; I spin up all my docker services (locally), and then log in to 
the worker service to execute the small script shown above (and confirm that I 
can correctly download the S3 file).
   
   > I believe the reason is environmental (of where/how your workers are run) 
not Airflow itself.
   
   I'm not sure how this could be; I'm using the same `docker-compose.yaml` for 
both Airflow environments (hence the same environmental configuration) - the 
only change between them is the upgrade from 2.0.2 to 2.1.0 in my 
requirements.txt
   
   Thanks for the link to a possible solution; this one confuses me too as S3 
buckets are specified globally (i.e. have no region).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ConstantinoSchillebeeckx commented on issue #16148: Downloading files from S3 broken in 2.1.0

2021-05-31 Thread GitBox


ConstantinoSchillebeeckx commented on issue #16148:
URL: https://github.com/apache/airflow/issues/16148#issuecomment-851543703


   > Are you running the manual part in the same instances/containers as 
Airflow ? 
   
   I am indeed; I spin up all my docker services (locally), and then log in to 
the worker service to execute the small script shown below (and confirm that I 
can correctly download the S3 file).
   
   > I believe the reason is environmental (of where/how your workers are run) 
not Airflow itself.
   
   I'm not sure how this could be; I'm using the same `docker-compose.yaml` for 
both Airflow environments (hence the same environmental configuration) - the 
only change between them is the upgrade from 2.0.2 to 2.1.0 in my 
requirements.txt
   
   Thanks for the link to a possible solution; this one confuses me too as S3 
buckets are specified globally (i.e. have no region).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ConstantinoSchillebeeckx edited a comment on issue #16148: Downloading files from S3 broken in 2.1.0

2021-05-31 Thread GitBox


ConstantinoSchillebeeckx edited a comment on issue #16148:
URL: https://github.com/apache/airflow/issues/16148#issuecomment-851023108


   > f.seek(0)?
   
   Didn't work either
   
   ---
   
   I was curious to see if I could take Airflow out of the equation, so I 
created a new virtual env and installed:
   ```
   pip install apache-airflow==2.1.0 --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.1.0/constraints-3.7.txt";
   pip install apache-airflow-providers-amazon==1.4.0
   ```
   
   Then I executed the following script:
   ```python
   # -*- coding: utf-8 -*-
   import boto3
   
   def download_file_from_s3():
   
   s3 = boto3.resource('s3')
   
   bucket = 'secret-bucket'
   key = 'tmp.txt'
   
   with open('/tmp/s3_hook.txt', 'w') as f:
   s3.Bucket(bucket).Object(key).download_file(f.name)
   print(f"File downloaded: {f.name}")
   
   
   with open(f.name, 'r') as f_in:
   print(f"FILE CONTENT {f_in.read()}")
   
   
   download_file_from_s3()
   ```
   
   
![image](https://user-images.githubusercontent.com/8518288/120111201-a674cf80-c136-11eb-8d31-448d56aa7be3.png)
   
   So, `boto3` is not the issue here?
   
   Finally, as a sanity check, I updated the DAG to match the script above:
   ```python
   # -*- coding: utf-8 -*-
   import os
   import boto3
   import logging
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   from airflow.utils.dates import days_ago
   from airflow.providers.amazon.aws.hooks.s3 import S3Hook
   
   
   def download_file_from_s3():
   
   # authed with ENVIRONMENT variables
   s3 = boto3.resource('s3')
   
   bucket = 'secret-bucket'
   key = 'tmp.txt'
   
   with open('/tmp/s3_hook.txt', 'w') as f:
   s3.Bucket(bucket).Object(key).download_file(f.name)
   logging.info(f"File downloaded: {f.name}")
   
   with open(f.name, 'r') as f_in:
   logging.info(f"FILE CONTENT {f_in.read()}")
   
   dag = DAG(
   "tmp",
   catchup=False,
   default_args={
   "start_date": days_ago(1),
   },
   schedule_interval=None,
   )
   
   download_file_from_s3ile = PythonOperator(
   task_id="download_file_from_s3ile", 
python_callable=download_file_from_s3, dag=dag
   )
   ```
   
   This resulted in the same, erroneous (empty downloaded file) behavior. 😢 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


potiuk commented on issue #16188:
URL: https://github.com/apache/airflow/issues/16188#issuecomment-851533480


   Ah cool !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] eladkal closed issue #16164: Weekly Run Dag not trigerring

2021-05-31 Thread GitBox


eladkal closed issue #16164:
URL: https://github.com/apache/airflow/issues/16164


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] eladkal commented on issue #16164: Weekly Run Dag not trigerring

2021-05-31 Thread GitBox


eladkal commented on issue #16164:
URL: https://github.com/apache/airflow/issues/16164#issuecomment-851533318


   I think your issue is because you set `days_ago(6)`, so it's not a full week 
ago; when I changed it to `days_ago(10)` it worked fine.
   In any case the `start_date` needs to be a static date, and in both cases 
`schedule_interval='15 5 * * SUN'` works fine on master branch:
   
   ```
   from airflow.models import DAG
   from airflow.operators.bash import BashOperator
   from datetime import timedelta, datetime
   
   args = {
   'owner': 'partha',
   'depends_on_past': False,
   'start_date': datetime(2015, 12, 1),
   'retries': 0,
   'retry_delay': timedelta(minutes=5)
   }
   
   dag = DAG(
   dag_id='scheduler_interval_102',
   schedule_interval='15 5 * * SUN',
   default_args=args,
   catchup=False,
   tags=['Test1']
   )
   
   hello_my_task = BashOperator(
   task_id='hello_task_1',
   bash_command='echo "hello_world %s %s"' % (dag.dag_id, 
dag.schedule_interval),
   dag=dag,
   )
   
   ```
   
   ![Screen Shot 2021-05-31 at 17 32 
15](https://user-images.githubusercontent.com/45845474/120208607-5715c880-c236-11eb-806c-2f943e2e5874.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] eladkal commented on issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


eladkal commented on issue #16188:
URL: https://github.com/apache/airflow/issues/16188#issuecomment-851522019


   It's fixed in https://github.com/apache/airflow/pull/15122
   
   @potiuk we have several PRs already merged for Airflow Upgrade Check - 1.4.0 
including a fix for this one :) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] eladkal closed issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


eladkal closed issue #16188:
URL: https://github.com/apache/airflow/issues/16188


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk edited a comment on issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


potiuk edited a comment on issue #16188:
URL: https://github.com/apache/airflow/issues/16188#issuecomment-851500083


   Yeah. Bad check. The reported version seems to be OK so you can proceed.
   
   I am not sure though if we will release a new upgrade check version for that 
one failure. 
   
   As a quick workaround, you can simply disable the `DatabaseVersionCheckRule` 
as described here:
   
   
https://airflow.apache.org/docs/apache-airflow/stable/upgrade-check.html#turning-off-checks
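
   A sketch of that workaround, based on the linked "Turning off checks" 
section (verify the exact config format there):

   ```bash
   # Skip the broken rule and run the remaining checks.
   cat > upgrade-config.yaml <<'EOF'
   ignored_rules:
     - DatabaseVersionCheckRule
   EOF
   airflow upgrade_check --config ./upgrade-config.yaml
   ```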


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


potiuk commented on issue #16188:
URL: https://github.com/apache/airflow/issues/16188#issuecomment-851500083


   Yeah. Bad check. The reported version seems to be OK so you can proceed.
   
   I am not sure though if we will release a new upgrade check version. 
   
   As a quick workaround, you can simply disable the `DatabaseVersionCheckRule` 
as described here:
   
   
https://airflow.apache.org/docs/apache-airflow/stable/upgrade-check.html#turning-off-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk edited a comment on issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


potiuk edited a comment on issue #16185:
URL: https://github.com/apache/airflow/issues/16185#issuecomment-851439696


   Worth looking at creating it !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Fix loading CI images from new `airflow-ci` location (#16187)

2021-05-31 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 61cddb3  Fix loading CI images from new `airflow-ci` location (#16187)
61cddb3 is described below

commit 61cddb333a62491a104e114d5d92885aa6848500
Author: Jarek Potiuk 
AuthorDate: Mon May 31 15:27:36 2021 +0200

Fix loading CI images from new `airflow-ci` location (#16187)
---
 scripts/ci/docker-compose/integration-kerberos.yml | 2 +-
 scripts/ci/docker-compose/integration-openldap.yml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/ci/docker-compose/integration-kerberos.yml 
b/scripts/ci/docker-compose/integration-kerberos.yml
index 3582fad..b9ca4cc 100644
--- a/scripts/ci/docker-compose/integration-kerberos.yml
+++ b/scripts/ci/docker-compose/integration-kerberos.yml
@@ -18,7 +18,7 @@
 version: "2.2"
 services:
   kdc-server-example-com:
-image: apache/airflow:krb5-kdc-server-2021.04.28
+image: apache/airflow-ci:krb5-kdc-server-2021.04.28
 hostname: krb5-kdc-server-example-com
 domainname: example.com
 networks:
diff --git a/scripts/ci/docker-compose/integration-openldap.yml 
b/scripts/ci/docker-compose/integration-openldap.yml
index af81551..60ff58a 100644
--- a/scripts/ci/docker-compose/integration-openldap.yml
+++ b/scripts/ci/docker-compose/integration-openldap.yml
@@ -18,7 +18,7 @@
 version: "2.2"
 services:
   openldap:
-image: apache/airflow:openldap-2020.07.10-2.4.50
+image: apache/airflow-ci:openldap-2020.07.10-2.4.50
 command: "--copy-service"
 environment:
   - LDAP_DOMAIN=example.com


[GitHub] [airflow] potiuk merged pull request #16187: Fix loading CI images from new `airflow-ci` location

2021-05-31 Thread GitBox


potiuk merged pull request #16187:
URL: https://github.com/apache/airflow/pull/16187


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


boring-cyborg[bot] commented on issue #16188:
URL: https://github.com/apache/airflow/issues/16188#issuecomment-851487533


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk opened a new pull request #16187: Fix loading CI images from new `airflow-ci` location

2021-05-31 Thread GitBox


potiuk opened a new pull request #16187:
URL: https://github.com/apache/airflow/pull/16187


   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] alonrolnik opened a new issue #16188: Running airflow upgrade_check failed on DatabaseVersionCheckRule with postgres images

2021-05-31 Thread GitBox


alonrolnik opened a new issue #16188:
URL: https://github.com/apache/airflow/issues/16188


   **Apache Airflow version**: 1.10.15
   
   **Environment**:
   
   **What happened**:
   
   Running airflow upgrade_check failed on DatabaseVersionCheckRule with 
postgres image, for example, postgres:13.3
   
   ```
   (app-root) sh-4.4$ airflow upgrade_check
   steps_version_dict: {'sample-step-service': 'latest', 'aws-training-step': 
'latest', 'aws-preprocess-step': 'latest', 'aws-forecast-step': 'latest', 
'csv-cleanup-step': 'latest', 'customer-data-export-step': 'latest', 
'customer-data-import-step': 'latest', 'forecasting-statistics-step': 'latest', 
'aws-predictor-delete-step': 'latest', 'aws-multi-predictor-delete-step': 
'latest', 'aws-customer-delete-trigger-step': 'latest', 's3-delete-step': 
'latest', 'customer-db-delete-step': 'latest', 'flow-status-step': 'latest', 
'airflow-sidecar': 'local_dev'}
   Using: 
docker-unstable.anaplan-np.net/planning-ai/planning-ai-airflow-sidecar:local_dev
 for airflow sidecar
   
   
=========================================== STATUS ===========================================
   
   Check for latest versions of apache-airflow and checker..........SUCCESS
   Remove airflow.AirflowMacroPlugin class...........................SUCCESS
   Ensure users are not using custom metaclasses in custom operators.SUCCESS
   Chain between DAG and operator not allowed........................SUCCESS
   Connection.conn_type is not nullable..............................SUCCESS
   Custom Executors now require full path............................SUCCESS
   Traceback (most recent call last):
 File "/opt/app-root/bin/airflow", line 37, in <module>
   args.func(args)
 File 
"/opt/app-root/lib/python3.6/site-packages/airflow/upgrade/checker.py", line 
118, in run
   all_problems = check_upgrade(formatter, rules)
 File 
"/opt/app-root/lib/python3.6/site-packages/airflow/upgrade/checker.py", line 
38, in check_upgrade
   rule_status = RuleStatus.from_rule(rule)
 File 
"/opt/app-root/lib/python3.6/site-packages/airflow/upgrade/problem.py", line 
44, in from_rule
   result = rule.check()
 File "/opt/app-root/lib/python3.6/site-packages/airflow/utils/db.py", line 
74, in wrapper
   return func(*args, **kwargs)
 File 
"/opt/app-root/lib/python3.6/site-packages/airflow/upgrade/rules/postgres_mysql_sqlite_version_upgrade_check.py",
 line 52, in check
   installed_postgres_version = Version(session.execute('SHOW 
server_version;').scalar())
 File "/opt/app-root/lib/python3.6/site-packages/packaging/version.py", 
line 298, in __init__
   raise InvalidVersion("Invalid version: '{0}'".format(version))
   packaging.version.InvalidVersion: Invalid version: '13.3 (Debian 
13.3-1.pgdg100+1)'
   ```
   **What you expected to happen**:
   I expect the check to pass.
   
   **How to reproduce it**:
   Spin up postgres container from postgres repository and run `airflow 
upgrade_check`
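
   The failure is reproducible outside Airflow; a sketch, with one possible 
normalization (the actual fix may differ):

   ```bash
   python3 - <<'EOF'
   from packaging.version import Version, InvalidVersion

   raw = "13.3 (Debian 13.3-1.pgdg100+1)"  # as returned by SHOW server_version
   try:
       Version(raw)
   except InvalidVersion as exc:
       print("fails:", exc)

   # Stripping the Debian suffix before parsing yields a valid version.
   print("parses:", Version(raw.split(" ")[0]))
   EOF
   ```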


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Dr-Denzy commented on issue #15456: KubernetesPodOperator raises 404 Not Found when `is_delete_operator_pod=True` and the Pod fails.

2021-05-31 Thread GitBox


Dr-Denzy commented on issue #15456:
URL: https://github.com/apache/airflow/issues/15456#issuecomment-851456503


   Oh yeah... I will work on it once I finish with the present issue I am 
working on.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow-client-python] tag 2.1.0 created (now a80a087)

2021-05-31 Thread msumit
This is an automated email from the ASF dual-hosted git repository.

msumit pushed a change to tag 2.1.0
in repository https://gitbox.apache.org/repos/asf/airflow-client-python.git.


  at a80a087  (commit)
No new revisions were added by this update.


[GitHub] [airflow-client-python] msumit merged pull request #23: Update changelog for 2.1.0 release

2021-05-31 Thread GitBox


msumit merged pull request #23:
URL: https://github.com/apache/airflow-client-python/pull/23


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow-client-python] branch master updated: Update changelog for 2.1.0 release (#23)

2021-05-31 Thread msumit
This is an automated email from the ASF dual-hosted git repository.

msumit pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow-client-python.git


The following commit(s) were added to refs/heads/master by this push:
 new a80a087  Update changelog for 2.1.0 release (#23)
a80a087 is described below

commit a80a087479147b6c7073161e5973edd06af4ce7f
Author: Sumit Maheshwari 
AuthorDate: Mon May 31 17:31:12 2021 +0530

Update changelog for 2.1.0 release (#23)
---
 CHANGELOG.md | 20 
 1 file changed, 20 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index cb03d0c..aab3aa5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -17,6 +17,26 @@
  under the License.
  -->
 
+# v2.1.0
+
+Apache Airflow API version: 2.1.x
+
+###Major changes:
+
+ - Client code is generated using OpenApi's 5.1.1 generator CLI
+
+###Major fixes:
+
+ - Fixed the iteration issue on array items caused by unsupported class 
'object' (issue #15)
+
+###New API supported:
+
+ - Permissions
+ - Plugins
+ - Providers
+ - Roles
+ - Users
+
 # v2.0.0
 
 Apache Airflow API version: 2.0.x


[GitHub] [airflow-client-python] msumit opened a new pull request #23: Update changelog for 2.1.0 release

2021-05-31 Thread GitBox


msumit opened a new pull request #23:
URL: https://github.com/apache/airflow-client-python/pull/23


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


potiuk commented on issue #16185:
URL: https://github.com/apache/airflow/issues/16185#issuecomment-851439696


   Worth looking at creating !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




svn commit: r48020 - /release/airflow/clients/python/2.1.0/

2021-05-31 Thread msumit
Author: msumit
Date: Mon May 31 11:46:50 2021
New Revision: 48020

Log:
Release Apache Airflow Python Client 2.1.0 from 2.1.0rc1

Added:
release/airflow/clients/python/2.1.0/
release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-bin.tar.gz
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-bin.tar.gz

release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-bin.tar.gz.asc
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-bin.tar.gz.asc

release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-bin.tar.gz.sha512
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-bin.tar.gz.sha512

release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-source.tar.gz
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-source.tar.gz

release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-source.tar.gz.asc
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-source.tar.gz.asc

release/airflow/clients/python/2.1.0/apache-airflow-client-2.1.0-source.tar.gz.sha512
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache-airflow-client-2.1.0rc1-source.tar.gz.sha512

release/airflow/clients/python/2.1.0/apache_airflow_client-2.1.0-py3-none-any.whl
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache_airflow_client-2.1.0rc1-py3-none-any.whl

release/airflow/clients/python/2.1.0/apache_airflow_client-2.1.0-py3-none-any.whl.asc
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache_airflow_client-2.1.0rc1-py3-none-any.whl.asc

release/airflow/clients/python/2.1.0/apache_airflow_client-2.1.0-py3-none-any.whl.sha512
  - copied unchanged from r47927, 
dev/airflow/clients/python/2.1.0rc1/apache_airflow_client-2.1.0rc1-py3-none-any.whl.sha512



[GitHub] [airflow] hendoxc opened a new issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


hendoxc opened a new issue #16185:
URL: https://github.com/apache/airflow/issues/16185


   **Description**
   In short: combined-executor support already exists for Celery + Kubernetes; 
I think adding a Local + Kubernetes combination would be warmly welcomed too.
   
   **Use case / motivation**
   
   Many tasks are light on resources and run happily on the Airflow pod, so it 
makes sense to keep them on the LocalExecutor with zero overhead. However, I 
also have a large number of DAGs with some long-running, resource-hungry 
tasks; for these I currently use the KubernetesPodOperator, which makes the 
DAGs more scalable and the overall workload elastic. Being able to switch 
easily between the KubernetesExecutor and the LocalExecutor would feel 
natural here.
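   
   A hypothetical sketch of how per-task routing could look with such a 
combined executor, modeled on the existing CeleryKubernetesExecutor (which 
routes any task whose `queue` matches the configured kubernetes queue to the 
KubernetesExecutor); the `queue="kubernetes"` convention below is an 
assumption borrowed from that executor, not an existing Local + Kubernetes 
feature:
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   
   with DAG(
       dag_id="mixed_workload",
       start_date=datetime(2021, 1, 1),
       schedule_interval=None,
   ) as dag:
       # Light task: stays on the default (Local) executor, zero overhead.
       light = PythonOperator(
           task_id="light_task",
           python_callable=lambda: print("runs in-process"),
       )
       # Heavy task: hypothetically routed to Kubernetes via its queue,
       # mirroring how CeleryKubernetesExecutor dispatches such tasks.
       heavy = PythonOperator(
           task_id="heavy_task",
           python_callable=lambda: print("runs in its own pod"),
           queue="kubernetes",
       )
       light >> heavy
   ```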
   
   
   **Related Issues**
   
   N/A
   @potiuk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #16185: Combined Executor - KubernetesExecutor + LocalExecutor

2021-05-31 Thread GitBox


boring-cyborg[bot] commented on issue #16185:
URL: https://github.com/apache/airflow/issues/16185#issuecomment-851421366


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] sweco commented on issue #16138: doc_md code block collapsing lines

2021-05-31 Thread GitBox


sweco commented on issue #16138:
URL: https://github.com/apache/airflow/issues/16138#issuecomment-851416191


   Hey @Dr-Denzy, thank you for looking into this, but I believe that what you 
managed to produce are not multi-line code blocks but rather:
   
   - In the first case just regular paragraphs
   - In the second case, regular paragraphs each with inline (one-line) code
   
   If I'm right, these are not code blocks. According to the Markdown Guide, 
you can create a code block in two ways: [classic code 
block](https://www.markdownguide.org/basic-syntax/#code-blocks) and [fenced 
code block](https://www.markdownguide.org/extended-syntax/#fenced-code-blocks).
   
   ### Classic code block
   
   A classic code block is created by indenting a block by 4 spaces (Bitbucket 
requires this to be 8 spaces), but it does not seem to work with either 4 or 8 
spaces.
   
   ```python
   DOC_MC = """\
   # Markdown code block
   
   Inline `code` works well.
   
   Code block
   does not
   respect
   newlines
   
   """
   ```
   
   Normally this would render as it does in the Markdown guide, but Airflow 
just creates a normal paragraph.
   
   ![classic code block rendered as a paragraph](https://user-images.githubusercontent.com/11132999/120184077-d7293780-c210-11eb-9505-e43a9c8a64db.png)
   
   ### Fenced code block
   Created by delimiting the code block by three backticks (that's what I used 
in the first example):
   
   ````python
   DOC_MC = """\
   # Markdown code block
   
   Inline `code` works well.
   
   ```
   Code block
   does not
   respect
   newlines
   ```
   """
   
   
   This creates a `code` element in the HTML DOM but it joins the lines:
   ![fenced code block in the DOM](https://user-images.githubusercontent.com/11132999/120184471-50288f00-c211-11eb-86b2-92bc53dcf076.png)
   ![rendered output with joined lines](https://user-images.githubusercontent.com/11132999/120184765-a39add00-c211-11eb-8c9a-e63b818842d8.png)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Fixes failing static checks after recent pre-commit upgrade (#16183)

2021-05-31 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new f47e10c  Fixes failing static checks after recent pre-commit upgrade 
(#16183)
f47e10c is described below

commit f47e10c3885a028e7c45c10c317a7dbbff9e3ab9
Author: Jarek Potiuk 
AuthorDate: Mon May 31 12:52:32 2021 +0200

Fixes failing static checks after recent pre-commit upgrade (#16183)
---
 airflow/cli/simple_table.py  | 2 +-
 airflow/jobs/scheduler_job.py| 4 ++--
 airflow/providers/apache/sqoop/operators/sqoop.py| 2 +-
 airflow/serialization/serialized_objects.py  | 5 -
 airflow/www/views.py | 4 ++--
 tests/always/test_project_structure.py   | 6 +++---
 tests/providers/google/cloud/operators/test_functions.py | 2 +-
 7 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/airflow/cli/simple_table.py b/airflow/cli/simple_table.py
index 20444fa..d17f948 100644
--- a/airflow/cli/simple_table.py
+++ b/airflow/cli/simple_table.py
@@ -61,7 +61,7 @@ class AirflowConsole(Console):
 table.add_column(col)
 
 for row in data:
-table.add_row(*[str(d) for d in row.values()])
+table.add_row(*(str(d) for d in row.values()))
 self.print(table)
 
 def print_as_plain_table(self, data: List[Dict]):
diff --git a/airflow/jobs/scheduler_job.py b/airflow/jobs/scheduler_job.py
index 6d07fd7..dc3f144 100644
--- a/airflow/jobs/scheduler_job.py
+++ b/airflow/jobs/scheduler_job.py
@@ -1600,13 +1600,13 @@ class SchedulerJob(BaseJob):  # pylint: 
disable=too-many-instance-attributes
 
 if session.bind.dialect.name == 'mssql':
 active_dagruns_filter = or_(
-*[
+*(
 and_(
 DagRun.dag_id == dm.dag_id,
 DagRun.execution_date == dm.next_dagrun,
 )
 for dm in dag_models
-]
+)
 )
 else:
 active_dagruns_filter = tuple_(DagRun.dag_id, 
DagRun.execution_date).in_(
diff --git a/airflow/providers/apache/sqoop/operators/sqoop.py 
b/airflow/providers/apache/sqoop/operators/sqoop.py
index a790e49..33fb66f 100644
--- a/airflow/providers/apache/sqoop/operators/sqoop.py
+++ b/airflow/providers/apache/sqoop/operators/sqoop.py
@@ -246,7 +246,7 @@ class SqoopOperator(BaseOperator):
 if self.hook is None:
 self.hook = self._get_hook()
 self.log.info('Sending SIGTERM signal to bash process group')
-os.killpg(os.getpgid(self.hook.sub_process.pid), signal.SIGTERM)
+os.killpg(os.getpgid(self.hook.sub_process.pid), signal.SIGTERM)  # 
pylint: disable=no-member
 
 def _get_hook(self) -> SqoopHook:
 return SqoopHook(
diff --git a/airflow/serialization/serialized_objects.py 
b/airflow/serialization/serialized_objects.py
index 451305c..9415e49 100644
--- a/airflow/serialization/serialized_objects.py
+++ b/airflow/serialization/serialized_objects.py
@@ -502,7 +502,10 @@ class SerializedBaseOperator(BaseOperator, 
BaseSerialization):
 
 elif k == "deps":
 v = cls._deserialize_deps(v)
-elif k in cls._decorated_fields or k not in 
op.get_serialized_fields():
+elif (
+k in cls._decorated_fields
+or k not in op.get_serialized_fields()  # pylint: 
disable=unsupported-membership-test
+):
 v = cls._deserialize(v)
 # else use v as it is
 
diff --git a/airflow/www/views.py b/airflow/www/views.py
index 5fb4639..03698e0 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -2633,7 +2633,7 @@ class Airflow(AirflowBaseView):  # noqa: D101  pylint: 
disable=too-many-public-m
 tis = sorted(tis, key=lambda ti: ti.start_date)
 ti_fails = list(
 itertools.chain(
-*[
+*(
 (
 session.query(TaskFail)
 .filter(
@@ -2644,7 +2644,7 @@ class Airflow(AirflowBaseView):  # noqa: D101  pylint: 
disable=too-many-public-m
 .all()
 )
 for ti in tis
-]
+)
 )
 )
 
diff --git a/tests/always/test_project_structure.py 
b/tests/always/test_project_structure.py
index 560f379..d4d8645 100644
--- a/tests/always/test_project_structure.py
+++ b/tests/always/test_project_structure.py
@@ -225,7 +225,7 @@ class TestGoogleProviderProjectStructure(unittest.TestCase):
 
 def test_example_dags(self):
 operators_modules = itertools.chain(
-*[self.find_resource_files(resource_type=d) for d in ["operato
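
The recurring `*[...]` -> `*(...)` change in this commit unpacks a generator
expression instead of first materializing a list, which is the pattern the
upgraded pre-commit pylint checks flag; a minimal sketch of the before/after:

```python
# Before: builds a throwaway list, then unpacks it.
print(*[str(n) for n in range(3)])

# After: unpacks the generator directly, skipping the intermediate list.
print(*(str(n) for n in range(3)))
```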

[GitHub] [airflow] potiuk merged pull request #16183: Fixes failing static checks after recent pre-commit upgrade

2021-05-31 Thread GitBox


potiuk merged pull request #16183:
URL: https://github.com/apache/airflow/pull/16183


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Dr-Denzy commented on issue #16138: doc_md code block collapsing lines

2021-05-31 Thread GitBox


Dr-Denzy commented on issue #16138:
URL: https://github.com/apache/airflow/issues/16138#issuecomment-851404403


   @kaxil consider closing this issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Dr-Denzy edited a comment on issue #16138: doc_md code block collapsing lines

2021-05-31 Thread GitBox


Dr-Denzy edited a comment on issue #16138:
URL: https://github.com/apache/airflow/issues/16138#issuecomment-851401276


   @sweco to create a code block, you have to leave a blank line between each 
line. See [PEP 257](https://www.python.org/dev/peps/pep-0257/) and [Markdown 
guidelines](https://www.markdownguide.org/basic-syntax/) for more details.
   
   This is how I did it:
   
   ```python
   from airflow import DAG
   
   DOC_MD = """\
   # Markdown code block
   
   Inline `code` works well.
   
   Code block
   
   does 
   
   respect
   
   newlines
   
   """
   
   dag = DAG(
   dag_id='md-doc',
   doc_md=DOC_MD
   )
   
   ```
   
![md-doc-2](https://user-images.githubusercontent.com/9834450/120181457-63396000-c20d-11eb-9818-25e050804ab2.png)
   
   ```python
   from airflow import DAG
   
   DOC_MD = """\
   # Markdown code block
   
   Inline `code` works well.
   
   `Code block`
   
   `does`
   
   `respect`
   
   `newlines`
   
   """
   
   dag = DAG(
   dag_id='md-doc',
   doc_md=DOC_MD
   )
   
   ```
   
![md-doc](https://user-images.githubusercontent.com/9834450/120181913-fa9eb300-c20d-11eb-868f-e710501cd264.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



