[GitHub] [airflow] uranusjr commented on pull request #18042: Fixing ses email backend

2021-09-14 Thread GitBox


uranusjr commented on pull request #18042:
URL: https://github.com/apache/airflow/pull/18042#issuecomment-919721743


   Also see #18200.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on pull request #18200: Add from_{address,name} configuration parameters

2021-09-14 Thread GitBox


uranusjr commented on pull request #18200:
URL: https://github.com/apache/airflow/pull/18200#issuecomment-919721653


   This conflicts with #18042. We should figure out a cohesive plan on what 
exactly to do here.






[GitHub] [airflow] edwardwang888 commented on pull request #18033: Always draw borders if task instance state is null or undefined

2021-09-14 Thread GitBox


edwardwang888 commented on pull request #18033:
URL: https://github.com/apache/airflow/pull/18033#issuecomment-919709962


   @bbovenzi Added the screenshots above!






[GitHub] [airflow] uranusjr edited a comment on pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


uranusjr edited a comment on pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#issuecomment-919699445


   I think this needs a rebase. Not sure if the `max_retries` failures are 
related; don’t feel like it, but I don’t see anything in main modifying that 
recently either.
   
   ```
   tests/models/test_cleartasks.py::TestClearTasks::test_clear_task_instances: assert 3 == 1
   +  where 3 = .max_tries

   tests/models/test_cleartasks.py::TestClearTasks::test_clear_task_instances_with_task_reschedule: AssertionError: assert 0 == 1
   +  where 0 = .count_task_reschedule at 0x7f49a824fd40>('1')
   +  where '1' = .task_id
   tests/models/test_cleartasks.py::TestClearTasks::test_dag_clear: assert 3 == 1
   +  where 3 = .max_tries
   tests/models/test_cleartasks.py::TestClearTasks::test_operator_clear: assert 1 == 0
   +  where 1 = .max_tries
   ```






[GitHub] [airflow] jedcunningham closed issue #18246: Spurious message complaining of no heartbeat from Triggerer

2021-09-14 Thread GitBox


jedcunningham closed issue #18246:
URL: https://github.com/apache/airflow/issues/18246


   






[GitHub] [airflow] jedcunningham commented on issue #18246: Spurious message complaining of no heartbeat from Triggerer

2021-09-14 Thread GitBox


jedcunningham commented on issue #18246:
URL: https://github.com/apache/airflow/issues/18246#issuecomment-919699571


   This should be fixed by #18129 already (it didn't make it in before b1).






[GitHub] [airflow] uranusjr commented on pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


uranusjr commented on pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#issuecomment-919699445


   I think this needs a rebase. Not sure if the `max_retries` failures are 
related; don’t feel like it, but I don’t see anything in main modifying that 
recently either.






[GitHub] [airflow] uranusjr commented on pull request #18228: Fix provider test accessing importlib-resources

2021-09-14 Thread GitBox


uranusjr commented on pull request #18228:
URL: https://github.com/apache/airflow/pull/18228#issuecomment-919698087


   WTH this seems to break `mock_kinesis` somehow?






[GitHub] [airflow] jedcunningham commented on a change in pull request #18257: Ability to access http k8s via multiple hostnames

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18257:
URL: https://github.com/apache/airflow/pull/18257#discussion_r708834400



##
File path: chart/values.schema.json
##
@@ -126,10 +126,15 @@
 "default": ""
 },
 "host": {
-"description": "The hostname for the web Ingress.",
+"description": "The hostname for the web Ingress. 
(Deprecated - renamed to `ingress.web.hosts`)",
 "type": "string",
 "default": ""
 },
+"hosts": {
+"description": "The hostnames for the web 
Ingress.",
+"type": "array",

Review comment:
   ```suggestion
   "type": "array",
   "items": {
   "type": "string"
   },
   ```

##
File path: chart/values.schema.json
##
@@ -126,10 +126,15 @@
 "default": ""
 },
 "host": {
-"description": "The hostname for the web Ingress.",
+"description": "The hostname for the web Ingress. 
(Deprecated - renamed to `ingress.web.hosts`)",

Review comment:
   Also in `UPDATING.rst`?

##
File path: chart/values.schema.json
##
@@ -126,10 +126,15 @@
 "default": ""
 },
 "host": {
-"description": "The hostname for the web Ingress.",
+"description": "The hostname for the web Ingress. 
(Deprecated - renamed to `ingress.web.hosts`)",
 "type": "string",
 "default": ""
 },
+"hosts": {
+"description": "The hostnames for the web 
Ingress.",
+"type": "array",
+"default": []

Review comment:
   There should be a corresponding param added to `values.yaml` too.

##
File path: chart/templates/webserver/webserver-ingress.yaml
##
@@ -39,31 +39,38 @@ spec:
   {{- if .Values.ingress.web.tls.enabled }}
   tls:
 - hosts:
-- {{ .Values.ingress.web.host }}
+{{- if .Values.ingress.web.tls.enabled }}
+{{- range .Values.ingress.web.hosts | default (list 
.Values.ingress.web.host) }}
+- {{ . | quote }}
+{{- end }}

Review comment:
   ```suggestion
   {{- .Values.ingress.web.hosts | default (list 
.Values.ingress.web.host) | toYaml | nindent 8 }}
   ```
   
   I think using `toYaml` is cleaner, thoughts?

##
File path: chart/templates/webserver/webserver-ingress.yaml
##
@@ -39,31 +39,38 @@ spec:
   {{- if .Values.ingress.web.tls.enabled }}
   tls:
 - hosts:
-- {{ .Values.ingress.web.host }}
+{{- if .Values.ingress.web.tls.enabled }}
+{{- range .Values.ingress.web.hosts | default (list 
.Values.ingress.web.host) }}
+- {{ . | quote }}
+{{- end }}
+{{- end }}
   secretName: {{ .Values.ingress.web.tls.secretName }}
   {{- end }}
   rules:
+{{- range .Values.ingress.web.hosts | default (list 
.Values.ingress.web.host) }}
 - http:
 paths:
-  {{- range .Values.ingress.web.precedingPaths }}
+  {{- range $.Values.ingress.web.precedingPaths }}
   - path: {{ .path }}
 backend:
   serviceName: {{ .serviceName }}
   servicePort: {{ .servicePort }}
   {{- end }}
   - backend:
-  serviceName: {{ .Release.Name }}-webserver
+  serviceName: {{ $.Release.Name }}-webserver
   servicePort: airflow-ui
-{{- if .Values.ingress.web.path }}
-path: {{ .Values.ingress.web.path }}
+{{- if $.Values.ingress.web.path }}
+path: {{ $.Values.ingress.web.path }}
 {{- end }}
-  {{- range .Values.ingress.web.succeedingPaths }}
+  {{- range $.Values.ingress.web.succeedingPaths }}
   - path: {{ .path }}
 backend:
   serviceName: {{ .serviceName }}
   servicePort: {{ .servicePort }}
   {{- end }}
-{{- if .Values.ingress.web.host }}
-  host: {{ .Values.ingress.web.host }}
+
+{{- if . }}
+  host: {{ . | quote }}
 {{- end }}

Review comment:
   ```suggestion
 host: {{ . | quote }}
   ```
   
   Not needed, right?





[GitHub] [airflow] uranusjr commented on a change in pull request #18258: Improve coverage for airflow.security.kerberos module

2021-09-14 Thread GitBox


uranusjr commented on a change in pull request #18258:
URL: https://github.com/apache/airflow/pull/18258#discussion_r708834881



##
File path: tests/security/test_kerberos.py
##
@@ -104,3 +105,255 @@ def test_args_from_cli(self):
 )
 
 assert ctx.value.code == 1
+
+
+class TestKerberosUnit(unittest.TestCase):

Review comment:
   Should we use a pytest setup instead? `pytest.mark.parametrize`, 
`caplog`, and `pytest.raises` are easier to use than `parameterized`, 
`assertLogs`, and `assertRaises` IMO. But this is minor and I’m OK if you feel 
the current approach is more familiar etc.
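   A minimal sketch of the suggested pytest style (the helper and its
parameters are illustrative stand-ins, not the actual Airflow test code; 
`caplog` is a fixture and only runs under the pytest runner):

   ```python
   import pytest


   def kinit_flags(forwardable: bool) -> list:
       # Toy stand-in for the flag logic under test, just to make this runnable.
       return ["-f", "-a"] if forwardable else ["-F", "-A"]


   # pytest.mark.parametrize replaces parameterized.expand; each case shows
   # up as its own test with a readable id.
   @pytest.mark.parametrize(
       "forwardable, expected",
       [
           (True, ["-f", "-a"]),
           (False, ["-F", "-A"]),
       ],
   )
   def test_kinit_flags(forwardable, expected):
       assert kinit_flags(forwardable) == expected


   # pytest.raises replaces assertRaises and also works as a plain
   # context manager outside a test function:
   with pytest.raises(SystemExit) as ctx:
       raise SystemExit(1)
   assert ctx.value.code == 1
   ```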

##
File path: tests/security/test_kerberos.py
##
@@ -104,3 +105,255 @@ def test_args_from_cli(self):
 )
 
 assert ctx.value.code == 1
+
+
+class TestKerberosUnit(unittest.TestCase):
+@parameterized.expand(
+[
+(
+{('kerberos', 'reinit_frequency'): '42'},
+[
+'kinit',
+'-f',
+'-a',
+'-r',
+'42m',
+'-k',
+'-t',
+'keytab',
+'-c',
+'/tmp/airflow_krb5_ccache',
+'test-principal',
+],
+),
+(
+{('kerberos', 'forwardable'): 'True', ('kerberos', 
'include_ip'): 'True'},
+[
+'kinit',
+'-f',
+'-a',
+'-r',
+'3600m',
+'-k',
+'-t',
+'keytab',
+'-c',
+'/tmp/airflow_krb5_ccache',
+'test-principal',
+],
+),
+(
+{('kerberos', 'forwardable'): 'False', ('kerberos', 
'include_ip'): 'False'},
+[
+'kinit',
+'-F',
+'-A',
+'-r',
+'3600m',
+'-k',
+'-t',
+'keytab',
+'-c',
+'/tmp/airflow_krb5_ccache',
+'test-principal',
+],
+),
+]
+)
+def test_renew_from_kt(self, kerberos_config, expected_cmd):
+with self.assertLogs(kerberos.log) as log_ctx, 
conf_vars(kerberos_config), mock.patch(
+'airflow.security.kerberos.subprocess'
+) as mock_subprocess, mock.patch(
+'airflow.security.kerberos.NEED_KRB181_WORKAROUND', None
+), mock.patch(
+'airflow.security.kerberos.open', 
mock.mock_open(read_data=b'X-CACHECONF:')
+), mock.patch(
+'time.sleep', return_value=None
+):
+
mock_subprocess.Popen.return_value.__enter__.return_value.returncode = 0
+mock_subprocess.call.return_value = 0
+renew_from_kt(principal="test-principal", keytab="keytab")
+
+assert mock_subprocess.Popen.call_args[0][0] == expected_cmd
+
+expected_cmd_text = " ".join(shlex.quote(f) for f in expected_cmd)
+assert log_ctx.output == [
+f'INFO:airflow.security.kerberos:Re-initialising kerberos from 
keytab: {expected_cmd_text}',
+'INFO:airflow.security.kerberos:Renewing kerberos ticket to work 
around kerberos 1.8.1: '
+'kinit -c /tmp/airflow_krb5_ccache -R',
+]
+
+mock_subprocess.assert_has_calls(
+[
+mock.call.Popen(
+expected_cmd,
+bufsize=-1,
+close_fds=True,
+stderr=mock_subprocess.PIPE,
+stdout=mock_subprocess.PIPE,
+universal_newlines=True,
+),
+mock.call.Popen().__enter__(),
+mock.call.Popen().__enter__().wait(),
+mock.call.Popen().__exit__(None, None, None),
+mock.call.call(['kinit', '-c', '/tmp/airflow_krb5_ccache', 
'-R'], close_fds=True),
+]
+)

Review comment:
   `assert_has_calls` [is 
non-exhaustive](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.Mock.assert_has_calls).
 Are we expecting `subprocess.Popen` to be called by anything else here? If 
not, we should use `assert mock_subprocess.mock_calls == [...]` instead.
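   A quick standalone illustration of the difference, using plain
`unittest.mock` (the call names here are made up, not the Airflow test):

   ```python
   from unittest import mock

   m = mock.Mock()
   m.popen("kinit")
   m.call("extra")  # an unexpected extra call slips in

   # assert_has_calls only checks the listed calls appear somewhere in
   # order, so it still passes despite the stray extra call:
   m.assert_has_calls([mock.call.popen("kinit")])

   # Comparing mock_calls directly is exhaustive and catches it:
   assert m.mock_calls != [mock.call.popen("kinit")]
   assert m.mock_calls == [mock.call.popen("kinit"), mock.call.call("extra")]
   ```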








[GitHub] [airflow] jedcunningham commented on a change in pull request #18249: Add support for securityContext per deployment

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18249:
URL: https://github.com/apache/airflow/pull/18249#discussion_r708827609



##
File path: chart/values.schema.json
##
@@ -70,6 +70,332 @@
 "default": "2.1.3",
 "x-docsSection": "Common"
 },
+"podSecurity": {
+"description": "Set security contexts for certain containers",
+"type": "object",
+"x-docsSection": "Kubernetes",
+"additionalProperties": false,
+"properties": {
+"default": {
+"description": "Default global security context.",
+"type": "object",
+"additionalProperties": false,
+"properties": {
+"securityContext": {
+"description": "Global Pod security context as 
defined in 
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#podsecuritycontext-v1-core;,
+"type": "object",
+"default": "See values.yaml",

Review comment:
   ```suggestion
   "default": {"runAsUser": 5, "fsGroup": 0, 
"runAsGroup": 0},
   ```
   
   I think we should bring the actual defaults into the docs instead.

##
File path: chart/files/pod-template-file.kubernetes-helm-yaml
##
@@ -45,6 +45,7 @@ spec:
   {{- end }}
   containers:
 - args: []
+  securityContext: {{- omit 
.Values.podSecurity.pod_template.containerSecurityContext "enabled" | default 
(.Values.podSecurity.default.containerSecurityContext) | toYaml | nindent 8 }}

Review comment:
   Instead of this `omit` pattern, would this work instead? I think it will 
and is more intuitive imo.
   
   values.yaml:
   ```
   podSecurity.pod_template.containerSecurityContext: {}
   ```
   
   and:
   
   ```suggestion
 securityContext: {{- 
.Values.podSecurity.pod_template.containerSecurityContext | default 
(.Values.podSecurity.default.containerSecurityContext) | toYaml | nindent 8 }}
   ```
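   As a rough analogy for why dropping the `omit` pattern can work
(assuming Helm/Sprig's `default` treats an empty map as unset, much like
Python's `or`; the keys below are illustrative, not the chart's actual
defaults):

   ```python
   # If values.yaml ships containerSecurityContext as {}, a plain `default`
   # pipeline falls through to the global setting, mirroring Python's `or`.
   global_ctx = {"runAsUser": 50000}

   per_template = {}                 # nothing set for pod_template
   assert (per_template or global_ctx) == {"runAsUser": 50000}

   per_template = {"runAsUser": 0}   # an explicit override wins
   assert (per_template or global_ctx) == {"runAsUser": 0}
   ```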








[GitHub] [airflow] jedcunningham commented on a change in pull request #18147: Allow airflow standard images to run in openshift utilising the official helm chart #18136

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18147:
URL: https://github.com/apache/airflow/pull/18147#discussion_r708819222



##
File path: docs/helm-chart/production-guide.rst
##
@@ -200,3 +200,24 @@ By default, the chart will deploy Redis. However, you can 
use any supported Cele
 
 For more information about setting up a Celery broker, refer to the
 exhaustive `Celery documentation on the topic 
`_.
+
+Security Context Constraints
+----------------------------
+
+A ``Security Context Constraint`` (SCC) is a OpenShift construct that works as 
a RBAC rule however it targets PODs instead of users.
+When defining a SCC, one can control actions and resources a POD can perform 
or access during startup and runtime.

Review comment:
   ```suggestion
   A ``Security Context Constraint`` (SCC) is a OpenShift construct that works 
as a RBAC rule however it targets Pods instead of users.
   When defining a SCC, one can control actions and resources a Pod can perform 
or access during startup and runtime.
   ```

##
File path: docs/helm-chart/production-guide.rst
##
@@ -200,3 +200,24 @@ By default, the chart will deploy Redis. However, you can 
use any supported Cele
 
 For more information about setting up a Celery broker, refer to the
 exhaustive `Celery documentation on the topic 
`_.
+
+Security Context Constraints
+----------------------------
+
+A ``Security Context Constraint`` (SCC) is a OpenShift construct that works as 
a RBAC rule however it targets PODs instead of users.
+When defining a SCC, one can control actions and resources a POD can perform 
or access during startup and runtime.
+
+The SCCs are split into different levels or categories with the ``restricted`` 
SCC being the default one assigned to PODs.
+When deploying airflow to OpenShift, one can leverage the SCCs and allow the 
PODs to start containers utilizing the ``anyuid`` SCC.

Review comment:
   ```suggestion
   The SCCs are split into different levels or categories with the 
``restricted`` SCC being the default one assigned to Pods.
   When deploying Airflow to OpenShift, one can leverage the SCCs and allow the 
Pods to start containers utilizing the ``anyuid`` SCC.
   ```

##
File path: docs/helm-chart/production-guide.rst
##
@@ -200,3 +200,24 @@ By default, the chart will deploy Redis. However, you can 
use any supported Cele
 
 For more information about setting up a Celery broker, refer to the
 exhaustive `Celery documentation on the topic 
`_.
+
+Security Context Constraints
+----------------------------
+
+A ``Security Context Constraint`` (SCC) is a OpenShift construct that works as 
a RBAC rule however it targets PODs instead of users.
+When defining a SCC, one can control actions and resources a POD can perform 
or access during startup and runtime.
+
+The SCCs are split into different levels or categories with the ``restricted`` 
SCC being the default one assigned to PODs.
+When deploying airflow to OpenShift, one can leverage the SCCs and allow the 
PODs to start containers utilizing the ``anyuid`` SCC.
+
+In order to enable the usage of SCCs, one must set the parameter 
:ref:`rbac.createSCCRoleBinding ` to ``true`` as shown 
below:
+
+.. code-block:: yaml
+
+  rbac:
+create: true
+createSCCRoleBinding: true
+
+In this chart, SCCs are bound to the PODs via RoleBindings meaning that the 
option ``rbac.create`` must also be set to ``true`` in order to fully enable 
the SCC usage.

Review comment:
   ```suggestion
   In this chart, SCCs are bound to the Pods via RoleBindings meaning that the 
option ``rbac.create`` must also be set to ``true`` in order to fully enable 
the SCC usage.
   ```

##
File path: chart/templates/rbac/security-context-constraint-rolebinding.yaml
##
@@ -0,0 +1,93 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+## Airflow SCC Role Binding
+#
+{{- if and .Values.rbac.create .Values.rbac.createSCCRoleBinding }}
+{{- 

[GitHub] [airflow] uranusjr closed pull request #18228: Fix provider test accessing importlib-resources

2021-09-14 Thread GitBox


uranusjr closed pull request #18228:
URL: https://github.com/apache/airflow/pull/18228


   






[GitHub] [airflow] kaxil edited a comment on pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


kaxil edited a comment on pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#issuecomment-919587937


   cc @jwitz






[GitHub] [airflow] uranusjr commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


uranusjr commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708791466



##
File path: docs/apache-airflow/best-practices.rst
##
@@ -43,21 +43,26 @@ Please follow our guide on :ref:`custom Operators 
`.
 Creating a task
 ---
 
-You should treat tasks in Airflow equivalent to transactions in a database. 
This implies that you should never produce
-incomplete results from your tasks. An example is not to produce incomplete 
data in ``HDFS`` or ``S3`` at the end of a task.
-
-Airflow can retry a task if it fails. Thus, the tasks should produce the same 
outcome on every re-run.
-Some of the ways you can avoid producing a different result -
-
-* Do not use INSERT during a task re-run, an INSERT statement might lead to 
duplicate rows in your database.
-  Replace it with UPSERT.
-* Read and write in a specific partition. Never read the latest available data 
in a task.
-  Someone may update the input data between re-runs, which results in 
different outputs.
-  A better way is to read the input data from a specific partition. You can 
use ``execution_date`` as a partition.
-  You should follow this partitioning method while writing data in S3/HDFS, as 
well.
-* The Python datetime ``now()`` function gives the current datetime object.
-  This function should never be used inside a task, especially to do the 
critical computation, as it leads to different outcomes on each run.
-  It's fine to use it, for example, to generate a temporary log.
+You should treat tasks in Airflow equivalent to transactions in a database. 
This
+implies that you should never produce incomplete results from your tasks. An
+example is not to produce incomplete data in ``HDFS`` or ``S3`` at the end of a
+task.
+
+Airflow can retry a task if it fails. Thus, the tasks should produce the same
+outcome on every re-run. Some of the ways you can avoid producing a different
+result -
+
+* Do not use INSERT during a task re-run, an INSERT statement might lead to
+  duplicate rows in your database. Replace it with UPSERT.
+* Read and write in a specific partition. Never read the latest available data
+  in a task. Someone may update the input data between re-runs, which results 
in
+  different outputs. A better way is to read the input data from a specific
+  partition. You can use ``data_interval_start`` as a partition. You should
+  follow this partitioning method while writing data in S3/HDFS, as well.

Review comment:
   Side note, there are a lot of *blah blah blah, as well* usages in the 
documentation, so it seems like whoever wrote it previously has a particular 
style. (I’ve removed the comma here.)
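   The partitioning advice quoted above can be sketched as follows (the
path layout and helper name are hypothetical, not Airflow API):

   ```python
   from datetime import datetime


   def partition_key(data_interval_start: datetime) -> str:
       # Deriving the partition purely from the interval start makes
       # re-runs read and write the same slice, keeping the task idempotent.
       return data_interval_start.strftime("%Y-%m-%d")


   key = partition_key(datetime(2016, 2, 19))
   path = f"s3://my-bucket/events/dt={key}/part-0.parquet"  # hypothetical layout

   assert key == "2016-02-19"
   assert path == "s3://my-bucket/events/dt=2016-02-19/part-0.parquet"
   ```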








[airflow] branch aip-39-docs updated (201b375 -> 28bc998)

2021-09-14 Thread uranusjr
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a change to branch aip-39-docs
in repository https://gitbox.apache.org/repos/asf/airflow.git.


 discard 201b375  Typos and wording/styling tweaks
 add 28bc998  Typos and wording/styling tweaks

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (201b375)
\
 N -- N -- N   refs/heads/aip-39-docs (28bc998)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 docs/apache-airflow/best-practices.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[airflow] branch aip-39-docs updated (be38888 -> 201b375)

2021-09-14 Thread uranusjr
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a change to branch aip-39-docs
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from be38888  Inline the example custom timetable DAG
 add 201b375  Typos and wording/styling tweaks

No new revisions were added by this update.

Summary of changes:
 docs/apache-airflow/dag-run.rst   |  2 +-
 docs/apache-airflow/faq.rst   | 13 +++--
 docs/apache-airflow/howto/timetable.rst   |  6 +++---
 docs/apache-airflow/logging-monitoring/errors.rst |  4 +++-
 4 files changed, 14 insertions(+), 11 deletions(-)


[GitHub] [airflow] uranusjr commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


uranusjr commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708789842



##
File path: docs/apache-airflow/faq.rst
##
@@ -216,20 +216,35 @@ actually start. If this were not the case, the backfill 
just would not start.
 What does ``execution_date`` mean?
 --
 
-Airflow was developed as a solution for ETL needs. In the ETL world, you 
typically summarize data. So, if you want to
-summarize data for 2016-02-19, You would do it at 2016-02-20 midnight UTC, 
which would be right after all data for
-2016-02-19 becomes available.
-
-This datetime value is available to you as :ref:`Template 
variables` with various formats in Jinja templated
-fields. They are also included in the context dictionary given to an 
Operator's execute function.
+*Execution date* or ``execution_date`` is a historical name for what is called 
a
+*logical date*, and also usually the start of the data interval represented by 
a
+DAG run.
+
+Airflow was developed as a solution for ETL needs. In the ETL world, you
+typically summarize data. So, if you want to summarize data for 2016-02-19, You
+would do it at 2016-02-20 midnight UTC, which would be right after all data for
+2016-02-19 becomes available. This interval between midnights of 2016-02-19 and
+2016-02-20 is called the *data interval*, and since it represents data in
+the date of 2016-02-19, this date is thus called the run's *logical date*, or
+the date that this DAG run is executed for, thus *execution date*.

Review comment:
   Not sure about making UTC inline code; added ticks around date.
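   For reference, the interval arithmetic described in the quoted FAQ text
is simply (a minimal sketch, dates taken from the example above):

   ```python
   from datetime import datetime, timedelta

   # For a daily schedule, the run summarizing 2016-02-19 carries that date
   # as its logical date (the historical execution_date) and can only fire
   # once the whole interval has elapsed.
   logical_date = datetime(2016, 2, 19)               # data_interval_start
   data_interval_end = logical_date + timedelta(days=1)

   assert data_interval_end == datetime(2016, 2, 20)  # earliest the run fires
   ```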








[GitHub] [airflow] uranusjr commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


uranusjr commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708788427



##
File path: docs/apache-airflow/howto/timetable.rst
##
@@ -0,0 +1,298 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+Customizing DAG Scheduling with Timetables
+==
+
+A DAG's scheduling strategy is determined by its internal "timetable". This
+timetable can be created by specifying the DAG's ``schedule_interval`` 
argument,
+as described in :doc:`DAG Run `. The timetable also dictates the data
+interval and the logical time of each run created for the DAG.
+
+However, there are situations when a cron expression or simple ``timedelta``
+periods cannot properly express the schedule. Some of the examples are:
+
+* Data intervals with "holes" between. (Instead of continuous, as both the cron
+  expression and ``timedelta`` schedules represent.)
+* Run tasks at different times each day. For example, an astronomer may find it
+  useful to run a task at dawn to process data collected from the previous
+  night-time period.
+* Schedules not following the Gregorian calendar. For example, create a run for
+  each month in the `Traditional Chinese Calendar`_. This is conceptually
+  similar to the sunset case above, but for a different time scale.
+* Rolling windows, or overlapping data intervals. For example, one may want to
+  have a run each day, but make each run cover the period of the previous seven
+  days. It is possible to "hack" this with a cron expression, but a custom data
+  interval would be a more natural representation.
+
+.. _`Traditional Chinese Calendar`: 
https://en.wikipedia.org/wiki/Chinese_calendar
+
+
+For our example, let's say a company wants to run a job after each weekday to
+process data collected during the work day. The first intuitive answer to this
+would be ``schedule_interval="0 0 * * 1-5"`` (midnight on Monday to Friday), 
but
+this means data collected on Friday will *not* be processed right after Friday
+ends, but on the next Monday, and that run's interval would be from midnight
+Friday to midnight *Monday*.
+
+This is, therefore, an example in the "holes" category above; the intended
+schedule should not include the two weekend days. What we want is:
+
+* Schedule a run for each Monday, Tuesday, Wednesday, Thursday, and Friday. The
+  run's data interval would cover from midnight of each day, to midnight of the
+  next day (e.g. 2021-01-01 00:00:00 to 2021-01-02 00:00:00).
+* Each run would be created right after the data interval ends. The run 
covering
+  Monday happens on midnight Tuesday and so on. The run covering Friday happens
+  on midnight Saturday. No runs happen on midnights Sunday and Monday.
+
+For simplicity, we will only deal with UTC datetimes in this example.
+
+
+Timetable Registration
+--
+
+A timetable must be a subclass of :class:`~airflow.timetables.base.Timetable`,
+and be registered as a part of a :doc:`plugin </plugins>`. The following is a
+skeleton for us to implement a new timetable:
+
+.. code-block:: python
+
+from airflow.plugins_manager import AirflowPlugin
+from airflow.timetables.base import Timetable
+
+
+class AfterWorkdayTimetable(Timetable):
+pass
+
+
+class WorkdayTimetablePlugin(AirflowPlugin):
+name = "workday_timetable_plugin"
+timetables = [AfterWorkdayTimetable]
+
+Next, we'll start putting code into ``AfterWorkdayTimetable``. After the
+implementation is finished, we should be able to use the timetable in our DAG
+file:
+
+.. code-block:: python
+
+from airflow import DAG
+
+
+with DAG(timetable=AfterWorkdayTimetable(), tags=["example", "timetable"]) as dag:
+...
+
+
+Define Scheduling Logic
+-----------------------
+
+When Airflow's scheduler encounters a DAG, it calls one of the two methods to
+know when to schedule the DAG's next run.
+
+* ``next_dagrun_info``: The scheduler uses this to learn the timetable's regular
+  schedule, i.e. the "one for every workday, run at the end of it" part in our
+  example.
+* ``infer_data_interval``: When a DAG run is manually triggered (from the web
+  UI, for example).

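The weekday arithmetic this timetable needs — midnight-to-midnight UTC data intervals that skip Saturday and Sunday — can be sketched in plain Python. This is a minimal illustration with hypothetical helper names (`workday_interval`, `next_workday_start`); it is not the Airflow `Timetable` API itself, just the date logic a real implementation would wrap:

```python
from datetime import datetime, timedelta


def workday_interval(day):
    """Data interval covering one workday: midnight to midnight (UTC assumed)."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)


def next_workday_start(after):
    """Midnight of the next weekday strictly after ``after``, skipping weekends."""
    candidate = after.replace(hour=0, minute=0, second=0, microsecond=0) + timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


# 2021-01-01 is a Friday: its run covers Fri 00:00 to Sat 00:00,
# and the next run's interval starts on Monday, skipping the weekend.
start, end = workday_interval(datetime(2021, 1, 1))
print(start, end)                # 2021-01-01 00:00:00 2021-01-02 00:00:00
print(next_workday_start(start)) # 2021-01-04 00:00:00
```

Roughly, `next_workday_start` is the kind of calculation ``next_dagrun_info`` would perform for the regular schedule, and `workday_interval` the kind ``infer_data_interval`` would perform for a manually triggered run.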
[airflow] branch constraints-main updated: Updating constraints. Build id:1235775074

2021-09-14 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new 1e48096  Updating constraints. Build id:1235775074
1e48096 is described below

commit 1e480966dc6aa9edb8feb6ff8cfe94c334a1c1cf
Author: Automated GitHub Actions commit 
AuthorDate: Wed Sep 15 01:56:55 2021 +

Updating constraints. Build id:1235775074

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of 'refs/heads/main' in 'apache/airflow'
with commit sha 292751c2581d65c279886622e07c03819d2c9bda.

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for 
details.
---
 constraints-3.6.txt  | 18 +-
 constraints-3.7.txt  | 18 +-
 constraints-3.8.txt  | 18 +-
 constraints-3.9.txt  | 18 +-
 constraints-no-providers-3.6.txt | 14 --
 constraints-no-providers-3.7.txt | 14 --
 constraints-no-providers-3.8.txt | 14 --
 constraints-no-providers-3.9.txt | 14 --
 constraints-source-providers-3.6.txt |  2 +-
 constraints-source-providers-3.7.txt |  2 +-
 constraints-source-providers-3.8.txt |  2 +-
 constraints-source-providers-3.9.txt |  2 +-
 12 files changed, 72 insertions(+), 64 deletions(-)

diff --git a/constraints-3.6.txt b/constraints-3.6.txt
index f74f6fe..585d81d 100644
--- a/constraints-3.6.txt
+++ b/constraints-3.6.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2021-09-13T01:23:56Z
+# This constraints file was automatically generated on 2021-09-15T01:54:32Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
@@ -13,7 +13,7 @@ APScheduler==3.6.3
 Authlib==0.15.4
 Babel==2.9.1
 Deprecated==1.2.13
-Flask-AppBuilder==3.3.2
+Flask-AppBuilder==3.3.3
 Flask-Babel==1.0.0
 Flask-Bcrypt==0.7.1
 Flask-Caching==1.10.1
@@ -184,7 +184,7 @@ cffi==1.14.6
 cfgv==3.3.1
 cgroupspy==0.2.1
 chardet==4.0.0
-charset-normalizer==2.0.4
+charset-normalizer==2.0.5
 click-didyoumean==0.0.3
 click-plugins==1.1.1
 click-repl==0.2.0
@@ -210,7 +210,7 @@ defusedxml==0.7.1
 dill==0.3.1.1
 distlib==0.3.2
 distributed==2.19.0
-dnspython==1.16.0
+dnspython==2.1.0
 docker==5.0.2
 docopt==0.6.2
 docutils==0.16
@@ -248,7 +248,7 @@ google-cloud-appengine-logging==0.1.4
 google-cloud-audit-log==0.1.1
 google-cloud-automl==2.4.2
 google-cloud-bigquery-datatransfer==3.3.2
-google-cloud-bigquery-storage==2.7.0
+google-cloud-bigquery-storage==2.8.0
 google-cloud-bigquery==2.26.0
 google-cloud-bigtable==1.7.0
 google-cloud-container==1.0.1
@@ -286,7 +286,7 @@ gunicorn==20.1.0
 h11==0.12.0
 hdfs==2.6.0
 hmsclient==0.1.1
-httpcore==0.13.6
+httpcore==0.13.7
 httplib2==0.19.1
 httpx==0.19.0
 humanize==3.11.0
@@ -467,13 +467,13 @@ sentry-sdk==1.3.1
 setproctitle==1.2.2
 simple-salesforce==1.11.4
 six==1.16.0
-slack-sdk==3.10.1
+slack-sdk==3.11.0
 smbprotocol==1.6.2
 smmap==4.0.0
 snakebite-py3==3.0.5
 sniffio==1.2.0
 snowballstemmer==2.1.0
-snowflake-connector-python==2.6.0
+snowflake-connector-python==2.6.1
 snowflake-sqlalchemy==1.2.5
 sortedcontainers==2.4.0
 soupsieve==2.2.1
@@ -494,7 +494,7 @@ sphinxcontrib-qthelp==1.0.3
 sphinxcontrib-redoc==1.6.0
 sphinxcontrib-serializinghtml==1.1.5
 sphinxcontrib-spelling==7.2.1
-spython==0.1.15
+spython==0.1.16
 sqlalchemy-drill==1.1.1
 sqlparse==0.4.2
 sshtunnel==0.1.5
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 9b4c221..a974a8b 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2021-09-13T01:23:52Z
+# This constraints file was automatically generated on 2021-09-15T01:54:41Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
@@ -13,7 +13,7 @@ APScheduler==3.6.3
 Authlib==0.15.4
 Babel==2.9.1
 Deprecated==1.2.13
-Flask-AppBuilder==3.3.2
+Flask-AppBuilder==3.3.3
 Flask-Babel==1.0.0
 Flask-Bcrypt==0.7.1
 Flask-Caching==1.10.1
@@ -183,7 +183,7 @@ cffi==1.14.6
 cfgv==3.3.1
 cgroupspy==0.2.1
 chardet==4.0.0
-charset-normalizer==2.0.4
+charset-normalizer==2.0.5
 click-didyoumean==0.0.3
 click-plugins==1.1.1
 click-repl==0.2.0
@@ -207,7 +207,7 @@ defusedxml==0.7.1
 dill==0.3.1.1
 distlib==0.3.2

[GitHub] [airflow] satyarthn edited a comment on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn edited a comment on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919635087


   Another take on trying to figure out what is happening.
   
   My ECS console shows task_id as this
   
   https://user-images.githubusercontent.com/85149961/133357721-2580ad0a-9510-4973-bc89-bcc856d3da8e.png
   
   
   
   But the ECSOperator logs task id as something else
   
   `[2021-09-15 01:46:33,311] {{ecs.py:257}} INFO - ECS Task started: {'tasks': 
[{'attachments': [{'id': 'e216354d-f05c-465c-967a-50efd86d7e0e', 'type': 
'ElasticNetworkInterface', 'status': 'PRECREATED', 'details': `
   
   
   Again, not sure why this is happening. Is it an ECSOperator bug, or is it me 
doing something wrong in the way I have defined the ECSOperator or ECS Task 
Definition?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] satyarthn commented on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn commented on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919635087


   Another take on trying to figure out what is happening.
   
   My ECS console shows task_id as this
   
   https://user-images.githubusercontent.com/85149961/133357508-80b15b22-57fb-4868-8dc2-c7a9ae6ddce4.png
   
   
   But the ECSOperator logs task id as something else
   
   `[2021-09-15 01:46:33,311] {{ecs.py:257}} INFO - ECS Task started: {'tasks': 
[{'attachments': [{'id': 'e216354d-f05c-465c-967a-50efd86d7e0e', 'type': 
'ElasticNetworkInterface', 'status': 'PRECREATED', 'details': `
   
   
   Again, not sure why this is happening. Is it an ECSOperator bug, or is it me 
doing something wrong in the way I have defined the ECSOperator or ECS Task 
Definition?
   
   






[GitHub] [airflow] subkanthi commented on a change in pull request #17068: Influxdb Hook

2021-09-14 Thread GitBox


subkanthi commented on a change in pull request #17068:
URL: https://github.com/apache/airflow/pull/17068#discussion_r708776531



##
File path: tests/providers/influxdb/hooks/test_influxDbHook.py
##
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
   Fixed and added more tests, @eladkal , thanks








[GitHub] [airflow] subkanthi commented on a change in pull request #17068: Influxdb Hook

2021-09-14 Thread GitBox


subkanthi commented on a change in pull request #17068:
URL: https://github.com/apache/airflow/pull/17068#discussion_r708776531



##
File path: tests/providers/influxdb/hooks/test_influxDbHook.py
##
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
   Added unit tests, @eladkal , thanks








[GitHub] [airflow] satyarthn edited a comment on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn edited a comment on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919600135


   Per the Documentation
   
   `If you don't specify a prefix with this option, then the log stream is 
named after the container ID that is assigned by the Docker daemon on the 
container instance. Because it is difficult to trace logs back to the container 
that sent them with just the Docker container ID (which is only available on 
the container instance), we recommend that you specify a prefix with this 
option.`
   
   I see that in my ECS Task definition, I have the correct settings.
   
   https://user-images.githubusercontent.com/85149961/133356787-88977a80-978b-4a99-a00c-17cbc1c0c432.png
   
   
   So shouldn't the log stream be generated with the ECS Task ID and not the 
Docker Runtime Container Id? What could I be missing?
   
   






[GitHub] [airflow] uranusjr edited a comment on pull request #18228: Fix provider test acessing importlib-resources

2021-09-14 Thread GitBox


uranusjr edited a comment on pull request #18228:
URL: https://github.com/apache/airflow/pull/18228#issuecomment-919623649


   I’ve applied the importlib-resources version bump. My change is different 
from #18209 in several ways:
   
   1. I bumped to 5.2 instead of 5.0. This is because the mock behavioural 
change is introduced in 5.2.
   2. I removed the `importlib-resources` entry from `setup.py`. This package 
is already installed unconditionally (in `install_requires`) and does not need 
to be listed in the `devel` extra.
   3. I added the `python_version < '3.7'` marker. We only import 
`importlib_resources` for Python 3.6 (in `provider_manager.py`) so having it 
installed on 3.7+ is useless.






[GitHub] [airflow] uranusjr commented on pull request #18228: Fix provider test acessing importlib-resources

2021-09-14 Thread GitBox


uranusjr commented on pull request #18228:
URL: https://github.com/apache/airflow/pull/18228#issuecomment-919623649


   I’ve applied the importlib-resources version bump. My change is different 
from #18209 in several ways:
   
   1. I bumped to 5.2 instead of 5.0. This is because the mock behavioural 
change is introduced in 5.2.
   2. I removed the `importlib-resources` entry from `setup.py`. This package 
is already installed unconditionally (in `install_requires`) and does not need 
to be listed in the `devel` extra.
   3. I added the `python_version < '3.7'` marker. We only import 
`importlib_resources` for Python 3.6 (in `provider_manager.py`) so having it 
installed on 3.7+ is useless.






[GitHub] [airflow] kaxil commented on pull request #18226: Add start date to trigger_dagrun operator

2021-09-14 Thread GitBox


kaxil commented on pull request #18226:
URL: https://github.com/apache/airflow/pull/18226#issuecomment-919621292


   ```
   === short test summary info ===
   FAILED tests/models/test_cleartasks.py::TestClearTasks::test_clear_task_instances
   FAILED tests/models/test_cleartasks.py::TestClearTasks::test_clear_task_instances_with_task_reschedule
   FAILED tests/models/test_cleartasks.py::TestClearTasks::test_dag_clear - asse...
   = 3 failed, 2445 passed, 61 skipped, 3 xfailed, 4 warnings in 508.01s (0:08:28) =
   ```
   
   tests are failing






[GitHub] [airflow] kaxil opened a new pull request #18259: Add ``triggerer`` to ``./breeze start-airflow`` command

2021-09-14 Thread GitBox


kaxil opened a new pull request #18259:
URL: https://github.com/apache/airflow/pull/18259


   This adds ``airflow triggerer`` command to ``./breeze start-airflow``
   
   
![image](https://user-images.githubusercontent.com/8811558/133353878-cef8c048-bc5b-4576-a2d3-78aa9887d88e.png)
   
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   






[GitHub] [airflow] kaxil commented on pull request #18228: Fix provider test acessing importlib-resources

2021-09-14 Thread GitBox


kaxil commented on pull request #18228:
URL: https://github.com/apache/airflow/pull/18228#issuecomment-919617368


   I have reverted the `importlib-resources` changes - 
https://github.com/apache/airflow/pull/18250. Can you add those changes in this 
PR again so we can test it with the change?






[airflow] branch main updated: Make sure create_user arguments are keyword-ed (#18248)

2021-09-14 Thread uranusjr
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 292751c  Make sure create_user arguments are keyword-ed (#18248)
292751c is described below

commit 292751c2581d65c279886622e07c03819d2c9bda
Author: Tzu-ping Chung 
AuthorDate: Wed Sep 15 08:39:06 2021 +0800

Make sure create_user arguments are keyword-ed (#18248)
---
 tests/test_utils/api_connexion_utils.py |  2 +-
 tests/www/test_security.py  | 32 
 tests/www/views/conftest.py |  2 +-
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/tests/test_utils/api_connexion_utils.py 
b/tests/test_utils/api_connexion_utils.py
index 39a072b..e15d288 100644
--- a/tests/test_utils/api_connexion_utils.py
+++ b/tests/test_utils/api_connexion_utils.py
@@ -18,7 +18,7 @@ from airflow.api_connexion.exceptions import 
EXCEPTIONS_LINK_MAP
 from airflow.www.security import EXISTING_ROLES
 
 
-def create_user(app, username, role_name, email=None, permissions=None):
+def create_user(app, *, username, role_name, email=None, permissions=None):
 appbuilder = app.appbuilder
 
 # Removes user and role so each test has isolated test data.
diff --git a/tests/www/test_security.py b/tests/www/test_security.py
index eab27ba..9d1db20 100644
--- a/tests/www/test_security.py
+++ b/tests/www/test_security.py
@@ -336,8 +336,8 @@ class TestSecurity(unittest.TestCase):
 with self.app.app_context():
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[
 (role_perm, role_vm),
 ],
@@ -355,8 +355,8 @@ class TestSecurity(unittest.TestCase):
 with self.app.app_context():
 user = api_connexion_utils.create_user(
 self.app,
-"current_user_has_permissions",
-"current_user_has_permissions",
+username="current_user_has_permissions",
+role_name="current_user_has_permissions",
 permissions=[("can_some_action", "SomeBaseView")],
 )
 role = user.roles[0]
@@ -379,8 +379,8 @@ class TestSecurity(unittest.TestCase):
 
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[
 (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
 (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
@@ -407,8 +407,8 @@ class TestSecurity(unittest.TestCase):
 
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[
 (permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG),
 ],
@@ -473,8 +473,8 @@ class TestSecurity(unittest.TestCase):
 with self.app.app_context():
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[
 (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
 (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
@@ -512,8 +512,8 @@ class TestSecurity(unittest.TestCase):
 with self.app.app_context():
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[],
 )
 self.expect_user_is_in_role(user, rolename='team-a')
@@ -540,8 +540,8 @@ class TestSecurity(unittest.TestCase):
 with self.app.app_context():
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 permissions=[],
 )
 self.expect_user_is_in_role(user, rolename='team-a')
@@ -692,8 +692,8 @@ class TestSecurity(unittest.TestCase):
 ]
 user = api_connexion_utils.create_user(
 self.app,
-username,
-role_name,
+username=username,
+role_name=role_name,
 )
 self.security_manager.bulk_sync_roles(mock_roles)
 self.security_manager._sync_dag_view_permissions(
diff --git 

[GitHub] [airflow] uranusjr merged pull request #18248: Make sure create_user arguments are keyword-ed

2021-09-14 Thread GitBox


uranusjr merged pull request #18248:
URL: https://github.com/apache/airflow/pull/18248


   






[GitHub] [airflow] github-actions[bot] commented on pull request #18251: Improves installing from sources pages for all components

2021-09-14 Thread GitBox


github-actions[bot] commented on pull request #18251:
URL: https://github.com/apache/airflow/pull/18251#issuecomment-919605754


   The PR is likely ready to be merged. No tests are needed as no important 
environment files, nor python files were modified by it. However, committers 
might decide that full test matrix is needed and add the 'full tests needed' 
label. Then you should rebase it to the latest main or amend the last commit of 
the PR, and push it with --force-with-lease.






[GitHub] [airflow] satyarthn edited a comment on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn edited a comment on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919600135


   Per the Documentation
   
   `If you don't specify a prefix with this option, then the log stream is 
named after the container ID that is assigned by the Docker daemon on the 
container instance. Because it is difficult to trace logs back to the container 
that sent them with just the Docker container ID (which is only available on 
the container instance), we recommend that you specify a prefix with this 
option.`
   
   I see that in my ECS Task definition, I have the correct settings.
   
   https://user-images.githubusercontent.com/85149961/133350773-677b77c5-3214-422a-8e92-47817c5e78f1.png
   
   So shouldn't the log stream be generated with the ECS Task ID and not the 
Docker Runtime Container Id? What could I be missing?
   
   






[GitHub] [airflow] satyarthn commented on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn commented on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919600135


   Ok. I believe the problem I am seeing is not with the ECSOperator, but with 
the AWS's ECS Task Definition UI. When I check the `Auto-configure CloudWatch 
Logs`, it does not persist, and hence the log stream is not getting generated 
as expected.
   
   Per the Documentation
   
   `If you don't specify a prefix with this option, then the log stream is 
named after the container ID that is assigned by the Docker daemon on the 
container instance. Because it is difficult to trace logs back to the container 
that sent them with just the Docker container ID (which is only available on 
the container instance), we recommend that you specify a prefix with this 
option.`






[GitHub] [airflow] kaxil commented on a change in pull request #17121: Deactivating DAGs which have been removed from files

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #17121:
URL: https://github.com/apache/airflow/pull/17121#discussion_r708745692



##
File path: airflow/dag_processing/processor.py
##
@@ -645,3 +648,12 @@ def process_file(
 self.log.exception("Error logging import errors!")
 
 return len(dagbag.dags), len(dagbag.import_errors)
+
+def _deactivate_missing_dags(self, session: Session, dagbag: DagBag, file_path: str) -> None:
+deactivated = (
+session.query(DagModel)
+.filter(DagModel.fileloc == file_path, DagModel.is_active, ~DagModel.dag_id.in_(dagbag.dag_ids))
+.update({DagModel.is_active: False}, synchronize_session="fetch")
+)
+if deactivated:
+self.log.info("Deactivated %i DAGs which are no longer present in %s", deactivated, file_path)

Review comment:
   This one is pending or else looks good and ready to be merged








[GitHub] [airflow] kaxil commented on a change in pull request #17121: Deactivating DAGs which have been removed from files

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #17121:
URL: https://github.com/apache/airflow/pull/17121#discussion_r708745286



##
File path: airflow/models/dag.py
##
@@ -2656,11 +2655,6 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
 if dag_model.fileloc is not None:
 if correct_maybe_zipped(dag_model.fileloc) not in alive_dag_filelocs:
 dag_model.is_active = False
-else:
-# If is_active is set as False and the DAG File still exists
-# Change is_active=True
-if not dag_model.is_active:
-dag_model.is_active = True

Review comment:
   Yup, make sense








[GitHub] [airflow] mik-laj commented on a change in pull request #18257: Ability to access http k8s via multiple hostnames

2021-09-14 Thread GitBox


mik-laj commented on a change in pull request #18257:
URL: https://github.com/apache/airflow/pull/18257#discussion_r708744374



##
File path: chart/values.schema.json
##
@@ -126,10 +126,15 @@
 "default": ""
 },
 "host": {
-"description": "The hostname for the web Ingress.",
+"description": "The hostname for the web Ingress. (Deprecated - renamed to `ingress.web.hosts`)",

Review comment:
   Can you add a warning about deprecated options in NOTES?
   
https://github.com/apache/airflow/blob/c73004d0cd33d76b82b172078f572e8d4eecab39/chart/templates/NOTES.txt#L94








[GitHub] [airflow] satyarthn commented on issue #9486: ECSOperator failed to display logs from Cloudwatch after providing log configurations

2021-09-14 Thread GitBox


satyarthn commented on issue #9486:
URL: https://github.com/apache/airflow/issues/9486#issuecomment-919597341


   Folks,
   
   I went through this entire thread, but am not able to troubleshoot the issue. I 
am having the same issue.
   
   So I can see CloudWatch has my logs
   
   ```
   CloudWatch
   Log groups
   /ecs/airflow-aws-gcp
   ecs/airflow-aws-gcp/ad93daaa94db48f0bd278c780e946f64
   ```
   
   Note the Stream Name is 
`ecs/airflow-aws-gcp/ad93daaa94db48f0bd278c780e946f64`
   
   In my AirFlow ECSOperator Definition, I am providing
   
   ```
   ECS_AWSLOGS_GROUP = '/ecs/airflow-aws-gcp'
   ECS_AWSLOGS_STREAM_PREFIX = 'ecs/airflow-aws-gcp'
   ```
   
   In the AirFlow logs, I see
   
   [2021-09-14 23:46:03,310] {{ecs.py:257}} INFO - ECS Task started: {'tasks': 
[{'attachments': [{'id': '8fdb6419-df5f-4f61-a5d8-9388bb523695', 'type': 
'ElasticNetworkInterface', 'status': 'PRECREATED', ..'containers': 
[{'containerArn': 
'arn:aws:ecs:us-east-1:{ACCOUNT_ID}:container/airflow-ecs/ad93daaa94db48f0bd278c780e946f64/02114c6b-5925-45dd-a595-b9608157bdb6
 
   
   Am I reading it right that,
   
   1. ECS Task id is `8fdb6419-df5f-4f61-a5d8-9388bb523695`
   2. But Log Stream that is generated has the name 
`ecs/airflow-aws-gcp/ad93daaa94db48f0bd278c780e946f64`
   
   Why the discrepancy? I am assuming that the 
`ad93daaa94db48f0bd278c780e946f64` is the container instance id and not really 
the ECS Task Id. Am I configuring my ECS Task definition wrong?
   
   
   
   
   






[GitHub] [airflow] SamWheating commented on a change in pull request #17121: Deactivating DAGs which have been removed from files

2021-09-14 Thread GitBox


SamWheating commented on a change in pull request #17121:
URL: https://github.com/apache/airflow/pull/17121#discussion_r708743677



##
File path: airflow/models/dag.py
##
@@ -2656,11 +2655,6 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
 if dag_model.fileloc is not None:
 if correct_maybe_zipped(dag_model.fileloc) not in alive_dag_filelocs:
 dag_model.is_active = False
-else:
-# If is_active is set as False and the DAG File still exists
-# Change is_active=True
-if not dag_model.is_active:
-dag_model.is_active = True

Review comment:
   > Why this change?
   
   With this code in place, a DAG that has been removed from a file that still 
exists will be continually reactivated, so this branch needs to be removed in 
order to fix this issue. 
   
   I could not find a good explanation for why this was added in the first 
place ([here's the PR which introduced 
it](https://github.com/apache/airflow/pull/5743/files)), so I think it's safe to 
remove. 
   
   > what will re-activate the DAG if it is readded
   
   I believe that the next time a DAG is found in a file, it will be marked as 
`active` when syncing the DAGs from the file processor to the DB.  
   
   
https://github.com/apache/airflow/blob/c73004d0cd33d76b82b172078f572e8d4eecab39/airflow/models/dag.py#L2406
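   To make the two-sided lifecycle concrete, here is a minimal sketch using a hypothetical, simplified `DagModel` stand-in (not Airflow's real ORM model): deactivation only ever clears the flag, and reactivation happens on the sync path when a DAG is actually parsed again.

   ```python
   class DagModel:
       """Minimal stand-in for Airflow's DagModel (hypothetical, for illustration)."""
       def __init__(self, dag_id, fileloc, is_active=True):
           self.dag_id = dag_id
           self.fileloc = fileloc
           self.is_active = is_active

   def deactivate_deleted_dags(alive_dag_filelocs, dag_models):
       # After the fix: only deactivate; never flip is_active back to True here.
       for model in dag_models:
           if model.fileloc is not None and model.fileloc not in alive_dag_filelocs:
               model.is_active = False

   def sync_parsed_dags(parsed_dag_ids, dag_models):
       # Counterpart run when the file processor syncs DAGs it actually parsed:
       # a DAG found in a file is reactivated here, not in the function above.
       for model in dag_models:
           if model.dag_id in parsed_dag_ids:
               model.is_active = True
   ```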








[GitHub] [airflow] SamWheating commented on a change in pull request #17121: Deactivating DAGs which have been removed from files

2021-09-14 Thread GitBox


SamWheating commented on a change in pull request #17121:
URL: https://github.com/apache/airflow/pull/17121#discussion_r708743677



##
File path: airflow/models/dag.py
##
@@ -2656,11 +2655,6 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
             if dag_model.fileloc is not None:
                 if correct_maybe_zipped(dag_model.fileloc) not in alive_dag_filelocs:
                     dag_model.is_active = False
-                else:
-                    # If is_active is set as False and the DAG File still exists
-                    # Change is_active=True
-                    if not dag_model.is_active:
-                        dag_model.is_active = True

Review comment:
   > Why this change?
   
   With this code in place, a DAG that has been removed from a file that still 
exists will be continually reactivated, so this branch needs to be removed in 
order to fix this issue. 
   
   I could not find a good explanation for why this was added in the first 
place ([here's the PR which introduced 
it](https://github.com/apache/airflow/pull/5743/files)), so I think it's safe to 
remove. 
   
   > what will re-activate the DAG if it is readded
   
   I believe that the next time a DAG is found in a file, it will be marked as 
`active`. 
   
   
https://github.com/apache/airflow/blob/c73004d0cd33d76b82b172078f572e8d4eecab39/airflow/models/dag.py#L2406








[GitHub] [airflow] github-actions[bot] commented on pull request #15277: Remove support for FAB `APP_THEME`

2021-09-14 Thread GitBox


github-actions[bot] commented on pull request #15277:
URL: https://github.com/apache/airflow/pull/15277#issuecomment-919595241


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.






[GitHub] [airflow] github-actions[bot] commented on pull request #8082: [AIRFLOW-4355] removed task should not lead to dagrun success

2021-09-14 Thread GitBox


github-actions[bot] commented on pull request #8082:
URL: https://github.com/apache/airflow/pull/8082#issuecomment-919595353


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.






[GitHub] [airflow] github-actions[bot] closed issue #13925: Duplicate entry with MySQL backend (AriFlow 2.0.0)

2021-09-14 Thread GitBox


github-actions[bot] closed issue #13925:
URL: https://github.com/apache/airflow/issues/13925


   






[GitHub] [airflow] github-actions[bot] commented on issue #13925: Duplicate entry with MySQL backend (AriFlow 2.0.0)

2021-09-14 Thread GitBox


github-actions[bot] commented on issue #13925:
URL: https://github.com/apache/airflow/issues/13925#issuecomment-919595272


   This issue has been closed because it has not received response from the 
issue author.






[GitHub] [airflow] rab170 commented on pull request #18117: Update wasb.rst

2021-09-14 Thread GitBox


rab170 commented on pull request #18117:
URL: https://github.com/apache/airflow/pull/18117#issuecomment-919592896










[GitHub] [airflow] kaxil commented on a change in pull request #17121: Deactivating DAGs which have been removed from files

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #17121:
URL: https://github.com/apache/airflow/pull/17121#discussion_r708739971



##
File path: airflow/models/dag.py
##
@@ -2656,11 +2655,6 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
             if dag_model.fileloc is not None:
                 if correct_maybe_zipped(dag_model.fileloc) not in alive_dag_filelocs:
                     dag_model.is_active = False
-                else:
-                    # If is_active is set as False and the DAG File still exists
-                    # Change is_active=True
-                    if not dag_model.is_active:
-                        dag_model.is_active = True

Review comment:
   Why this change? What will re-activate the DAG if it is re-added?








[GitHub] [airflow] rab170 edited a comment on pull request #18117: Update wasb.rst

2021-09-14 Thread GitBox


rab170 edited a comment on pull request #18117:
URL: https://github.com/apache/airflow/pull/18117#issuecomment-919591212


   thanks :) certainly a small contribution! haha 
   
   next one is guaranteed to be strictly greater than or equal in size :D 
   
   
   






[GitHub] [airflow] rab170 commented on pull request #18117: Update wasb.rst

2021-09-14 Thread GitBox


rab170 commented on pull request #18117:
URL: https://github.com/apache/airflow/pull/18117#issuecomment-919591212


   thanks :) 






[GitHub] [airflow] kaxil commented on pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


kaxil commented on pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#issuecomment-919587937


   cc @ jwitz






[GitHub] [airflow] kaxil commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708734047



##
File path: docs/apache-airflow/faq.rst
##
@@ -216,20 +216,35 @@ actually start. If this were not the case, the backfill just would not start.
 What does ``execution_date`` mean?
 ----------------------------------
 
-Airflow was developed as a solution for ETL needs. In the ETL world, you typically summarize data. So, if you want to
-summarize data for 2016-02-19, You would do it at 2016-02-20 midnight UTC, which would be right after all data for
-2016-02-19 becomes available.
-
-This datetime value is available to you as :ref:`Template variables` with various formats in Jinja templated
-fields. They are also included in the context dictionary given to an Operator's execute function.
+*Execution date* or ``execution_date`` is a historical name for what is called a
+*logical date*, and also usually the start of the data interval represented by a
+DAG run.
+
+Airflow was developed as a solution for ETL needs. In the ETL world, you
+typically summarize data. So, if you want to summarize data for 2016-02-19, You
+would do it at 2016-02-20 midnight UTC, which would be right after all data for
+2016-02-19 becomes available. This interval between midnights of 2016-02-19 and
+2016-02-20 is called the *data interval*, and since the it represents data in
+the date of 2016-02-19, this date is thus called the run's *logical date*, or
+the date that this DAG run is executed for, thus *execution date*.

Review comment:
   ```suggestion
   typically summarize data. So, if you want to summarize data for ``2016-02-19``, You
   would do it at ``2016-02-20`` midnight ``UTC``, which would be right after all data for
   ``2016-02-19`` becomes available. This interval between midnights of ``2016-02-19`` and
   ``2016-02-20`` is called the *data interval*, and since it represents data in
   the date of ``2016-02-19``, this date is thus called the run's *logical date*, or
   the date that this DAG run is executed for, thus *execution date*.
   ```








[GitHub] [airflow] kaxil commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708733553



##
File path: docs/apache-airflow/dag-run.rst
##
@@ -54,17 +54,36 @@ Cron Presets
 Your DAG will be instantiated for each schedule along with a corresponding
 DAG Run entry in the database backend.
 
-.. note::
 
-    If you run a DAG on a schedule_interval of one day, the run stamped 2020-01-01
-    will be triggered soon after 2020-01-01T23:59. In other words, the job instance is
-    started once the period it covers has ended.  The ``execution_date`` available in the context
-    will also be 2020-01-01.
+.. _data-interval:
 
-    The first DAG Run is created based on the minimum ``start_date`` for the tasks in your DAG.
-    Subsequent DAG Runs are created by the scheduler process, based on your DAG’s ``schedule_interval``,
-    sequentially. If your start_date is 2020-01-01 and schedule_interval is @daily, the first run
-    will be created on 2020-01-02 i.e., after your start date has passed.
+Data Interval
+-------------
 
+Each DAG run in Airflow has an assigned "data interval" that represents the time
+range it operates in. For a DAG scheduled with ``@daily``, for example, each of
+its data interval would start at midnight of each day and end at midnight of the
+next day.
+
+A DAG run is usually scheduled *after* its associated data interval has ended,
+to ensure the run is able to collect all the data within the time period. In
+other words, a run covering the data period of 2020-01-01 generally does not
+start to run until 2020-01-01 has ended, i.e. after 2020-01-02 00:00:00.
+
+All dates in Airflow are tied to the data interval concept in some way. The
+"logical date" (also called ``execution_date`` in Airflow versions prior to 2.2)
+of a DAG run, for example, denotes the start of the data interval, not when the
+DAG is actually executed.
+
+Similarly, since the ``start_date`` argument for the DAG and its tasks points to
+the same logical date, it marks the start of *the DAG's fist data interval*, not
+when tasks in the DAG will start running. In other words, a DAG run will only be
+scheduled one interval after ``start_date``.
+
+.. tip::
+
+    If ``schedule_interval`` is not enough to express your DAG's schedule,
+    logical date, or data interval, see :doc:`Customizing imetables `.

Review comment:
   ```suggestion
    logical date, or data interval, see :doc:`Customizing timetables `.
   ```
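   The data-interval arithmetic described in the diff above can be illustrated with a small sketch (plain `datetime`, not Airflow's actual timetable API):

   ```python
   from datetime import datetime, timedelta

   def daily_data_interval(logical_date: datetime):
       """For an @daily schedule, the logical date marks the start of the data
       interval; the interval ends one day later, and the run is only scheduled
       after that end."""
       start = logical_date
       end = start + timedelta(days=1)
       return start, end

   start, end = daily_data_interval(datetime(2020, 1, 1))
   # The run "for" 2020-01-01 is scheduled no earlier than `end`,
   # i.e. 2020-01-02 00:00:00.
   ```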








[GitHub] [airflow] tikikun edited a comment on issue #15635: ARM64 support in docker images

2021-09-14 Thread GitBox


tikikun edited a comment on issue #15635:
URL: https://github.com/apache/airflow/issues/15635#issuecomment-919575669


   Hi everyone, does Airflow support ARM64 now?
   






[GitHub] [airflow] tikikun commented on issue #15635: ARM64 support in docker images

2021-09-14 Thread GitBox


tikikun commented on issue #15635:
URL: https://github.com/apache/airflow/issues/15635#issuecomment-919575669


   Hi everyone, does Docker support ARM64 now?
   






[GitHub] [airflow] mik-laj opened a new pull request #18258: Improve coverage for airflow.security.kerberos module

2021-09-14 Thread GitBox


mik-laj opened a new pull request #18258:
URL: https://github.com/apache/airflow/pull/18258


   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   






[GitHub] [airflow] github-actions[bot] commented on pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


github-actions[bot] commented on pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#issuecomment-919505236


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest main at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] jedcunningham commented on a change in pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#discussion_r708623755



##
File path: airflow/dag_processing/manager.py
##
@@ -660,13 +661,29 @@ def _refresh_dag_dir(self):
                 self.clear_nonexistent_import_errors()
             except Exception:
                 self.log.exception("Error removing old import errors")
+            # Check if file path is a zipfile and get the full path of the python file.

Review comment:
   ```suggestion
   
            # Check if file path is a zipfile and get the full path of the python file.
   ```
   nit








[GitHub] [airflow] fredthomsen opened a new pull request #18257: Ability to access http k8s via multiple hostnames

2021-09-14 Thread GitBox


fredthomsen opened a new pull request #18257:
URL: https://github.com/apache/airflow/pull/18257


   Template the airflow ui and flower ingress in the helm chart to enable them
   to be accessed via multiple hostnames.  Also quote hostnames so that
   wildcard hostnames can be used.
   
   The old field specifying a single hostname has been marked as deprecated, 
but can still be used, maintaining backwards compatibility.  This does not allow 
different paths to be associated with different hosts; I was concerned about 
increasing the complexity too much here while trying to keep backwards 
compatibility.
   
   closes: #18216.
   
   
   
   






[GitHub] [airflow] eladkal commented on issue #15951: Mark tasks as critical

2021-09-14 Thread GitBox


eladkal commented on issue #15951:
URL: https://github.com/apache/airflow/issues/15951#issuecomment-919478949


   Not ideal, but you can solve it by placing a PythonOperator after the terminate 
task.
   The operator can look up the main task's status and set its own state to 
fail/success accordingly. 
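   A minimal sketch of that workaround (the task id `main_task` is hypothetical; `dag_run.get_task_instance` from the task context is used to read the monitored task's state, and the decision rule is kept as a separate pure function):

   ```python
   def outcome_for(main_state):
       """Decide this watcher task's outcome from the monitored task's state."""
       return "success" if main_state == "success" else "failed"

   def watch_main_task(**context):
       # Runs as a PythonOperator callable placed downstream of the terminate
       # task; assumes a task with task_id="main_task" exists in the same DAG run.
       ti = context["dag_run"].get_task_instance("main_task")
       if outcome_for(ti.state) == "failed":
           # Raising marks this watcher task (and hence the run) as failed.
           raise RuntimeError(f"main_task ended in state {ti.state!r}")
   ```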






[GitHub] [airflow] eladkal commented on issue #8703: Support for set in XCom serialization

2021-09-14 Thread GitBox


eladkal commented on issue #8703:
URL: https://github.com/apache/airflow/issues/8703#issuecomment-919473918


   @kaxil https://github.com/apache/airflow/pull/16395 was on hold for a fix 
here.
   Should we also close the linked issue 
https://github.com/apache/airflow/issues/16386 ?






[GitHub] [airflow] jaketf closed pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf closed pull request #18222:
URL: https://github.com/apache/airflow/pull/18222


   






[GitHub] [airflow] eladkal closed issue #17072: MSSQL on CI has been consistently failing

2021-09-14 Thread GitBox


eladkal closed issue #17072:
URL: https://github.com/apache/airflow/issues/17072


   






[GitHub] [airflow] eladkal commented on issue #17072: MSSQL on CI has been consistently failing

2021-09-14 Thread GitBox


eladkal commented on issue #17072:
URL: https://github.com/apache/airflow/issues/17072#issuecomment-919470650


   solved by https://github.com/apache/airflow/pull/17078






[GitHub] [airflow] andrewgodwin opened a new pull request #18254: Add metrics docs for triggerer metrics

2021-09-14 Thread GitBox


andrewgodwin opened a new pull request #18254:
URL: https://github.com/apache/airflow/pull/18254


   Follow-on from https://github.com/apache/airflow/pull/18214
   






[GitHub] [airflow] jedcunningham commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708574349



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
         return perms
 
     def current_user_has_permissions(self) -> bool:
-        for role in self.get_user_roles():
-            if role.permissions:
-                return True
-        return False
+        return len(self.get_current_user_permissions()) > 0

Review comment:
   Yeah, having an ABC could make this a lot cleaner. Before you do the 
work, though, it'd be worth pinging @jhtimmins directly to coordinate.








[GitHub] [airflow] Jorricks commented on pull request #17207: Fix external_executor_id not being set for manually run jobs.

2021-09-14 Thread GitBox


Jorricks commented on pull request #17207:
URL: https://github.com/apache/airflow/pull/17207#issuecomment-919455634


   Rebased on latest main again to see if that makes it better.






[GitHub] [airflow] Jorricks edited a comment on pull request #16634: Require can_edit on DAG privileges to modify TaskInstances and DagRuns

2021-09-14 Thread GitBox


Jorricks edited a comment on pull request #16634:
URL: https://github.com/apache/airflow/pull/16634#issuecomment-919453822


   The `create_task_instance_of_operator` decorator in `conftest.py` made my 
life a lot easier :v: Tests should be fixed now :)






[GitHub] [airflow] Jorricks commented on pull request #16634: Require can_edit on DAG privileges to modify TaskInstances and DagRuns

2021-09-14 Thread GitBox


Jorricks commented on pull request #16634:
URL: https://github.com/apache/airflow/pull/16634#issuecomment-919453822


   The `create_task_instance_of_operator` made my life a lot easier :v: Tests 
should be fixed now :)






[GitHub] [airflow] alexInhert opened a new issue #18253: Allow to write to task log from task policy

2021-09-14 Thread GitBox


alexInhert opened a new issue #18253:
URL: https://github.com/apache/airflow/issues/18253


   ### Description
   
   Airflow offers task policies.
   The problem is that while a task policy can alter the task and replace 
parameters, there is no way to indicate to the user that parameters were changed.
   
   
   ### Use case/motivation
   
   Consider the following:
   ```
   def task_policy(task: BaseOperator):
       if task.timeout > timedelta(hours=10):
           task.timeout = timedelta(hours=10)
   ```
   
   This replaces the task's timeout if the criterion is met. However, the problem 
is that if you set a timeout of 20 hours in your code, that is what you will see 
in the task instance details: you won't see 10, you will see 20.
   Also, if I do:
   ```
   def task_policy(task: BaseOperator):
       if task.timeout > timedelta(hours=10):
           task.timeout = timedelta(hours=10)
           self.log("task policy changed timeout to 10 hours")
   ```
   
   This log will be printed in the web server log! It will not be printed in 
the task log.
   
   Result: the user who set a timeout of 20 hours, whose task timed out after 10, 
has no idea why it happened. They can't see it in the logs, they can't see it in 
the task details; they are completely clueless.
   
   
   Desired solution: allow writing to the task log from the task policy and, 
if possible, indicate in the task instance that a policy was applied to the task 
and changed values.
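   Until something like this exists, a partial workaround could be to record the change on the task itself, since `doc_md` is rendered on the task's detail page. This is a hypothetical sketch, not an endorsed Airflow pattern; it uses `execution_timeout` in place of the `timeout` attribute above:

   ```python
   from datetime import timedelta

   MAX_TIMEOUT = timedelta(hours=10)

   def task_policy(task):
       # Cap the timeout and leave a visible trace of the change on the task:
       # doc_md shows up in the task's detail view in the UI.
       if task.execution_timeout and task.execution_timeout > MAX_TIMEOUT:
           original = task.execution_timeout
           task.execution_timeout = MAX_TIMEOUT
           note = f"Cluster policy capped execution_timeout from {original} to {MAX_TIMEOUT}."
           task.doc_md = f"{task.doc_md or ''}\n\n{note}".strip()
   ```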
   
   ### Related issues
   
   no
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   






[GitHub] [airflow] potiuk commented on issue #18113: callback functions not called when a dag run is marked success or failure

2021-09-14 Thread GitBox


potiuk commented on issue #18113:
URL: https://github.com/apache/airflow/issues/18113#issuecomment-919450649


   > One thing I do want to look at soon, however, is extending the Executor 
interface to allow executors to run more than just Tasks - in essence, opening 
up the ability to submit other workloads (like callback functions) to run on 
executors, and hopefully pave the way for moving these and a lot of other 
things out of the scheduler/webserver.
   
   Sounds great,






[GitHub] [airflow] potiuk commented on pull request #18251: Improves installing from sources pages for all components

2021-09-14 Thread GitBox


potiuk commented on pull request #18251:
URL: https://github.com/apache/airflow/pull/18251#issuecomment-919442578


   ![Screenshot 2021-09-14 21 14 
33](https://user-images.githubusercontent.com/595491/133319873-6fe3b27e-ab12-4ff9-8d16-666f7edcb229.png)
   ![Screenshot 2021-09-14 21 09 
38](https://user-images.githubusercontent.com/595491/133319879-b68a4c37-5e53-4114-bfdd-cd48ea619d64.png)
   






[GitHub] [airflow] ephraimbuddy commented on a change in pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


ephraimbuddy commented on a change in pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#discussion_r708558352



##
File path: airflow/models/dag.py
##
@@ -2814,10 +2815,24 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
         """
         log.debug("Deactivating DAGs (for which DAG files are deleted) from %s table ", cls.__tablename__)
 
+        dag_filelocs = []
+        for fileloc in alive_dag_filelocs:
+            if zipfile.is_zipfile(fileloc):
+                with zipfile.ZipFile(fileloc) as z:
+                    dag_filelocs.extend(

Review comment:
   Mistake adding it here, it's the `DagCode.remove_deleted_dags` that 
needed it too. Will move the logic to `_refresh_dag_dir`. Thanks
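   For reference, the zip-expansion being discussed can be sketched stand-alone like this (a rough sketch, not Airflow's actual implementation; the `.py` filter and the `<zip_path>/<member>` path convention are assumptions based on the diff above):

   ```python
   import os
   import zipfile


   def expand_zip_filelocs(alive_dag_filelocs):
       """Expand zip archives in a fileloc list into per-member pseudo-paths.

       For a zipped DAG bundle, each serialized DAG's fileloc looks like
       "<zip_path>/<member>.py", so those paths must also be treated as alive.
       """
       dag_filelocs = []
       for fileloc in alive_dag_filelocs:
           if zipfile.is_zipfile(fileloc):
               with zipfile.ZipFile(fileloc) as z:
                   dag_filelocs.extend(
                       os.path.join(fileloc, info.filename)
                       for info in z.infolist()
                       if info.filename.endswith(".py")
                   )
           else:
               dag_filelocs.append(fileloc)
       return dag_filelocs
   ```

   Doing this once in `_refresh_dag_dir` and passing the expanded list to both consumers avoids reading each archive twice.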








[GitHub] [airflow] jaketf commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708557702



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   good point, this logic resonates with me. It's smelly to make 
implementation choices that cause inefficiency for this default 
`AirflowSecurityManager` based on "what if people want to subclass this". The 
purpose of this class is to be the default `AirflowSecurityManager` so 
sub-classers have to assume the risk / responsibility that methods like this 
might crop up over time and need to be overridden.
   
   If the goal is to improve the security manager subclassing experience, it'd 
probably be better addressed by pulling out a 
`BaseAirflowSecurityManager(abc.ABC)` that clearly spells out the contract of 
which abstract methods should be overridden to make this work correctly with 
the rest of Airflow, and making this `AirflowSecurityManager` a subclass of that.
   
   That's not really the topic of this PR, so I will close it and can open a new 
PR implementing a `BaseAirflowSecurityManager` if you think that's 
worthwhile.
   
   








[GitHub] [airflow] andrewgodwin commented on issue #18113: callback functions not called when a dag run is marked success or failure

2021-09-14 Thread GitBox


andrewgodwin commented on issue #18113:
URL: https://github.com/apache/airflow/issues/18113#issuecomment-919439356


   Unfortunately the triggerer mechanism doesn't solve this - triggers can run 
separately, but even to raise a task failure, deferred tasks have to be sent 
back to a worker to execute, so they're still bound by the same sort of rules.
   
   One thing I do want to look at soon, however, is extending the Executor 
interface to allow executors to run more than just Tasks - in essence, opening 
up the ability to submit other workloads (like callback functions) to run on 
executors, and hopefully pave the way for moving these and a lot of other 
things out of the scheduler/webserver.






[GitHub] [airflow] potiuk opened a new pull request #18251: Improves installing from sources pages for all components

2021-09-14 Thread GitBox


potiuk opened a new pull request #18251:
URL: https://github.com/apache/airflow/pull/18251


   * Shorter menu sections for installation page
   * Added "installing from sources" for Helm Chart
   * Added Providers summary page for all provider packages
   * Added scripts to verify PyPI packages with gpg/sha
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   






[GitHub] [airflow] kaxil commented on a change in pull request #18226: Add start date to trigger_dagrun operator

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #18226:
URL: https://github.com/apache/airflow/pull/18226#discussion_r708553715



##
File path: tests/operators/test_trigger_dagrun.py
##
@@ -255,3 +255,33 @@ def test_trigger_dagrun_with_wait_for_completion_true_fail(self):
 )
 with pytest.raises(AirflowException):
 task.run(start_date=execution_date, end_date=execution_date)
+
+def test_trigger_dagrun_triggering_itself(self):
+"""Test TriggerDagRunOperator that triggers itself"""
+execution_date = DEFAULT_DATE
+task = TriggerDagRunOperator(
+task_id="test_task",
+trigger_dag_id=self.dag.dag_id,
+allowed_states=[State.RUNNING, State.SUCCESS],
+dag=self.dag,
+)
+task.run(start_date=execution_date, end_date=execution_date)
+
+with create_session() as session:
+dagruns = session.query(DagRun).filter(DagRun.dag_id == self.dag.dag_id).all()
+assert len(dagruns) == 2
+assert isinstance(dagruns[1].start_date, datetime)
+
+def test_trigger_dagrun_triggering_itself_with_execution_date(self):
+"""Test TriggerDagRunOperator that triggers itself with execution date, fails with DagRunAlreadyExists"""
+execution_date = DEFAULT_DATE
+task = TriggerDagRunOperator(
+task_id="test_task",
+trigger_dag_id=self.dag.dag_id,
+execution_date=execution_date,
+allowed_states=[State.RUNNING, State.SUCCESS],
+dag=self.dag,
+)
+with pytest.raises(DagRunAlreadyExists):
+task.run(start_date=execution_date, end_date=execution_date)
+

Review comment:
   ```suggestion
   ```








[GitHub] [airflow] kaxil commented on a change in pull request #18226: Add start date to trigger_dagrun operator

2021-09-14 Thread GitBox


kaxil commented on a change in pull request #18226:
URL: https://github.com/apache/airflow/pull/18226#discussion_r708553250



##
File path: tests/operators/test_trigger_dagrun.py
##
@@ -255,3 +255,33 @@ def test_trigger_dagrun_with_wait_for_completion_true_fail(self):
 )
 with pytest.raises(AirflowException):
 task.run(start_date=execution_date, end_date=execution_date)
+

Review comment:
   ```suggestion
   ```








[airflow] branch main updated (778be79 -> c73004d)

2021-09-14 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 778be79  Fix example dag of PostgresOperator (#18236)
 add c73004d  Revert Changes to ``importlib-resources`` (#18250)

No new revisions were added by this update.

Summary of changes:
 setup.cfg| 2 +-
 setup.py | 2 +-
 tests/core/test_providers_manager.py | 4 +---
 3 files changed, 3 insertions(+), 5 deletions(-)


[airflow] branch constraints-main updated: Revert "Bump importlib-resources"

2021-09-14 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new cdd43c5  Revert "Bump importlib-resources"
cdd43c5 is described below

commit cdd43c5fc2ed03a8caf8002c8cbc3cc3ba840cf0
Author: Kaxil Naik 
AuthorDate: Tue Sep 14 20:02:38 2021 +0100

Revert "Bump importlib-resources"

This reverts commit 0ee0232369a2aa70f09678f8f2ba7e40fbcc1948.
---
 constraints-3.6.txt  | 2 +-
 constraints-3.7.txt  | 2 +-
 constraints-3.8.txt  | 2 +-
 constraints-3.9.txt  | 2 +-
 constraints-no-providers-3.6.txt | 2 +-
 constraints-no-providers-3.7.txt | 2 +-
 constraints-no-providers-3.8.txt | 2 +-
 constraints-no-providers-3.9.txt | 2 +-
 constraints-source-providers-3.6.txt | 2 +-
 constraints-source-providers-3.7.txt | 2 +-
 constraints-source-providers-3.8.txt | 2 +-
 constraints-source-providers-3.9.txt | 2 +-
 12 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/constraints-3.6.txt b/constraints-3.6.txt
index a084bb9..f74f6fe 100644
--- a/constraints-3.6.txt
+++ b/constraints-3.6.txt
@@ -298,7 +298,7 @@ ijson==3.1.4
 imagesize==1.2.0
 immutables==0.16
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 358fd9c..9b4c221 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -293,7 +293,7 @@ idna==3.2
 ijson==3.1.4
 imagesize==1.2.0
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-3.8.txt b/constraints-3.8.txt
index dac652b..412d235 100644
--- a/constraints-3.8.txt
+++ b/constraints-3.8.txt
@@ -292,7 +292,7 @@ idna==3.2
 ijson==3.1.4
 imagesize==1.2.0
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-3.9.txt b/constraints-3.9.txt
index b4b729b..77f8864 100644
--- a/constraints-3.9.txt
+++ b/constraints-3.9.txt
@@ -289,7 +289,7 @@ idna==3.2
 ijson==3.1.4
 imagesize==1.2.0
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-no-providers-3.6.txt b/constraints-no-providers-3.6.txt
index 17fb385..4fc37cc 100644
--- a/constraints-no-providers-3.6.txt
+++ b/constraints-no-providers-3.6.txt
@@ -88,7 +88,7 @@ humanize==3.11.0
 idna==3.2
 immutables==0.16
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iso8601==0.1.16
 isodate==0.6.0
diff --git a/constraints-no-providers-3.7.txt b/constraints-no-providers-3.7.txt
index 8402d82..8b03131 100644
--- a/constraints-no-providers-3.7.txt
+++ b/constraints-no-providers-3.7.txt
@@ -85,7 +85,7 @@ httpx==0.19.0
 humanize==3.11.0
 idna==3.2
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iso8601==0.1.16
 isodate==0.6.0
diff --git a/constraints-no-providers-3.8.txt b/constraints-no-providers-3.8.txt
index a90b4c7..4571b76 100644
--- a/constraints-no-providers-3.8.txt
+++ b/constraints-no-providers-3.8.txt
@@ -84,7 +84,7 @@ httpx==0.19.0
 humanize==3.11.0
 idna==3.2
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iso8601==0.1.16
 isodate==0.6.0
diff --git a/constraints-no-providers-3.9.txt b/constraints-no-providers-3.9.txt
index 9d467a8..f6c2be4 100644
--- a/constraints-no-providers-3.9.txt
+++ b/constraints-no-providers-3.9.txt
@@ -83,7 +83,7 @@ httpcore==0.13.6
 httpx==0.19.0
 humanize==3.11.0
 idna==3.2
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iso8601==0.1.16
 isodate==0.6.0
diff --git a/constraints-source-providers-3.6.txt 
b/constraints-source-providers-3.6.txt
index b0d3569..0669e75 100644
--- a/constraints-source-providers-3.6.txt
+++ b/constraints-source-providers-3.6.txt
@@ -228,7 +228,7 @@ ijson==3.1.4
 imagesize==1.2.0
 immutables==0.16
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-source-providers-3.7.txt 
b/constraints-source-providers-3.7.txt
index 5545db7..4719a44 100644
--- a/constraints-source-providers-3.7.txt
+++ b/constraints-source-providers-3.7.txt
@@ -223,7 +223,7 @@ idna==3.2
 ijson==3.1.4
 imagesize==1.2.0
 importlib-metadata==4.8.1
-importlib-resources==5.2.2
+importlib-resources==1.5.0
 inflection==0.5.1
 iniconfig==1.1.1
 ipdb==0.13.9
diff --git a/constraints-source-providers-3.8.txt 
b/constraints-source-providers-3.8.txt
index 6930c0d..a95d65e 100644
--- 

[GitHub] [airflow] kaxil merged pull request #18250: Revert Changes to ``importlib-resources``

2021-09-14 Thread GitBox


kaxil merged pull request #18250:
URL: https://github.com/apache/airflow/pull/18250


   






[GitHub] [airflow] github-actions[bot] commented on pull request #18250: Revert Changes to ``importlib-resources``

2021-09-14 Thread GitBox


github-actions[bot] commented on pull request #18250:
URL: https://github.com/apache/airflow/pull/18250#issuecomment-919428303


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest main at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.






[GitHub] [airflow] jedcunningham commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708543688



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   It seems like an odd fallback to me as it doesn't make sense in the 
context of `AirflowSecurityManager`.
   
   I'm not sure where you draw that line. E.g. should `get_accessible_dags` 
also have a fallback to support subclasses trying to remove the need for roles? 
What if someone else tried to add another layer instead of removing one?
   
   My thoughts are: If you are redefining where permissions come from (not 
roles) you should be prepared to implement all of the permissions based 
methods. I'm not saying changes to make it more subclass-friendly shouldn't 
happen, however if you need a comment explaining why a fallback is needed that 
could only happen with a subclass and removal of a core assumption, idk.








[GitHub] [airflow] kaxil opened a new pull request #18250: Revert Changes to ``importlib-resources``

2021-09-14 Thread GitBox


kaxil opened a new pull request #18250:
URL: https://github.com/apache/airflow/pull/18250


   Revert changes in #18209 and #18215 as it is causing issues
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   






[GitHub] [airflow] jedcunningham commented on pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jedcunningham commented on pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#issuecomment-919408132


   > it feels misleading for that 403 page to say "user has no roles and/**or** 
permissions" if we really are only checking roles.
   
   It's actually checking both given the assumption of user -> roles -> 
permissions, so it seems like decent language to me.  However I'm certainly 
open to alternate wording here, maybe "user has no permissions"?






[GitHub] [airflow] john-jac commented on pull request #18211: Support all Unix wildcards in S3KeySensor

2021-09-14 Thread GitBox


john-jac commented on pull request #18211:
URL: https://github.com/apache/airflow/pull/18211#issuecomment-919400424


   > Can you add tests to avoid regression?
   
   Added tests for both changes.  Should have thought of that myself.  Thanks 
Kamil!
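   For context, the full Unix wildcard set discussed here (`*`, `?`, `[seq]`, `[!seq]`) is what Python's stdlib `fnmatch` implements. A minimal stand-alone sketch (the key names are made up for illustration, not from the PR):

   ```python
   from fnmatch import fnmatch


   def matching_keys(keys, pattern):
       """Filter key names with Unix wildcards: *, ?, [seq], [!seq]."""
       return [k for k in keys if fnmatch(k, pattern)]


   # hypothetical S3 key names, purely for illustration
   keys = [
       "incoming/2021-09-14/report.csv",
       "incoming/2021-09-14/report.tmp",
       "incoming/2021-09-15/report.csv",
   ]

   print(matching_keys(keys, "incoming/*/report.csv"))
   ```

   Note that unlike shell globbing, `*` in `fnmatch` also crosses `/` boundaries, which is usually the desired behavior for flat S3 key names.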






[GitHub] [airflow] jhtimmins commented on pull request #16634: Require can_edit on DAG privileges to modify TaskInstances and DagRuns

2021-09-14 Thread GitBox


jhtimmins commented on pull request #16634:
URL: https://github.com/apache/airflow/pull/16634#issuecomment-919399312


   @Jorricks sorry this got lost in the inbox. I'll take a look today






[GitHub] [airflow] jaketf commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708483857



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   @uranusjr thanks for that feedback on blowing up DB queries. Would this 
be an acceptable workaround?
   Look for roles first (existing behavior), then look for permissions only if 
necessary?
   ```suggestion
   for role in self.get_user_roles():
   if role.permissions:
   return True
   return len(self.get_current_user_permissions()) > 0
   ```
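   The proposed short-circuit can be illustrated with stand-in objects (these toy classes are not Airflow's or FAB's real models; the counter just makes the cost of the fallback visible):

   ```python
   class Role:
       """Stand-in for FAB's Role model; illustrative only."""

       def __init__(self, permissions):
           self.permissions = permissions


   class SecurityManagerSketch:
       """Toy model of the proposed check, not Airflow's real class."""

       def __init__(self, roles, direct_perms=()):
           self._roles = list(roles)
           self._direct_perms = list(direct_perms)
           self.perm_queries = 0  # counts the "expensive" permission lookups

       def get_user_roles(self):
           return self._roles

       def get_current_user_permissions(self):
           # stands in for the costly DB round-trip discussed above
           self.perm_queries += 1
           perms = {p for role in self._roles for p in role.permissions}
           return perms.union(self._direct_perms)

       def current_user_has_permissions(self):
           # cheap path: any role with permissions short-circuits ...
           for role in self.get_user_roles():
               if role.permissions:
                   return True
           # ... and the full lookup only runs when roles say "no",
           # which keeps role-less subclasses working
           return len(self.get_current_user_permissions()) > 0
   ```

   A user with a permissioned role never pays for the full lookup; only the role-less edge case does.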








[GitHub] [airflow] jaketf commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708488295



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   Alternatively (pardon my ignorance), is there a way for us to cache the 
user permissions for a session so we don't have to make those queries twice? 
It seems like immediately after this we'll have to get user permissions anyway.
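   One stdlib way to memoize such a lookup for the lifetime of a single object (e.g. one request context) is a per-instance `lru_cache`. A toy sketch with hypothetical names, not Airflow's actual session handling:

   ```python
   from functools import lru_cache


   class PermissionsCacheSketch:
       """Toy sketch of memoizing a permission lookup per object lifetime."""

       def __init__(self):
           self.db_hits = 0
           # a per-instance cache: each new object (e.g. each request)
           # gets a fresh cache, so stale permissions don't leak across users
           self._cached_fetch = lru_cache(maxsize=1)(self._fetch_permissions)

       def _fetch_permissions(self):
           self.db_hits += 1  # stands in for the real DB query
           return frozenset({("can_read", "Dag")})

       def get_current_user_permissions(self):
           return self._cached_fetch()

       def current_user_has_permissions(self):
           return len(self.get_current_user_permissions()) > 0
   ```

   Binding the cache in `__init__` rather than decorating the method avoids sharing one cache across all instances.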








[GitHub] [airflow] yzhanggithub commented on issue #15978: DAG getting stuck in "running" state indefinitely

2021-09-14 Thread GitBox


yzhanggithub commented on issue #15978:
URL: https://github.com/apache/airflow/issues/15978#issuecomment-919389023


   I had a similar issue; in my case, it worked fine after I fixed this error 
(from airflow.log):
   airflow command error: argument GROUP_OR_COMMAND: `airflow worker` command, 
has been removed, please use `airflow celery worker`, see help above.
   






[GitHub] [airflow] blcksrx commented on pull request #17974: WIP: Implement before stm and after stm in DBApiHooks

2021-09-14 Thread GitBox


blcksrx commented on pull request #17974:
URL: https://github.com/apache/airflow/pull/17974#issuecomment-919387845


   @uranusjr I implemented some methods on MySQLHook based on what you said. Can 
you verify that it's the approach you described? After that I can continue with 
the other hooks.






[GitHub] [airflow] jaketf edited a comment on pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf edited a comment on pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#issuecomment-919365903


   > I mean, both Airflow and FAB overall assume permissions come from roles.
   
   hmm, that's good to know. it feels misleading for that 403 page to say "user 
has no roles and/**or** permissions" if we really are only checking roles. 
   
   > seems reasonable to me to need to adjust a few new methods every so often
   
   yeah it's certainly not the end of the world. 






[GitHub] [airflow] jaketf commented on pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#issuecomment-919365903


   > I mean, both Airflow and FAB overall assume permissions come from roles.
   
   hmm, that's good to know. it feels misleading for that 403 page to say "user 
has no roles and/**or** permissions" if we really are only checking roles. 
   
   






[GitHub] [airflow] jaketf commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708488295



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   Alternatively (pardon my ignorance), is there a way for us to cache the 
user permissions for a session so we don't have to make that query twice?








[GitHub] [airflow] jaketf commented on a change in pull request #18222: make current_user_has_permissions backwards compatible

2021-09-14 Thread GitBox


jaketf commented on a change in pull request #18222:
URL: https://github.com/apache/airflow/pull/18222#discussion_r708483857



##
File path: airflow/www/security.py
##
@@ -346,10 +346,7 @@ def get_current_user_permissions(self):
 return perms
 
 def current_user_has_permissions(self) -> bool:
-for role in self.get_user_roles():
-if role.permissions:
-return True
-return False
+return len(self.get_current_user_permissions()) > 0

Review comment:
   @uranusjr thanks for that feedback on blowing up DB queries. Would this 
be an acceptable workaround?
   Look for roles first, then look for permissions only if necessary?
   ```suggestion
   for role in self.get_user_roles():
   if role.permissions:
   return True
   return len(self.get_current_user_permissions()) > 0
   ```








[GitHub] [airflow] bmckallagat-os commented on pull request #18215: Solve CI issues

2021-09-14 Thread GitBox


bmckallagat-os commented on pull request #18215:
URL: https://github.com/apache/airflow/pull/18215#issuecomment-919345094


   @kaxil No worries! Thanks for the approval






[GitHub] [airflow] jedcunningham commented on a change in pull request #18243: Fix deleting of zipped Dags in Serialized Dag Table

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #18243:
URL: https://github.com/apache/airflow/pull/18243#discussion_r708463072



##
File path: airflow/models/dag.py
##
@@ -2814,10 +2815,24 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs: List[str], session=None):
 """
 log.debug("Deactivating DAGs (for which DAG files are deleted) from %s table ", cls.__tablename__)
 
+dag_filelocs = []
+for fileloc in alive_dag_filelocs:
+if zipfile.is_zipfile(fileloc):
+with zipfile.ZipFile(fileloc) as z:
+dag_filelocs.extend(

Review comment:
   I wonder if this should be done once in `_refresh_dag_dir` instead of 
done twice here and in `remove_deleted_dags`?








[GitHub] [airflow] jedcunningham commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


jedcunningham commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708453197



##
File path: docs/apache-airflow/howto/timetable.rst
##
@@ -0,0 +1,298 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+Customizing DAG Scheduling with Timetables
+==========================================
+
+A DAG's scheduling strategy is determined by its internal "timetable". This
+timetable can be created by specifying the DAG's ``schedule_interval`` argument,
+as described in :doc:`DAG Run `. The timetable also dictates the data
+interval and the logical time of each run created for the DAG.
+
+However, there are situations when a cron expression or simple ``timedelta``
+periods cannot properly express the schedule. Some of the examples are:
+
+* Data intervals with "holes" between. (Instead of continuous, as both the cron
+  expression and ``timedelta`` schedules represent.)
+* Run tasks at different times each day. For example, an astronomer may find it
+  useful to run a task at dawn to process data collected from the previous
+  night-time period.
+* Schedules not following the Gregorian calendar. For example, create a run for
+  each month in the `Traditional Chinese Calendar`_. This is conceptually
+  similar to the sunset case above, but for a different time scale.
+* Rolling windows, or overlapping data intervals. For example, one may want to
+  have a run each day, but make each run cover the period of the previous seven
+  days. It is possible to "hack" this with a cron expression, but a custom data
+  interval would be a more natural representation.
+
+.. _`Traditional Chinese Calendar`: https://en.wikipedia.org/wiki/Chinese_calendar
+
+
+For our example, let's say a company wants to run a job after each weekday to
+process data collected during the work day. The first intuitive answer to this
+would be ``schedule_interval="0 0 * * 1-5"`` (midnight on Monday to Friday),
+but this means data collected on Friday will *not* be processed right after
+Friday ends, but on the next Monday, and that run's interval would be from
+midnight Friday to midnight *Monday*.
+
+This is, therefore, an example in the "holes" category above; the intended
+schedule should not include the two weekend days. What we want is:
+
+* Schedule a run for each Monday, Tuesday, Wednesday, Thursday, and Friday. The
+  run's data interval would cover from midnight of each day, to midnight of the
+  next day (e.g. 2021-01-01 00:00:00 to 2021-01-02 00:00:00).
+* Each run would be created right after the data interval ends. The run
+  covering Monday happens on midnight Tuesday and so on. The run covering
+  Friday happens on midnight Saturday. No runs happen on midnights Sunday
+  and Monday.
+
+For simplicity, we will only deal with UTC datetimes in this example.
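The interval arithmetic described above can be sketched in plain Python. This is only an illustration of the scheduling logic, independent of Airflow's actual ``Timetable`` API; ``next_workday_interval`` is a name invented for this sketch:

```python
from datetime import datetime, timedelta, timezone


def next_workday_interval(last_end):
    """Return the (start, end) of the next weekday-long data interval.

    ``last_end`` is the end of the previous interval (midnight UTC).
    Weekends are skipped: an interval never starts on Saturday or Sunday.
    """
    start = last_end
    while start.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        start += timedelta(days=1)
    return start, start + timedelta(days=1)


# The run covering Friday 2021-01-01 ends at midnight Saturday; the next
# interval skips the weekend and covers Monday 2021-01-04.
start, end = next_workday_interval(datetime(2021, 1, 2, tzinfo=timezone.utc))
print(start.date(), end.date())  # 2021-01-04 2021-01-05
```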
+
+
+Timetable Registration
+----------------------
+
+A timetable must be a subclass of :class:`~airflow.timetables.base.Timetable`,
+and be registered as a part of a :doc:`plugin `. The following is a
+skeleton for us to implement a new timetable:
+
+.. code-block:: python
+
+    from airflow.plugins_manager import AirflowPlugin
+    from airflow.timetables.base import Timetable
+
+
+    class AfterWorkdayTimetable(Timetable):
+        pass
+
+
+    class WorkdayTimetablePlugin(AirflowPlugin):
+        name = "workday_timetable_plugin"
+        timetables = [AfterWorkdayTimetable]
+
+Next, we'll start putting code into ``AfterWorkdayTimetable``. After the
+implementation is finished, we should be able to use the timetable in our DAG
+file:
+
+.. code-block:: python
+
+    from airflow import DAG
+
+
+    with DAG(timetable=AfterWorkdayTimetable(), tags=["example", "timetable"]) as dag:
+        ...
+
+
+Define Scheduling Logic
+-----------------------
+
+When Airflow's scheduler encounters a DAG, it calls one of the two methods to
+know when to schedule the DAG's next run.
+
+* ``next_dagrun_info``: The scheduler uses this to learn the timetable's
+  regular schedule, i.e. the "one for every workday, run at the end of it"
+  part in our example.
+* ``infer_data_interval``: When a DAG run is manually triggered (from the web
+  UI, for 

[GitHub] [airflow] alex-astronomer commented on issue #18217: Audit Logging for Variables, Connections, Pools

2021-09-14 Thread GitBox


alex-astronomer commented on issue #18217:
URL: https://github.com/apache/airflow/issues/18217#issuecomment-919323312


   Love that idea.  I was starting to look in Variables (`variables.py`) and found a `set(...)` function that looks like the root of where variables are set.  That was my first glance, and it seemed like a good place to add a logging decorator or something similar.  I like the idea of SQLAlchemy events, though: they make sure that every change gets captured, no matter where it originates.  I'll definitely look into this more before getting assigned.  I think this might be a tough one for an early contributor like me, but it's definitely something I'm willing to take a crack at.






[GitHub] [airflow] ashb commented on a change in pull request #17552: AIP 39: Documentation

2021-09-14 Thread GitBox


ashb commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708451254



##########
File path: docs/apache-airflow/howto/timetable.rst
##########
@@ -0,0 +1,298 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+Customizing DAG Scheduling with Timetables
+==========================================
+
+A DAG's scheduling strategy is determined by its internal "timetable". This
+timetable can be created by specifying the DAG's ``schedule_interval``
+argument, as described in :doc:`DAG Run `. The timetable also dictates the
+data interval and the logical time of each run created for the DAG.
+
+However, there are situations when a cron expression or simple ``timedelta``
+periods cannot properly express the schedule. Some of the examples are:
+
+* Data intervals with "holes" between. (Instead of continuous, as both the cron
+  expression and ``timedelta`` schedules represent.)
+* Run tasks at different times each day. For example, an astronomer may find it
+  useful to run a task at dawn to process data collected from the previous
+  night-time period.
+* Schedules not following the Gregorian calendar. For example, create a run for
+  each month in the `Traditional Chinese Calendar`_. This is conceptually
+  similar to the sunset case above, but for a different time scale.
+* Rolling windows, or overlapping data intervals. For example, one may want to
+  have a run each day, but make each run cover the period of the previous seven
+  days. It is possible to "hack" this with a cron expression, but a custom data
+  interval would be a more natural representation.
+
+.. _`Traditional Chinese Calendar`: https://en.wikipedia.org/wiki/Chinese_calendar
+
+
+For our example, let's say a company wants to run a job after each weekday to
+process data collected during the work day. The first intuitive answer to this
+would be ``schedule_interval="0 0 * * 1-5"`` (midnight on Monday to Friday),
+but this means data collected on Friday will *not* be processed right after
+Friday ends, but on the next Monday, and that run's interval would be from
+midnight Friday to midnight *Monday*.
+
+This is, therefore, an example in the "holes" category above; the intended
+schedule should not include the two weekend days. What we want is:
+
+* Schedule a run for each Monday, Tuesday, Wednesday, Thursday, and Friday. The
+  run's data interval would cover from midnight of each day, to midnight of the
+  next day (e.g. 2021-01-01 00:00:00 to 2021-01-02 00:00:00).
+* Each run would be created right after the data interval ends. The run
+  covering Monday happens on midnight Tuesday and so on. The run covering
+  Friday happens on midnight Saturday. No runs happen on midnights Sunday
+  and Monday.
+
+For simplicity, we will only deal with UTC datetimes in this example.
+
+
+Timetable Registration
+----------------------
+
+A timetable must be a subclass of :class:`~airflow.timetables.base.Timetable`,
+and be registered as a part of a :doc:`plugin `. The following is a
+skeleton for us to implement a new timetable:
+
+.. code-block:: python
+
+    from airflow.plugins_manager import AirflowPlugin
+    from airflow.timetables.base import Timetable
+
+
+    class AfterWorkdayTimetable(Timetable):
+        pass
+
+
+    class WorkdayTimetablePlugin(AirflowPlugin):
+        name = "workday_timetable_plugin"
+        timetables = [AfterWorkdayTimetable]
+
+Next, we'll start putting code into ``AfterWorkdayTimetable``. After the
+implementation is finished, we should be able to use the timetable in our DAG
+file:
+
+.. code-block:: python
+
+    from airflow import DAG
+
+
+    with DAG(timetable=AfterWorkdayTimetable(), tags=["example", "timetable"]) as dag:
+        ...
+
+
+Define Scheduling Logic
+-----------------------
+
+When Airflow's scheduler encounters a DAG, it calls one of the two methods to
+know when to schedule the DAG's next run.
+
+* ``next_dagrun_info``: The scheduler uses this to learn the timetable's
+  regular schedule, i.e. the "one for every workday, run at the end of it"
+  part in our example.
+* ``infer_data_interval``: When a DAG run is manually triggered (from the web
+  UI, for example), 
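To make the two methods concrete, here is a stand-alone sketch of the workday logic they would implement. The `DataInterval` class and the method bodies below are simplified stand-ins written for this illustration, not Airflow's actual ``airflow.timetables.base`` types or signatures; only the method names follow the excerpt above:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DataInterval:
    # Simplified stand-in for Airflow's DataInterval type.
    start: datetime
    end: datetime


class AfterWorkdayTimetable:
    """Sketch: one run per Mon-Fri, created at the midnight after each workday."""

    def infer_data_interval(self, run_after):
        # A manual run triggered at `run_after` covers the most recent
        # full weekday before it.
        start = (run_after - timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
        while start.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            start -= timedelta(days=1)
        return DataInterval(start, start + timedelta(days=1))

    def next_dagrun_info(self, last_end):
        # The next interval starts where the previous one ended,
        # skipping weekend days.
        start = last_end
        while start.weekday() >= 5:
            start += timedelta(days=1)
        return DataInterval(start, start + timedelta(days=1))
```

For example, a run triggered manually on Monday morning infers the interval covering the previous Friday, and the automated run after a Friday interval lands on the following Monday.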

[GitHub] [airflow] dimberman closed pull request #17270: Make decorators pluggable

2021-09-14 Thread GitBox


dimberman closed pull request #17270:
URL: https://github.com/apache/airflow/pull/17270


   





