[airflow] 01/02: Update config hash in Breeze's README.md during reinstallation (#28148)

2023-01-11 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 1a787a7780531f435251c09f1577f4a78bacb8b7
Author: Jarek Potiuk 
AuthorDate: Tue Dec 6 22:14:10 2022 +0100

Update config hash in Breeze's README.md during reinstallation (#28148)

Previously we updated Breeze's config hash using pre-commit whenever
setup files changed. This has proven to be brittle.

When you work locally and add new dependencies, Breeze would keep
reinstalling itself every time you ran it - without the README
being updated. You had to run pre-commit manually to get it
regenerated.

This PR adds a new flow. Whenever Breeze is automatically
reinstalled, the README.md file of the folder from which it is
reinstalled gets updated with the new hash **just** before
reinstalling. This means that after installation the new hash is
already present in the package, and the next time you run Breeze it
will match the changed hash of your dependencies.

The only thing left is to commit the changed README to the repo
together with your setup.py/setup.cfg changes.

Pre-commit is still run on commit to verify that the hash of
the config files is good.

(cherry picked from commit 5bac5b39ffa415d535d629ddc4992337317a9c0e)
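For illustration, a minimal sketch (not Breeze's actual implementation) of how such a config hash over the setup files might be computed - the file list and function name below are assumptions:

```python
from hashlib import md5
from pathlib import Path

def compute_sources_hash(breeze_sources: Path) -> str:
    # hash the config files whose changes should trigger a reinstall
    digest = md5()
    for name in ("setup.py", "setup.cfg", "pyproject.toml"):
        digest.update((breeze_sources / name).read_bytes())
    return digest.hexdigest()
```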
---
 .pre-commit-config.yaml                           |   2 +-
 dev/breeze/src/airflow_breeze/utils/path_utils.py |  12 ++
 images/breeze/output_static-checks.svg            | 248 +++---
 3 files changed, 141 insertions(+), 121 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 577f0a1dda..bcef7889cf 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -641,7 +641,7 @@ repos:
        name: Update Breeze README.md with config files hash
        language: python
        entry: ./scripts/ci/pre_commit/pre_commit_update_breeze_config_hash.py
-       files: ^dev/breeze/setup.*$|^dev/breeze/pyproject.toml$|^dev/breeze/README.md$
+       files: dev/breeze/setup.py|dev/breeze/setup.cfg|dev/breeze/pyproject.toml|dev/breeze/README.md
        pass_filenames: false
        require_serial: true
      - id: check-breeze-top-dependencies-limited
diff --git a/dev/breeze/src/airflow_breeze/utils/path_utils.py b/dev/breeze/src/airflow_breeze/utils/path_utils.py
index 8ad789e1e2..28c0f45796 100644
--- a/dev/breeze/src/airflow_breeze/utils/path_utils.py
+++ b/dev/breeze/src/airflow_breeze/utils/path_utils.py
@@ -134,6 +134,17 @@ def set_forced_answer_for_upgrade_check():
     set_forced_answer("quit")
 
 
+def process_breeze_readme(breeze_sources: Path, sources_hash: str):
+    breeze_readme = breeze_sources / "README.md"
+    lines = breeze_readme.read_text().splitlines(keepends=True)
+    result_lines = []
+    for line in lines:
+        if line.startswith("Package config hash:"):
+            line = f"Package config hash: {sources_hash}\n"
+        result_lines.append(line)
+    breeze_readme.write_text("".join(result_lines))
+
+
 def reinstall_if_setup_changed() -> bool:
     """
     Prints warning if detected airflow sources are not the ones that Breeze was installed with.
@@ -156,6 +167,7 @@ def reinstall_if_setup_changed() -> bool:
     if installation_sources is not None:
         breeze_sources = installation_sources / "dev" / "breeze"
         warn_dependencies_changed()
+        process_breeze_readme(breeze_sources, sources_hash)
         set_forced_answer_for_upgrade_check()
         reinstall_breeze(breeze_sources)
         set_forced_answer(None)
diff --git a/images/breeze/output_static-checks.svg b/images/breeze/output_static-checks.svg
index b9c725f1ad..acb8c2e319 100644
--- a/images/breeze/output_static-checks.svg
+++ b/images/breeze/output_static-checks.svg
@@ -1,4 +1,4 @@
-http://www.w3.org/2000/svg";>
+http://www.w3.org/2000/svg";>
 
 
 
@@ -19,249 +19,257 @@
 font-weight: 700;
 }
 
-.terminal-1082452654-matrix {
+.breeze-static-checks-matrix {
 font-family: Fira Code, monospace;
 font-size: 20px;
 line-height: 24.4px;
 font-variant-east-asian: full-width;
 }
 
-.terminal-1082452654-title {
+.breeze-static-checks-title {
 font-size: 18px;
 font-weight: bold;
 font-family: arial;
 }
 
-.terminal-1082452654-r1 { fill: #c5c8c6;font-weight: bold }
-.terminal-1082452654-r2 { fill: #c5c8c6 }
-.terminal-1082452654-r3 { fill: #d0b344;font-weight: bold }
-.terminal-1082452654-r4 { fill: #68a0b3;font-weight: bold }
-.terminal-1082452654-r5 { fill: #868887 }
-.terminal-1082452654-r6 { fill: #98a84b;font-weight: bold }
-.terminal-1082452654-r7 { fill: #8d7b39 }
+.breeze-static-checks-r1 { fill: #c5c8c6;font-weight: bold }
+.breeze-static-checks-r2 { fill: #c5c8c6 }
+.breeze-static-checks-r3 { f

[GitHub] [airflow] uranusjr commented on issue #28879: TypeError: cannot pickle 'module' object

2023-01-11 Thread GitBox


uranusjr commented on issue #28879:
URL: https://github.com/apache/airflow/issues/28879#issuecomment-1379938759

   What are those values in `config`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated: Add dep context description for better log message (#28875)

2023-01-11 Thread dstandish
This is an automated email from the ASF dual-hosted git repository.

dstandish pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 1ca94ee6ba Add dep context description for better log message (#28875)
1ca94ee6ba is described below

commit 1ca94ee6ba767ed6851858db24319aa1008562eb
Author: Daniel Standish <15932138+dstand...@users.noreply.github.com>
AuthorDate: Wed Jan 11 23:56:49 2023 -0800

Add dep context description for better log message (#28875)

Otherwise, it appears that there is a duplicate log record.
---
 airflow/models/taskinstance.py | 4 +++-
 airflow/ti_deps/dep_context.py | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/airflow/models/taskinstance.py b/airflow/models/taskinstance.py
index fcead1fc7e..fe382155ec 100644
--- a/airflow/models/taskinstance.py
+++ b/airflow/models/taskinstance.py
@@ -1090,7 +1090,7 @@ class TaskInstance(Base, LoggingMixin):
         if failed:
             return False
 
-        verbose_aware_logger("Dependencies all met for %s", self)
+        verbose_aware_logger("Dependencies all met for dep_context=%s ti=%s", dep_context.description, self)
         return True
 
     @provide_session
@@ -1242,6 +1242,7 @@ class TaskInstance(Base, LoggingMixin):
             ignore_depends_on_past=ignore_depends_on_past,
             wait_for_past_depends_before_skipping=wait_for_past_depends_before_skipping,
             ignore_task_deps=ignore_task_deps,
+            description="non-requeueable deps",
         )
         if not self.are_dependencies_met(
             dep_context=non_requeueable_dep_context, session=session, verbose=True
@@ -1271,6 +1272,7 @@ class TaskInstance(Base, LoggingMixin):
             wait_for_past_depends_before_skipping=wait_for_past_depends_before_skipping,
             ignore_task_deps=ignore_task_deps,
             ignore_ti_state=ignore_ti_state,
+            description="requeueable deps",
         )
         if not self.are_dependencies_met(dep_context=dep_context, session=session, verbose=True):
             self.state = State.NONE
diff --git a/airflow/ti_deps/dep_context.py b/airflow/ti_deps/dep_context.py
index 6f2d603509..fbdb81355a 100644
--- a/airflow/ti_deps/dep_context.py
+++ b/airflow/ti_deps/dep_context.py
@@ -77,6 +77,7 @@ class DepContext:
     ignore_ti_state: bool = False
     ignore_unmapped_tasks: bool = False
     finished_tis: list[TaskInstance] | None = None
+    description: str | None = None
 
     have_changed_ti_states: bool = False
     """Have any of the TIs state's been changed as a result of evaluating dependencies"""
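Illustratively, the two previously identical log lines can now be told apart (the task instance placeholder below is hypothetical):

```
Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: dag.task manual__... [queued]>
Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: dag.task manual__... [queued]>
```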



[airflow] 02/02: Add inputimeout as dependency to breeze-cmd-line pre-commit deps (#28299)

2023-01-11 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 33420d5db97afcd4d537a86bf30b45a683f8e21a
Author: Jarek Potiuk 
AuthorDate: Mon Dec 12 12:00:34 2022 +0100

Add inputimeout as dependency to breeze-cmd-line pre-commit deps (#28299)

(cherry picked from commit 504e2c29ef1ea070291f14d1284de403a433f157)
---
 .pre-commit-config.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index bcef7889cf..810ac0fc1b 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -796,7 +796,7 @@ repos:
        files: ^BREEZE\.rst$|^dev/breeze/.*$|^\.pre-commit-config\.yaml$
        require_serial: true
        pass_filenames: false
-       additional_dependencies: ['rich>=12.4.4', 'rich-click>=1.5']
+       additional_dependencies: ['rich>=12.4.4', 'rich-click>=1.5', 'inputimeout']
      - id: check-example-dags-urls
        name: Check that example dags url include provider versions
        entry: ./scripts/ci/pre_commit/pre_commit_update_example_dags_paths.py



[airflow] branch v2-5-test updated (af684d823c -> 33420d5db9)

2023-01-11 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


from af684d823c Fix taskflow.rst duplicated "or" (#28839)
 new 1a787a7780 Update config hash in Breeze's README.md during reinstallation (#28148)
 new 33420d5db9 Add inputimeout as dependency to breeze-cmd-line pre-commit deps (#28299)

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .pre-commit-config.yaml                           |   4 +-
 dev/breeze/src/airflow_breeze/utils/path_utils.py |  12 ++
 images/breeze/output_static-checks.svg            | 248 +++---
 3 files changed, 142 insertions(+), 122 deletions(-)



[GitHub] [airflow] dstandish merged pull request #28875: Add dep context description for better log message

2023-01-11 Thread GitBox


dstandish merged PR #28875:
URL: https://github.com/apache/airflow/pull/28875





[GitHub] [airflow] boring-cyborg[bot] commented on issue #28879: TypeError: cannot pickle 'module' object

2023-01-11 Thread GitBox


boring-cyborg[bot] commented on issue #28879:
URL: https://github.com/apache/airflow/issues/28879#issuecomment-1379937259

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   





[GitHub] [airflow] SadmiB opened a new issue, #28879: TypeError: cannot pickle 'module' object

2023-01-11 Thread GitBox


SadmiB opened a new issue, #28879:
URL: https://github.com/apache/airflow/issues/28879

   ### Apache Airflow version
   
   2.5.0
   
   ### What happened
   
   I'm running Airflow in Kubernetes. I schedule a PythonOperator using the Kubernetes executor to download a dataset from S3, and I'm getting this error:
   
   ```
   [2023-01-12 07:42:14,502] {dagbag.py:538} INFO - Filling up the DagBag from /opt/airflow/dags/recognizer_pipeline.py
   [2023-01-12 07:42:18,298] {font_manager.py:1633} INFO - generated new fontManager
   Downloading https://ultralytics.com/assets/Arial.ttf to /home/airflow/.config/Ultralytics/Arial.ttf...
   [2023-01-12 07:42:19,779] {s3.py:48} INFO - Initializing S3 client
   [2023-01-12 07:42:19,800] {credentials.py:1108} INFO - Found credentials from IAM Role: high_load-eks-node-group-202207211737086022
   [2023-01-12 07:42:19,904] {s3.py:48} INFO - Initializing S3 client
   [2023-01-12 07:42:21,855] {task_command.py:389} INFO - Running  on host recognizer-pipeline-download-d-3765e820b7594e71988db6c28c48ae66
   Traceback (most recent call last):
     File "/home/airflow/.local/bin/airflow", line 8, in 
       sys.exit(main())
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/__main__.py", line 39, in main
       args.func(args)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 52, in command
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 108, in wrapper
       return f(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 396, in task_run
       _run_task_by_selected_method(args, dag, ti)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 193, in _run_task_by_selected_method
       _run_task_by_local_task_job(args, ti)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 252, in _run_task_by_local_task_job
       run_job.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 247, in run
       self._execute()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/local_task_job.py", line 132, in _execute
       self.handle_task_exit(return_code)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/local_task_job.py", line 164, in handle_task_exit
       self.task_instance.schedule_downstream_tasks()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 75, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 2562, in schedule_downstream_tasks
       partial_dag = task.dag.partial_subset(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 2201, in partial_subset
       dag.task_dict = {
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 2202, in 
       t.task_id: _deepcopy_task(t)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 2199, in _deepcopy_task
       return copy.deepcopy(t, memo)
     File "/usr/local/lib/python3.8/copy.py", line 153, in deepcopy
       y = copier(memo)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/baseoperator.py", line 1166, in __deepcopy__
       setattr(result, k, copy.deepcopy(v, memo))
     File "/usr/local/lib/python3.8/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/local/lib/python3.8/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/local/lib/python3.8/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/local/lib/python3.8/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/local/lib/python3.8/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/local/lib/python3.8/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/local/lib/python3.8/copy.py", line 161, in deepcopy
       rv = reductor(4)
   TypeError: cannot pickle 'module' object
   ```
   
   and when I check the logs of the task in the web server I find this:
   
   ```
   *** Reading local file: /opt/airflow/logs/dag_id=recognizer-pipeline/run_id=manual__2023-01-12T07:41:48.392745+00:00/task_id=download_dataset.download_platesmania_test_dataset/attempt=1.log
   [2023-01-12, 07:42:19 UTC] {taskinstance.py:1087} INFO - Dependencies all met for 
   [2023-01-12, 07:42:19 UTC] {taskinstance.py:1087} INFO - Dependencies all met for 
   [2023-01-12, 07:42:19 UTC] {taskinstance.py:1283} INFO - 
   --------------------------------------------------------------------------------
   ```
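For context, a minimal standalone reproduction (not Airflow-specific; the class and module below are arbitrary) of the failure mode in the traceback above: deepcopy falls back to pickle's reduce protocol for objects without a dedicated copier, and module objects cannot be pickled.

```python
import copy
import math

class Holder:
    def __init__(self):
        # instance state that contains a module object
        self.mod = math

copy.deepcopy(Holder())  # TypeError: cannot pickle 'module' object
```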

[GitHub] [airflow] ephraimbuddy commented on pull request #28534: Ensure that pod_mutation_hook is called before logging the pod name

2023-01-11 Thread GitBox


ephraimbuddy commented on PR #28534:
URL: https://github.com/apache/airflow/pull/28534#issuecomment-1379931032

   > Marking this as a "bug fix" and putting it in for 2.5.1
   
   Moving this to 2.6.0 due to multiple conflicts





[GitHub] [airflow] dstandish commented on pull request #28875: Add dep context description for better log message

2023-01-11 Thread GitBox


dstandish commented on PR #28875:
URL: https://github.com/apache/airflow/pull/28875#issuecomment-1379928063

   > Do we not want to include other DepContext fields in the log?
   
   Too many values. I explored mucking with repr to print out just the non-default values, but... it was a pain, so I thought this was a good-enough improvement.





[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067792774


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -409,16 +409,22 @@ def supports_external_link(self) -> bool:
     return bool(self.frontend)
 
 
-def safe_attrgetter(*items, obj, default):
+def getattr_nested(obj, item, default):
     """
-    Get items from obj but return default if not found
+    Get item from obj but return default if not found
+
+    E.g. calling ``getattr_nested('b.c', a, "NA")`` will return
+    ``a.b.c`` if such a value exists
 
     :meta private:
     """
-    val = None
+    NOTSET = object()
+    val = NOTSET
     try:
-        val = attrgetter(*items)(obj)
+        val = attrgetter(item)(obj)
     except AttributeError:
         pass

Review Comment:
   Yes, now that we're no longer being as "helpful" - thanks... updated






[airflow] branch main updated (9d6b150b87 -> c7f0aca525)

2023-01-11 Thread dstandish
This is an automated email from the ASF dual-hosted git repository.

dstandish pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 9d6b150b87 Update CODEOWNERS amazon and executors (#28873)
 add c7f0aca525 Remove horizontal lines in TI logs (#28876)

No new revisions were added by this update.

Summary of changes:
 airflow/models/taskinstance.py | 8 
 1 file changed, 8 deletions(-)



[GitHub] [airflow] dstandish merged pull request #28876: Remove horizontal lines in TI logs

2023-01-11 Thread GitBox


dstandish merged PR #28876:
URL: https://github.com/apache/airflow/pull/28876





[GitHub] [airflow] uranusjr commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


uranusjr commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067790617


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -409,16 +409,22 @@ def supports_external_link(self) -> bool:
     return bool(self.frontend)
 
 
-def safe_attrgetter(*items, obj, default):
+def getattr_nested(obj, item, default):
    """
-    Get items from obj but return default if not found
+    Get item from obj but return default if not found
+
+    E.g. calling ``getattr_nested('b.c', a, "NA")`` will return
+    ``a.b.c`` if such a value exists
 
     :meta private:
     """
-    val = None
+    NOTSET = object()
+    val = NOTSET
     try:
-        val = attrgetter(*items)(obj)
+        val = attrgetter(item)(obj)
     except AttributeError:
         pass

Review Comment:
   ```python
   try:
       return attrgetter(item)(obj)
   except AttributeError:
       return default
   ```
   
   I think this would work…?
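A quick hedged demo of that shape (SimpleNamespace stands in for an Elasticsearch hit; `host.name` mirrors the HOST_FIELD example from the PR description):

```python
from operator import attrgetter
from types import SimpleNamespace

def getattr_nested(obj, item, default):
    # dotted lookup, e.g. "host.name" -> obj.host.name
    try:
        return attrgetter(item)(obj)
    except AttributeError:
        return default

hit = SimpleNamespace(host=SimpleNamespace(name="worker-1"))
print(getattr_nested(hit, "host.name", "default_host"))  # worker-1
print(getattr_nested(hit, "hostname", "default_host"))   # default_host
```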






[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067789878


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   slash updated






[GitHub] [airflow] phanikumv commented on a diff in pull request #28874: Add `HivePartitionAsyncSensor`

2023-01-11 Thread GitBox


phanikumv commented on code in PR #28874:
URL: https://github.com/apache/airflow/pull/28874#discussion_r1067780016


##
airflow/providers/apache/hive/provider.yaml:
##
@@ -56,6 +56,7 @@ dependencies:
   # the sasl library anyway (and there sasl library version is not relevant)
   - sasl>=0.3.1; python_version>="3.9"
   - thrift>=0.9.2
+  - impyla

Review Comment:
   We have written our own asyncio method, as shown below, because impyla returns immediately after submitting the query.
   
   Please check the link:
   https://github.com/cloudera/impyla/blob/v0.16a2/impala/hiveserver2.py#L334-L338

   ```
   async def partition_exists(self, table: str, schema: str, partition: str, polling_interval: float) -> str:
       """
       Checks for the existence of a partition in the given hive table.

       :param table: table in hive where the partition exists.
       :param schema: database where the hive table exists
       :param partition: partition to check for in given hive database and hive table.
       :param polling_interval: polling interval in seconds to sleep between checks
       """
       client = self.get_hive_client()
       cursor = client.cursor()
       query = f"show partitions {schema}.{table} partition({partition})"
       cursor.execute_async(query)
       while cursor.is_executing():
           await asyncio.sleep(polling_interval)
       results = cursor.fetchall()
       if len(results) == 0:
           return "failure"
       return "success"
   ```






[GitHub] [airflow] phanikumv commented on a diff in pull request #28874: Add `HivePartitionAsyncSensor`

2023-01-11 Thread GitBox


phanikumv commented on code in PR #28874:
URL: https://github.com/apache/airflow/pull/28874#discussion_r1067780016


##
airflow/providers/apache/hive/provider.yaml:
##
@@ -56,6 +56,7 @@ dependencies:
   # the sasl library anyway (and there sasl library version is not relevant)
   - sasl>=0.3.1; python_version>="3.9"
   - thrift>=0.9.2
+  - impyla

Review Comment:
   We have written our own asyncio method, as shown below, because impyla gives us a handle immediately after submitting the query.
   
   Please check the link:
   https://github.com/cloudera/impyla/blob/v0.16a2/impala/hiveserver2.py#L334-L338

   ```
   async def partition_exists(self, table: str, schema: str, partition: str, polling_interval: float) -> str:
       """
       Checks for the existence of a partition in the given hive table.

       :param table: table in hive where the partition exists.
       :param schema: database where the hive table exists
       :param partition: partition to check for in given hive database and hive table.
       :param polling_interval: polling interval in seconds to sleep between checks
       """
       client = self.get_hive_client()
       cursor = client.cursor()
       query = f"show partitions {schema}.{table} partition({partition})"
       cursor.execute_async(query)
       while cursor.is_executing():
           await asyncio.sleep(polling_interval)
       results = cursor.fetchall()
       if len(results) == 0:
           return "failure"
       return "success"
   ```






[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067782575


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   ok renamed 






[GitHub] [airflow] phanikumv commented on a diff in pull request #28874: Add `HivePartitionAsyncSensor`

2023-01-11 Thread GitBox


phanikumv commented on code in PR #28874:
URL: https://github.com/apache/airflow/pull/28874#discussion_r1067780016


##
airflow/providers/apache/hive/provider.yaml:
##
@@ -56,6 +56,7 @@ dependencies:
   # the sasl library anyway (and there sasl library version is not relevant)
   - sasl>=0.3.1; python_version>="3.9"
   - thrift>=0.9.2
+  - impyla

Review Comment:
   We have written our own asyncio method, as shown below, because impyla gives us a handle immediately after submitting the query.

   ```
   async def partition_exists(self, table: str, schema: str, partition: str, polling_interval: float) -> str:
       """
       Checks for the existence of a partition in the given hive table.

       :param table: table in hive where the partition exists.
       :param schema: database where the hive table exists
       :param partition: partition to check for in given hive database and hive table.
       :param polling_interval: polling interval in seconds to sleep between checks
       """
       client = self.get_hive_client()
       cursor = client.cursor()
       query = f"show partitions {schema}.{table} partition({partition})"
       cursor.execute_async(query)
       while cursor.is_executing():
           await asyncio.sleep(polling_interval)
       results = cursor.fetchall()
       if len(results) == 0:
           return "failure"
       return "success"
   ```






[GitHub] [airflow] uranusjr commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


uranusjr commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067778960


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   Sorry, I meant the `val or default` part: if the attrgetter gets a None, the function would return `default` instead of `None`, which is inconsistent with `getattr`.
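A tiny illustration of that inconsistency (hypothetical class):

```python
class Record:
    attr = None

val = getattr(Record, "attr", "default")  # None -- getattr preserves falsy values
masked = val or "default"                 # "default" -- the real None is silently replaced
```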






[GitHub] [airflow] potiuk commented on a diff in pull request #28528: Fixes to how DebugExecutor handles sensors

2023-01-11 Thread GitBox


potiuk commented on code in PR #28528:
URL: https://github.com/apache/airflow/pull/28528#discussion_r1067776865


##
airflow/ti_deps/deps/ready_to_reschedule.py:
##
@@ -44,7 +45,8 @@ def _get_dep_statuses(self, ti, session, dep_context):
         from airflow.models.mappedoperator import MappedOperator
 
         is_mapped = isinstance(ti.task, MappedOperator)
-        if not is_mapped and not getattr(ti.task, "reschedule", False):
+        is_debug_executor = conf.get("core", "executor") == "DebugExecutor"

Review Comment:
   What's more, I think we do not even want to replace it for AIP-47. Why would we? Maybe we can rename it to SystemTestExecutor, but I think it does the job nicely?






[GitHub] [airflow] uranusjr commented on issue #28870: Support multiple buttons from `operator_extra_links`

2023-01-11 Thread GitBox


uranusjr commented on issue #28870:
URL: https://github.com/apache/airflow/issues/28870#issuecomment-1379908954

   How is this different from implementing multiple OperatorLink classes for 
the operator?
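For reference, a minimal sketch of the existing mechanism the comment refers to - one `BaseOperatorLink` subclass per button (class names and URLs below are illustrative):

```python
from airflow.models.baseoperator import BaseOperator, BaseOperatorLink

class DashboardLink(BaseOperatorLink):
    name = "Dashboard"  # the button label shown in the UI

    def get_link(self, operator, *, ti_key):
        return "https://example.com/dashboard"

class LogsLink(BaseOperatorLink):
    name = "External Logs"

    def get_link(self, operator, *, ti_key):
        return "https://example.com/logs"

class MyOperator(BaseOperator):
    # each entry renders as its own button on the task instance view
    operator_extra_links = (DashboardLink(), LogsLink())
```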





[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067776939


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   yup! lmkyt






[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067776349


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   or that's the intention lemme confirm






[GitHub] [airflow] dstandish commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067776188


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   it does not default to None -- it defaults to _default_, which is required!






[GitHub] [airflow] potiuk commented on issue #28870: Support multiple buttons from `operator_extra_links`

2023-01-11 Thread GitBox


potiuk commented on issue #28870:
URL: https://github.com/apache/airflow/issues/28870#issuecomment-1379906282

   As long as it's backwards compatible - sure. If you would like to contribute it, that would be great - it does not pass the bar of being super simple, but for an experienced person it might be an interesting one to learn how things work. Marking as a good first issue.





[GitHub] [airflow] uranusjr commented on pull request #28875: Add dep context description for better log message

2023-01-11 Thread GitBox


uranusjr commented on PR #28875:
URL: https://github.com/apache/airflow/pull/28875#issuecomment-1379905764

   Do we not want to include other DepContext fields in the log?





[GitHub] [airflow] uranusjr commented on a diff in pull request #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


uranusjr commented on code in PR #28878:
URL: https://github.com/apache/airflow/pull/28878#discussion_r1067772280


##
airflow/providers/elasticsearch/log/es_task_handler.py:
##
@@ -407,3 +407,18 @@ def get_external_log_url(self, task_instance: TaskInstance, try_number: int) ->
 def supports_external_link(self) -> bool:
     """Whether we can support external links"""
     return bool(self.frontend)
+
+
+def safe_attrgetter(*items, obj, default):
+    """
+    Get items from obj but return default if not found
+
+    :meta private:
+    """
+    val = None
+    try:
+        val = attrgetter(*items)(obj)
+    except AttributeError:
+        pass
+
+    return val or default

Review Comment:
   I’d name this something like `getattr_nested`.
   
   Also, defaulting to None does not feel like a good idea. You’ll want NOTSET. Or it’s also a good idea to just inline this logic in `_group_logs_by_host`.






[GitHub] [airflow] uranusjr commented on issue #28865: dag.test() hangs with deferrable operators if DAG is paused

2023-01-11 Thread GitBox


uranusjr commented on issue #28865:
URL: https://github.com/apache/airflow/issues/28865#issuecomment-1379900888

   Go ahead.





[GitHub] [airflow] dstandish opened a new pull request, #28878: Allow nested attr in elasticsearch host_field

2023-01-11 Thread GitBox


dstandish opened a new pull request, #28878:
URL: https://github.com/apache/airflow/pull/28878

   Sometimes we may need to use a nested field, e.g. with filebeat:
   
   AIRFLOW__ELASTICSEARCH__HOST_FIELD=host.name
   
   Currently this will not fail but will return "default_host" -- the default value.
   





[GitHub] [airflow] kamalesh0406 commented on issue #28865: dag.test() hangs with deferrable operators if DAG is paused

2023-01-11 Thread GitBox


kamalesh0406 commented on issue #28865:
URL: https://github.com/apache/airflow/issues/28865#issuecomment-1379895553

   @potiuk I can look into this issue, can you assign it to me?





[airflow] branch main updated: Update CODEOWNERS amazon and executors (#28873)

2023-01-11 Thread taragolis
This is an automated email from the ASF dual-hosted git repository.

taragolis pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 9d6b150b87 Update CODEOWNERS amazon and executors (#28873)
9d6b150b87 is described below

commit 9d6b150b87ecbcf408b904263184ce7a9cf221f8
Author: Niko Oliveira 
AuthorDate: Wed Jan 11 22:54:45 2023 -0800

Update CODEOWNERS amazon and executors (#28873)
---
 .github/CODEOWNERS | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 15300027e3..1e87eb16c7 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -1,5 +1,5 @@
 # Core
-/airflow/executors/ @kaxil @XD-DENG @ashb
+/airflow/executors/ @kaxil @XD-DENG @ashb @o-nikolas
 /airflow/jobs/ @kaxil @ashb @XD-DENG
 /airflow/models/ @kaxil @XD-DENG @ashb
 
@@ -67,14 +67,14 @@
 /airflow/providers/cncf/kubernetes @jedcunningham
 /airflow/providers/dbt/cloud/ @josh-fell
 /airflow/providers/tabular/ @Fokko
-/airflow/providers/amazon/ @eladkal
+/airflow/providers/amazon/ @eladkal @o-nikolas
 /airflow/providers/common/sql/ @eladkal
 /airflow/providers/slack/ @eladkal
-/docs/apache-airflow-providers-amazon/ @eladkal
+/docs/apache-airflow-providers-amazon/ @eladkal @o-nikolas
 /docs/apache-airflow-providers-common-sql/ @eladkal
 /docs/apache-airflow-providers-slack/ @eladkal
 /docs/apache-airflow-providers-cncf-kubernetes @jedcunningham
-/tests/providers/amazon/ @eladkal
+/tests/providers/amazon/ @eladkal @o-nikolas
 /tests/providers/common/sql/ @eladkal
 /tests/providers/slack/ @eladkal
 



[GitHub] [airflow] Taragolis merged pull request #28873: Update CODEOWNERS amazon and executors

2023-01-11 Thread GitBox


Taragolis merged PR #28873:
URL: https://github.com/apache/airflow/pull/28873





[GitHub] [airflow] Taragolis commented on a diff in pull request #28874: Add `HivePartitionAsyncSensor`

2023-01-11 Thread GitBox


Taragolis commented on code in PR #28874:
URL: https://github.com/apache/airflow/pull/28874#discussion_r1067755312


##
airflow/providers/apache/hive/provider.yaml:
##
@@ -56,6 +56,7 @@ dependencies:
   # the sasl library anyway (and there sasl library version is not relevant)
   - sasl>=0.3.1; python_version>="3.9"
   - thrift>=0.9.2
+  - impyla

Review Comment:
   [impyla](https://github.com/cloudera/impyla) doesn't make any asyncio calls 
in execute_async.
   



##
airflow/providers/apache/hive/hooks/hive.py:
##
@@ -1036,3 +1040,92 @@ def get_pandas_df(  # type: ignore
     res = self.get_results(sql, schema=schema, hive_conf=hive_conf)
     df = pandas.DataFrame(res["data"], columns=[c[0] for c in res["header"]], **kwargs)
     return df
+
+
+class HiveCliAsyncHook(BaseHook):
+    """
+    HiveCliAsyncHook to interact with the Hive using impyla library
+
+    :param metastore_conn_id: connection string for the hive
+    :param auth_mechanism: auth mechanism to use for authentication
+    """
+
+    def __init__(self, metastore_conn_id: str) -> None:
+        """Get the connection parameters separated from connection string"""
+        super().__init__()
+        self.conn = self.get_connection(conn_id=metastore_conn_id)
+        self.auth_mechanism = self.conn.extra_dejson.get("authMechanism", "PLAIN")

Review Comment:
   This blocks the async thread.
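One hedged way to avoid that (a sketch, not this PR's code) is to push the blocking lookup onto a worker thread so the event loop stays free:

```python
import asyncio

async def _get_connection_async(self, conn_id: str):
    # BaseHook.get_connection() hits the metadata DB synchronously;
    # run it in the default thread-pool executor instead of on the event loop
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, self.get_connection, conn_id)
```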






[airflow] branch main updated: Add a new SSM hook and use it in the System Test context builder (#28755)

2023-01-11 Thread onikolas
This is an automated email from the ASF dual-hosted git repository.

onikolas pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 870ecd477a Add a new SSM hook and use it in the System Test context 
builder (#28755)
870ecd477a is described below

commit 870ecd477af3774546bd82bb71921a03914a2b64
Author: D. Ferruzzi 
AuthorDate: Wed Jan 11 22:49:44 2023 -0800

Add a new SSM hook and use it in the System Test context builder (#28755)

* Add a new SSM hook and use it in the System Test context builder

Includes unit tests

Co-authored-by: Andrey Anshin 
---
 airflow/providers/amazon/aws/hooks/ssm.py           |  53 ++++
 airflow/providers/amazon/provider.yaml              |   7 +++
 .../aws/aws-systems-manager_light...@4x.png         | Bin 0 -> 75092 bytes
 tests/providers/amazon/aws/hooks/test_ssm.py        |  67 ++++
 .../system/providers/amazon/aws/utils/__init__.py   |   7 ++-
 5 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/airflow/providers/amazon/aws/hooks/ssm.py b/airflow/providers/amazon/aws/hooks/ssm.py
new file mode 100644
index 00..25a7f01f90
--- /dev/null
+++ b/airflow/providers/amazon/aws/hooks/ssm.py
@@ -0,0 +1,53 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.types import NOTSET, ArgNotSet
+
+
+class SsmHook(AwsBaseHook):
+    """
+    Interact with Amazon Systems Manager (SSM) using the boto3 library.
+    All API calls available through the Boto API are also available here.
+    See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ssm.html#client
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = "ssm"
+        super().__init__(*args, **kwargs)
+
+    def get_parameter_value(self, parameter: str, default: str | ArgNotSet = NOTSET) -> str:
+        """
+        Returns the value of the provided Parameter or an optional default.
+
+        :param parameter: The SSM Parameter name to return the value for.
+        :param default: Optional default value to return if none is found.
+        """
+        try:
+            return self.conn.get_parameter(Name=parameter)["Parameter"]["Value"]
+        except self.conn.exceptions.ParameterNotFound:
+            if isinstance(default, ArgNotSet):
+                raise
+            return default
diff --git a/airflow/providers/amazon/provider.yaml b/airflow/providers/amazon/provider.yaml
index dc401b8884..b1a8aa0f89 100644
--- a/airflow/providers/amazon/provider.yaml
+++ b/airflow/providers/amazon/provider.yaml
@@ -195,6 +195,10 @@ integrations:
 how-to-guide:
   - /docs/apache-airflow-providers-amazon/operators/s3.rst
 tags: [aws]
+  - integration-name: Amazon Systems Manager (SSM)
+external-doc-url: https://aws.amazon.com/systems-manager/
+logo: /integration-logos/aws/aws-systems-manager_light...@4x.png
+tags: [aws]
   - integration-name: Amazon Web Services
 external-doc-url: https://aws.amazon.com/
 logo: /integration-logos/aws/aws-cloud-alt_light...@4x.png
@@ -461,6 +465,9 @@ hooks:
   - integration-name: Amazon Simple Email Service (SES)
 python-modules:
   - airflow.providers.amazon.aws.hooks.ses
+  - integration-name: Amazon Systems Manager (SSM)
+python-modules:
+  - airflow.providers.amazon.aws.hooks.ssm
   - integration-name: Amazon SecretsManager
 python-modules:
   - airflow.providers.amazon.aws.hooks.secrets_manager
diff --git a/docs/integration-logos/aws/aws-systems-manager_light...@4x.png b/docs/integration-logos/aws/aws-systems-manager_light...@4x.png
new file mode 100644
index 00..0aae71d6cb
Binary files /dev/null and b/docs/integration-logos/aws/aws-systems-manager_light...@4x.png differ
diff --git a/tests/providers/a

[GitHub] [airflow] o-nikolas merged pull request #28755: Add a new SSM hook and use it in the System Test context builder

2023-01-11 Thread GitBox


o-nikolas merged PR #28755:
URL: https://github.com/apache/airflow/pull/28755





[GitHub] [airflow] o-nikolas commented on a diff in pull request #28755: Add a new SSM hook and use it in the System Test context builder

2023-01-11 Thread GitBox


o-nikolas commented on code in PR #28755:
URL: https://github.com/apache/airflow/pull/28755#discussion_r1067757109


##
tests/system/providers/amazon/aws/utils/__init__.py:
##
@@ -92,15 +93,15 @@ def _fetch_from_ssm(key: str, test_name: str | None = None) -> str:
     :return: The value of the provided key from SSM
     """
     _test_name: str = test_name if test_name else _get_test_name()
-    ssm_client: BaseClient = boto3.client("ssm")
+    hook = SsmHook()

Review Comment:
   I don't love changing the "prod" code in this way to make the tests pass (I 
think the expectations in that unit test should be updated for AWS examples). 
But on the other hand the "prod" code here is technically also test code, so 
it's not so bad and it's a clean change :+1: Happy to merge!
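A brief usage sketch of the new hook from this PR (the parameter name below is made up):

```python
from airflow.providers.amazon.aws.hooks.ssm import SsmHook

hook = SsmHook()
# returns the SSM value, or "fallback" if the parameter does not exist
value = hook.get_parameter_value("/airflow/system-tests/env-id", default="fallback")
```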






[GitHub] [airflow] dstandish opened a new pull request, #28876: Remove horizontal lines in TI logs

2023-01-11 Thread GitBox


dstandish opened a new pull request, #28876:
URL: https://github.com/apache/airflow/pull/28876

   These are not really that helpful and certainly provide no meaningful 
information.
   





[GitHub] [airflow] uranusjr commented on issue #28868: Use dynamic task mapping in TriggerDagRunOperator may generate the same run_id

2023-01-11 Thread GitBox


uranusjr commented on issue #28868:
URL: https://github.com/apache/airflow/issues/28868#issuecomment-1379867945

   The real problem is not actually the run ID, but the logical date (execution 
date), which also has a unique constraint, and can be identical if you schedule 
too fast. Even if the run ID duplication is fixed, you still can’t schedule 
multiple runs under the same logical date. So that’s the thing you must 
explicitly pass to `expand` (or `expand_kwargs`) if you want to use dynamic 
task mapping.
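A hedged sketch of what that looks like in practice (the DAG id and offsets below are illustrative) - pass distinct logical dates explicitly when mapping:

```python
import pendulum
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

base = pendulum.datetime(2023, 1, 12, tz="UTC")
TriggerDagRunOperator.partial(
    task_id="trigger_runs",
    trigger_dag_id="target_dag",
).expand(
    # unique logical dates avoid the unique-constraint collision
    execution_date=[base.add(seconds=i) for i in range(3)],
)
```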





[GitHub] [airflow] dstandish opened a new pull request, #28875: Add dep context description for better log message

2023-01-11 Thread GitBox


dstandish opened a new pull request, #28875:
URL: https://github.com/apache/airflow/pull/28875

   Otherwise, it appears that there is a duplicate log record.
   





[GitHub] [airflow] phanikumv opened a new pull request, #28874: Add async hive sensor

2023-01-11 Thread GitBox


phanikumv opened a new pull request, #28874:
URL: https://github.com/apache/airflow/pull/28874

   This PR donates the async hive sensor from 
[astronomer-providers](https://github.com/astronomer/astronomer-providers) repo 
to Airflow
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] o-nikolas opened a new pull request, #28873: Update CODEOWNERS amazon and executors

2023-01-11 Thread GitBox


o-nikolas opened a new pull request, #28873:
URL: https://github.com/apache/airflow/pull/28873

   Updating the code owners file for areas I have some context and ownership 
for.
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] uranusjr commented on a diff in pull request #28871: KubernetesExecutor sends state even when successful

2023-01-11 Thread GitBox


uranusjr commented on code in PR #28871:
URL: https://github.com/apache/airflow/pull/28871#discussion_r1067709600


##
airflow/executors/kubernetes_executor.py:
##
@@ -751,19 +751,25 @@ def _change_state(self, key: TaskInstanceKey, state: str | None, pod_id: str, na
         if TYPE_CHECKING:
             assert self.kube_scheduler
 
-        if state != State.RUNNING:
-            if self.kube_config.delete_worker_pods:
-                if state != State.FAILED or self.kube_config.delete_worker_pods_on_failure:
-                    self.kube_scheduler.delete_pod(pod_id, namespace)
-                    self.log.info("Deleted pod: %s in namespace %s", str(key), str(namespace))
-            else:
-                self.kube_scheduler.patch_pod_executor_done(pod_id=pod_id, namespace=namespace)
-                self.log.info("Patched pod %s in namespace %s to mark it as done", str(key), str(namespace))
-            try:
-                self.running.remove(key)
-            except KeyError:
-                self.log.debug("Could not find key: %s", str(key))
-        self.event_buffer[key] = state, None
+        if state == State.RUNNING:
+            self.event_buffer[key] = state, None
+            return
+
+        if self.kube_config.delete_worker_pods:
+            if state != State.FAILED or self.kube_config.delete_worker_pods_on_failure:
+                self.kube_scheduler.delete_pod(pod_id, namespace)
+                self.log.info("Deleted pod: %s in namespace %s", str(key), str(namespace))
+        else:
+            self.kube_scheduler.patch_pod_executor_done(pod_id=pod_id, namespace=namespace)
+            self.log.info("Patched pod %s in namespace %s to mark it as done", str(key), str(namespace))
+
+        if key in self.running:
+            self.running.remove(key)
+            # We do get multiple events once the pod hits a terminal state, and we only want to
+            # do this once, so only do it when we remove the task from running
+            self.event_buffer[key] = state, None
+        else:
+            self.log.debug("TI key not in running, not adding to event_buffer: %s", str(key))

Review Comment:
   ```suggestion
        try:
            self.running.remove(key)
        except KeyError:
            self.log.debug("TI key not in running, not adding to event_buffer: %s", key)
        else:
            # We get multiple events once the pod hits a terminal state, and we only want to
            # do this once, so only do it when we remove the task from running
            self.event_buffer[key] = state, None
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Bowrna commented on issue #28789: Add colors in help outputs of Airflow CLI commands

2023-01-11 Thread GitBox


Bowrna commented on issue #28789:
URL: https://github.com/apache/airflow/issues/28789#issuecomment-1379826391

   @potiuk Do you mean that it should be done either for all the CLI commands 
or not at all? Am I right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] EricGao888 commented on a diff in pull request #28858: Fix incorrect comments in worker-kedaautoscaler.yaml

2023-01-11 Thread GitBox


EricGao888 commented on code in PR #28858:
URL: https://github.com/apache/airflow/pull/28858#discussion_r1067672767


##
chart/templates/workers/worker-kedaautoscaler.yaml:
##
@@ -37,10 +37,10 @@ spec:
   scaleTargetRef:
     kind: {{ ternary "StatefulSet" "Deployment" .Values.workers.persistence.enabled }}
     name: {{ .Release.Name }}-worker
-  pollingInterval:  {{ .Values.workers.keda.pollingInterval }}   # Optional. Default: 30 seconds
-  cooldownPeriod: {{ .Values.workers.keda.cooldownPeriod }}    # Optional. Default: 300 seconds
+  pollingInterval:  {{ .Values.workers.keda.pollingInterval }}   # Optional. Default: 5 seconds
+  cooldownPeriod: {{ .Values.workers.keda.cooldownPeriod }}    # Optional. Default: 30 seconds

Review Comment:
   Done, thanks for the suggestions : )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] gunziptarball commented on pull request #28430: Pass extra index url with build secrets

2023-01-11 Thread GitBox


gunziptarball commented on PR #28430:
URL: https://github.com/apache/airflow/pull/28430#issuecomment-1379771353

   Thanks for the review!  I had thought this would be a problem based on [this 
article](https://pythonspeed.com/articles/docker-build-secrets/), but I 
overlooked the bit that said a workaround is to use multi-stage builds.  In 
this particular case I guess it is indeed unneeded.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] amoghrajesh commented on issue #27507: Making logging for HttpHook optional

2023-01-11 Thread GitBox


amoghrajesh commented on issue #27507:
URL: https://github.com/apache/airflow/issues/27507#issuecomment-1379755880

   @alon-parag @potiuk can you attach some references on how to reproduce this 
issue/where to look to make code changes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #28872: Grid view doesn't show recent dag runs for some pipelines in 2.4.0

2023-01-11 Thread GitBox


boring-cyborg[bot] commented on issue #28872:
URL: https://github.com/apache/airflow/issues/28872#issuecomment-1379731428

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] aprettyloner opened a new issue, #28872: Grid view doesn't show recent dag runs for some pipelines in 2.4.0

2023-01-11 Thread GitBox


aprettyloner opened a new issue, #28872:
URL: https://github.com/apache/airflow/issues/28872

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   # Context
   We performed an upgrade today from 2.2.3 to 2.4.0, with a chart upgrade from 
8.4.1 to 8.6.1.
   
   # Issue Encountered
   The grid UI at the `/dags/{dag_id}/grid` endpoint shows very unusual display 
behavior.
   
   ## 1. Missing recent dag runs in grid
   This dag has run hourly in production since 2021-08-19. For some reason, the 
grid endpoint is displaying dag runs from January 2022. Even when a dag run is 
in progress, the grid default view still shows old data.
   
   
![image](https://user-images.githubusercontent.com/22802852/211955052-c047abb2-3b7b-411f-91c8-873375271e73.png)
   
   ### Data Mismatch
   
    #### Expected
   
   We would expect to see in the **DAG Runs Summary** table:
   - First Run Start: 2023-01-11, 02:00:09 UTC
   - Last Run Start: 2023-01-12, 02:00:03 UTC
   
    #### Observed
   
   In the **DAG Runs Summary** table, we see:
   - First Run Start: 2022-01-24, 04:00:05 UTC
   - Last Run Start: 2022-01-25, 04:00:04 UTC
   
   This doesn't align with the database for the same query (default 25 most 
recent runs).
   
   
![image](https://user-images.githubusercontent.com/22802852/211957991-577e3e7d-65f1-4e72-b94f-02abddae9403.png)
   
   
   
   ### Workaround
   If and only if a dag is running, then it appears in the grid using the 
`/dags/{dag_id}/grid?run_state=running` endpoint.
   
   
![image](https://user-images.githubusercontent.com/22802852/211955769-a9bad3ea-5bc0-4696-9007-407c86c2ee98.png)
   
   
   ## 2. Peculiar ordering for dag runs in grid
   This dag has been running daily since 2022-01-07. Unlike the above example, 
the recent 2023 dag runs do appear in the grid. However, the ordering is very 
peculiar. We would expect this week's runs to be all the way to the right.
   
   
   
![image](https://user-images.githubusercontent.com/22802852/211956324-6b94b6b3-f026-42fb-8bd1-7fa8e8344316.png)
   
   Runs from 2023-01-07 to 2023-01-12 appear to the left of the first run 
(2022-01-06, in the failed state).
   
   
![image](https://user-images.githubusercontent.com/22802852/211956371-5480b168-d98e-4ac4-ba1e-b4f1a4f0d515.png)
   
   
    #### Expected
   
   We would expect to see in the **DAG Runs Summary** table:
   - First Run Start: 2022-12-19, 00:05:00 UTC
   - Last Run Start: 2023-01-12, 00:05:00 UTC
   
    #### Observed
   
   As above, there is a mismatch in the **DAG Runs Summary** table.
   - First Run Start: 2023-01-07, 00:05:01 UTC < Why would "first" be a 
later date? 🤔 🤔 🤔 🤔 🤔 🤔 
   - Last Run Start: 2022-01-25, 00:05:04 UTC
   
   This doesn't align with the database for the same query (default 25 most 
recent runs).
   
   
![image](https://user-images.githubusercontent.com/22802852/211958774-fc377a10-56f3-4f3f-a721-1ad485bab747.png)
   
   
   
   ### What you think should happen instead
   
   # What possibly went wrong
   There seems to be some incorrect date filtering/ordering against the 
database.
   
   # Observations
   
   Interestingly, our stage instance for the same DAG shows recent runs 
correctly and in a logical order. The deployment process, chart configuration, 
and Airflow version are identical across our environments.
   
   
![image](https://user-images.githubusercontent.com/22802852/211959047-afeff5c1-37df-4f41-b25d-0bdd5c1fea60.png)
   
   
   ## DAG History
   
   The only discernible difference in behavior is when the dags were first 
turned on.
   
   ### Not missing recent runs
   Staging DAG has run since 2022-05-05
   
   
![image](https://user-images.githubusercontent.com/22802852/211959726-145acc13-b270-4265-821e-494b5bc31ff7.png)
   
   Prod DAG has run since 2022-01-07 - **Note that this one had the additional 
left -> right ordering issue**
   
   
![image](https://user-images.githubusercontent.com/22802852/211960267-f0dc1ba4-00a7-4e24-ae65-2ab4d14c3c75.png)
   
   ### Missing recent runs
   
   Prod DAG has run since 2021-08-19
   
   
![image](https://user-images.githubusercontent.com/22802852/211960399-3520e8a5-9513-44df-a6e1-ce57430edcc2.png)
   
   
   
   
   ### How to reproduce
   
   1. Spin up with 2.4.0 and run multiple dags with varying `start_date` and 
with `catchup` enabled (for example, a DAG like the sketch below).
   2. Inspect the `/dags/{dag_id}/grid` endpoint for running dags.
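
   A minimal DAG along those lines (the dag id, dates, and schedule here are 
illustrative, not taken from the report):

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.empty import EmptyOperator

   # catchup=True with an old start_date produces the long run history
   # that seems to trigger the grid ordering problem described above.
   with DAG(
       dag_id="grid_repro_hourly",
       start_date=datetime(2021, 8, 19),
       schedule="@hourly",
       catchup=True,
   ) as dag:
       EmptyOperator(task_id="noop")
   ```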
   
   ### Operating System
   
   Debian GNU/Linux - 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow-providers-amazon  5.1.0
   apache-airflow-providers-celery  3.0.0
   apache-airflow-providers-cncf-kubernetes 4.3.0
   apache-airflow-providers-common-sql  1.3.1
   apache-airflow-providers-docker  3.1.0
   apache-airflow-providers-elasticsearch   4.2.0
   apache-airflow-providers-ftp 3.2.0
   apache-airflow-providers-google  8.3.0
   apache-airflow-providers-grpc  

[GitHub] [airflow] o-nikolas commented on a diff in pull request #28528: Fixes to how DebugExecutor handles sensors

2023-01-11 Thread GitBox


o-nikolas commented on code in PR #28528:
URL: https://github.com/apache/airflow/pull/28528#discussion_r1067627962


##
airflow/ti_deps/deps/ready_to_reschedule.py:
##
@@ -44,7 +45,8 @@ def _get_dep_statuses(self, ti, session, dep_context):
 from airflow.models.mappedoperator import MappedOperator
 
 is_mapped = isinstance(ti.task, MappedOperator)
-if not is_mapped and not getattr(ti.task, "reschedule", False):
+is_debug_executor = conf.get("core", "executor") == "DebugExecutor"

Review Comment:
   Yeah it was removed in the context of the `dag.test()` command in [this 
commit](https://github.com/apache/airflow/commit/71c64de96248694017897fdb3d9d241e7c980825).
 But AIP-47 compliant system tests still depend on that executor, so we can't 
kill it just yet without finding a replacement for that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] jedcunningham opened a new pull request, #28871: KubernetesExecutor sends state even when successful

2023-01-11 Thread GitBox


jedcunningham opened a new pull request, #28871:
URL: https://github.com/apache/airflow/pull/28871

   Like the other executors, KubernetesExecutor should send the "worker
   state" back to the result buffer. This is more consistent, particularly
   around logging, with the other executors.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] yenchenLiu commented on issue #28868: Use dynamic task mapping in TriggerDagRunOperator may generate the same run_id

2023-01-11 Thread GitBox


yenchenLiu commented on issue #28868:
URL: https://github.com/apache/airflow/issues/28868#issuecomment-1379676489

   I guess we can check whether the task is dynamically mapped (maybe by checking 
`expanded_ti_count`?), then add the current map index to the `run_id`, as in the 
sketch below.
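
   A minimal sketch of that idea, assuming the map index is available to the 
operator (`build_run_id` is a hypothetical helper, not the actual 
`TriggerDagRunOperator` code):

   ```python
   from datetime import datetime, timezone

   def build_run_id(map_index: int) -> str:
       # Airflow's manual run ids look like "manual__<logical date>"; appending
       # the map index keeps them unique across expanded task instances.
       logical_date = datetime.now(timezone.utc).isoformat()
       suffix = f"_map{map_index}" if map_index >= 0 else ""
       return f"manual__{logical_date}{suffix}"

   print(build_run_id(0))   # e.g. manual__2023-01-11T22:28:03+00:00_map0
   print(build_run_id(-1))  # unmapped task: manual__2023-01-11T22:28:03+00:00
   ```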


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #28870: Support multiple buttons from `operator_extra_links`

2023-01-11 Thread GitBox


boring-cyborg[bot] commented on issue #28870:
URL: https://github.com/apache/airflow/issues/28870#issuecomment-1379674340

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] travis-cook-sfdc opened a new issue, #28870: Support multiple buttons from `operator_extra_links`

2023-01-11 Thread GitBox


travis-cook-sfdc opened a new issue, #28870:
URL: https://github.com/apache/airflow/issues/28870

   ### Description
   
   `BaseOperatorLink` should support returning multiple links:
   
   ```python
   class ButtonLink:
       def __init__(self, name: str, link: str) -> None:
           self.name = name
           self.link = link


   class BaseOperatorLink(metaclass=ABCMeta):
       ...
       @abstractmethod
       def get_links(self, operator: BaseOperator, dttm: datetime) -> List[ButtonLink]:
           pass
   ```
   
   ### Use case/motivation
   
   My company has implemented a "MultiExternalTaskSensor" that allows a single 
task to wait for multiple other tasks to complete.  
   
   `ExternalTaskSensor` automatically supports `operator_extra_links` to 
provide a link to the upstream DAG page, but this is impossible to do with 
`Multi...` since there's an arbitrary number of buttons that need to show up 
based on the number of tasks that are being waited on.
   
   Another benefit of allowing this would be that the buttons could support 
dynamic names, which would allow information about the task (like its state) 
to be presented in the button text.
   
   This would add an additional challenge while the button data was loading, 
since the name might not be known.  At this point, there could either be a 
default name fallback or a simple loading spinner while we wait for all button 
names to return.
   
   There would need to be changes to the `/extra_links` API and `dag.js`, but 
it seems feasible.
   
   It could be implemented along the lines of:
   ```python
   class ButtonLink:

       def __init__(self, name: str, link: str) -> None:
           self.name = name
           self.link = link

   class BaseOperatorLink(metaclass=ABCMeta):
       ...

       @abstractmethod
       def get_links(self, operator: BaseOperator, dttm: datetime) -> List[ButtonLink]:
           pass
   ```
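
   A self-contained sketch of how a multi-task link could then return one 
button per upstream task (`MultiExternalTaskLink` and its inputs are 
hypothetical, purely to illustrate the proposed `get_links` contract):

   ```python
   from dataclasses import dataclass

   @dataclass
   class ButtonLink:
       name: str
       link: str

   class MultiExternalTaskLink:
       """Illustrative only: one button per upstream (dag_id, task_id) pair."""

       def __init__(self, external_tasks):
           self.external_tasks = external_tasks

       def get_links(self, operator=None, dttm=None) -> list[ButtonLink]:
           return [
               ButtonLink(name=f"Upstream: {task_id}", link=f"/dags/{dag_id}/grid")
               for dag_id, task_id in self.external_tasks
           ]

   links = MultiExternalTaskLink([("etl_a", "done"), ("etl_b", "done")]).get_links()
   print([(b.name, b.link) for b in links])
   ```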
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] vandonr-amz opened a new pull request, #28869: rewrite polling code for appflow hook

2023-01-11 Thread GitBox


vandonr-amz opened a new pull request, #28869:
URL: https://github.com/apache/airflow/pull/28869

   Existing code had several problems in my opinion, such as:
   * using 3 different sleep times, only one being configurable
   * sleeping before doing anything
   * relying on the last run being the one we care about, which is not necessarily true
   
   I also refactored some duplicated code and added a `wait_for_completion` 
parameter to skip the wait entirely if needed.
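
   For reference, a minimal sketch of the polling shape described above, with a 
single configurable sleep and an opt-out flag (`get_status` is a stand-in 
callable, not the real Appflow hook API):

   ```python
   import time

   def wait_for_flow_run(get_status, poll_interval=10.0, max_attempts=60,
                         wait_for_completion=True):
       if not wait_for_completion:
           return None  # skip the wait entirely
       for _ in range(max_attempts):
           status = get_status()  # check first, sleep only if still running
           if status in ("Successful", "Error"):
               return status
           time.sleep(poll_interval)  # one knob instead of three sleep times
       raise TimeoutError("flow run did not finish in time")
   ```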


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch v2-5-test updated: Fix taskflow.rst duplicated "or" (#28839)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new af684d823c Fix taskflow.rst duplicated "or" (#28839)
af684d823c is described below

commit af684d823c077b2ca6e7ed6125909ed5de6248ad
Author: itaymaslo <62256194+itayma...@users.noreply.github.com>
AuthorDate: Wed Jan 11 20:44:39 2023 +0200

Fix taskflow.rst duplicated "or" (#28839)

Co-authored-by: Jed Cunningham 
<66968678+jedcunning...@users.noreply.github.com>
(cherry picked from commit 7d57f5696885eb2a4cd64d56bf79d6a8e5a5d638)
---
 docs/apache-airflow/tutorial/taskflow.rst | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/apache-airflow/tutorial/taskflow.rst b/docs/apache-airflow/tutorial/taskflow.rst
index 66255936f5..d859257d73 100644
--- a/docs/apache-airflow/tutorial/taskflow.rst
+++ b/docs/apache-airflow/tutorial/taskflow.rst
@@ -234,8 +234,7 @@ Using the TaskFlow API with complex/conflicting Python dependencies
 ---
 
 If you have tasks that require complex or conflicting requirements then you will have the ability to use the
-TaskFlow API with either Python virtual environment (since 2.0.2), Docker container (since version 2.2.0) or
-or ExternalPythonOperator or KubernetesPodOperator (since 2.4.0).
+TaskFlow API with either Python virtual environment (since 2.0.2), Docker container (since 2.2.0), ExternalPythonOperator (since 2.4.0) or KubernetesPodOperator (since 2.4.0).
 
 This functionality allows a much more comprehensive range of use-cases for the TaskFlow API,
 as you are not limited to the packages and system libraries of the Airflow worker. For all cases of



[airflow] branch v2-5-test updated: Update scheduler docs about low priority tasks (#28831)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 192fd4b540 Update scheduler docs about low priority tasks (#28831)
192fd4b540 is described below

commit 192fd4b540c2a8ddfb3ffce66b024eaab356f3fa
Author: eladkal <45845474+elad...@users.noreply.github.com>
AuthorDate: Wed Jan 11 13:25:31 2023 +0200

Update scheduler docs about low priority tasks (#28831)

Gathered insights from discussion in 
https://github.com/apache/airflow/issues/26933 into a paragraph in scheduler 
docs to clarify why sometimes low priority tasks are scheduled before high 
priority tasks

(cherry picked from commit 493b433ad57088a5f5cabc466c949445e500b4c1)
---
 docs/apache-airflow/concepts/scheduler.rst | 8 
 1 file changed, 8 insertions(+)

diff --git a/docs/apache-airflow/concepts/scheduler.rst 
b/docs/apache-airflow/concepts/scheduler.rst
index 10057c386d..b07c5c16c9 100644
--- a/docs/apache-airflow/concepts/scheduler.rst
+++ b/docs/apache-airflow/concepts/scheduler.rst
@@ -59,6 +59,14 @@ In the UI, it appears as if Airflow is running your tasks a 
day **late**
 
 You should refer to :doc:`/dag-run` for details on scheduling a DAG.
 
+.. note::
+The scheduler is designed for high throughput. This is an informed design 
decision to achieve scheduling
+tasks as soon as possible. The scheduler checks how many free slots 
available in a pool and schedule at most that number of tasks instances in one 
iteration.
+This means that task priority will only come in to effect when there are 
more scheduled tasks
+waiting than the queue slots. Thus there can be cases where low priority 
tasks will be schedule before high priority tasks if they share the same batch.
+For more read about that you can reference `this GitHub discussion 
`__.
+
+
 DAG File Processing
 ---
 



[airflow] branch v2-5-test updated: Fix masking of non-sensitive environment variables (#28802)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 3543195633 Fix masking of non-sensitive environment variables (#28802)
3543195633 is described below

commit 3543195633c0fcfe613e0aee8a39c0079fcf4952
Author: Ephraim Anierobi 
AuthorDate: Mon Jan 9 17:43:51 2023 +0100

Fix masking of non-sensitive environment variables (#28802)

Environment variables are hidden even when we set expose-config to
non-sensitive-only.
This PR changes it to work like every other source: the items are only
hidden when they are sensitive.

(cherry picked from commit 0a8d0ab56689c341e65a36c0287c9d635bae1242)
---
 airflow/configuration.py | 4 ++--
 tests/core/test_configuration.py | 8 +---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/airflow/configuration.py b/airflow/configuration.py
index 4cea1a53f6..ce55aa45c6 100644
--- a/airflow/configuration.py
+++ b/airflow/configuration.py
@@ -1142,8 +1142,8 @@ class AirflowConfigParser(ConfigParser):
         if not display_sensitive and env_var != self._env_var_name("core", "unit_test_mode"):
             # Don't hide cmd/secret values here
             if not env_var.lower().endswith("cmd") and not env_var.lower().endswith("secret"):
-                opt = "< hidden >"
-
+                if (section, key) in self.sensitive_config_values:
+                    opt = "< hidden >"
         elif raw:
             opt = opt.replace("%", "%%")
         if display_source:
diff --git a/tests/core/test_configuration.py b/tests/core/test_configuration.py
index 6cbe0e9076..558634de14 100644
--- a/tests/core/test_configuration.py
+++ b/tests/core/test_configuration.py
@@ -65,6 +65,7 @@ def restore_env():
     "os.environ",
     {
         "AIRFLOW__TESTSECTION__TESTKEY": "testvalue",
+        "AIRFLOW__CORE__FERNET_KEY": "testvalue",
         "AIRFLOW__TESTSECTION__TESTPERCENT": "with%percent",
         "AIRFLOW__TESTCMDENV__ITSACOMMAND_CMD": 'echo -n "OK"',
         "AIRFLOW__TESTCMDENV__NOTACOMMAND_CMD": 'echo -n "NOT OK"',
@@ -136,15 +137,16 @@ class TestConf:
         assert cfg_dict["core"]["percent"] == "with%inside"
 
         # test env vars
-        assert cfg_dict["testsection"]["testkey"] == "< hidden >"
-        assert cfg_dict["kubernetes_environment_variables"]["AIRFLOW__TESTSECTION__TESTKEY"] == "< hidden >"
+        assert cfg_dict["testsection"]["testkey"] == "testvalue"
+        assert cfg_dict["kubernetes_environment_variables"]["AIRFLOW__TESTSECTION__TESTKEY"] == "nested"
 
     def test_conf_as_dict_source(self):
         # test display_source
         cfg_dict = conf.as_dict(display_source=True)
         assert cfg_dict["core"]["load_examples"][1] == "airflow.cfg"
         assert cfg_dict["database"]["load_default_connections"][1] == "airflow.cfg"
-        assert cfg_dict["testsection"]["testkey"] == ("< hidden >", "env var")
+        assert cfg_dict["testsection"]["testkey"] == ("testvalue", "env var")
+        assert cfg_dict["core"]["fernet_key"] == ("< hidden >", "env var")
 
     def test_conf_as_dict_sensitive(self):
         # test display_sensitive



[airflow] branch v2-5-test updated: Update dynamic-task-mapping.rst (#28797)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 2c735b248f Update dynamic-task-mapping.rst (#28797)
2c735b248f is described below

commit 2c735b248fc565f78684467f46aa2f904699a95e
Author: Elena Sadler <36947886+sadler-el...@users.noreply.github.com>
AuthorDate: Mon Jan 9 02:16:12 2023 -0500

Update dynamic-task-mapping.rst (#28797)

(cherry picked from commit 6ca67ba98ee74c1b42a93f9812ddb8a0e02c041d)
---
 docs/apache-airflow/concepts/dynamic-task-mapping.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/apache-airflow/concepts/dynamic-task-mapping.rst 
b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
index d15c0ada77..8f5f2ee6e2 100644
--- a/docs/apache-airflow/concepts/dynamic-task-mapping.rst
+++ b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
@@ -338,7 +338,7 @@ There are a couple of things to note:
 Combining upstream data (aka "zipping")
 ===
 
-It is also to want to combine multiple input sources into one task mapping 
iterable. This is generally known as "zipping" (like Python's built-in 
``zip()`` function), and is also performed as pre-processing of the downstream 
task.
+It is also common to want to combine multiple input sources into one task 
mapping iterable. This is generally known as "zipping" (like Python's built-in 
``zip()`` function), and is also performed as pre-processing of the downstream 
task.
 
 This is especially useful for conditional logic in task mapping. For example, 
if you want to download files from S3, but rename those files, something like 
this would be possible:
 



[airflow] branch v2-5-test updated: Remove swagger-ui extra from connexion and install swagger-ui-dist via npm package (#28788)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 1eb8ec2f74 Remove swagger-ui extra from connexion and install 
swagger-ui-dist via npm package (#28788)
1eb8ec2f74 is described below

commit 1eb8ec2f741da64268300974582efb4abe41b6ce
Author: Josh Goldman 
AuthorDate: Tue Jan 10 05:24:17 2023 -0500

Remove swagger-ui extra from connexion and install swagger-ui-dist via npm 
package (#28788)

(cherry picked from commit 35ad16dc0f6b764322b1eb289709e493fbbb0ae0)
---
 Dockerfile.ci |  2 +-
 airflow/www/extensions/init_views.py  |  7 ++-
 airflow/www/package.json  |  1 +
 airflow/www/templates/swagger-ui/index.j2 | 87 +++
 airflow/www/webpack.config.js |  8 +++
 airflow/www/yarn.lock |  5 ++
 setup.cfg |  2 +-
 7 files changed, 109 insertions(+), 3 deletions(-)

diff --git a/Dockerfile.ci b/Dockerfile.ci
index 95ab395053..74123f9427 100644
--- a/Dockerfile.ci
+++ b/Dockerfile.ci
@@ -1003,7 +1003,7 @@ ARG PYTHON_BASE_IMAGE
 ARG AIRFLOW_IMAGE_REPOSITORY="https://github.com/apache/airflow"
 
 # By increasing this number we can do force build of all dependencies
-ARG DEPENDENCIES_EPOCH_NUMBER="6"
+ARG DEPENDENCIES_EPOCH_NUMBER="7"
 
 # Make sure noninteractive debian install is used and language variables set
 ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
diff --git a/airflow/www/extensions/init_views.py b/airflow/www/extensions/init_views.py
index 25d5d5898c..7db06a06c1 100644
--- a/airflow/www/extensions/init_views.py
+++ b/airflow/www/extensions/init_views.py
@@ -208,7 +208,12 @@ def init_api_connexion(app: Flask) -> None:
         return views.method_not_allowed(ex)
 
     spec_dir = path.join(ROOT_APP_DIR, "api_connexion", "openapi")
-    connexion_app = App(__name__, specification_dir=spec_dir, skip_error_handlers=True)
+    swagger_ui_dir = path.join(ROOT_APP_DIR, "www", "static", "dist", "swagger-ui")
+    options = {
+        "swagger_ui": conf.getboolean("webserver", "enable_swagger_ui", fallback=True),
+        "swagger_path": swagger_ui_dir,
+    }
+    connexion_app = App(__name__, specification_dir=spec_dir, skip_error_handlers=True, options=options)
     connexion_app.app = app
     api_bp = connexion_app.add_api(
         specification="v1.yaml", base_path=base_path, validate_responses=True, strict_validation=True
diff --git a/airflow/www/package.json b/airflow/www/package.json
index f694089ee6..37565483c2 100644
--- a/airflow/www/package.json
+++ b/airflow/www/package.json
@@ -118,6 +118,7 @@
 "react-table": "^7.8.0",
 "react-textarea-autosize": "^8.3.4",
 "redoc": "^2.0.0-rc.72",
+"swagger-ui-dist": "3.52.0",
 "type-fest": "^2.17.0",
 "url-search-params-polyfill": "^8.1.0"
   },
diff --git a/airflow/www/templates/swagger-ui/index.j2 
b/airflow/www/templates/swagger-ui/index.j2
new file mode 100644
index 00..62661a369a
--- /dev/null
+++ b/airflow/www/templates/swagger-ui/index.j2
@@ -0,0 +1,87 @@
+{#
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+#}
+
+
+
+
+  
+
+{{ title | default('Swagger UI') }}
+
+
+
+
+  html
+  {
+box-sizing: border-box;
+overflow: -moz-scrollbars-vertical;
+overflow-y: scroll;
+  }
+  *,
+  *:before,
+  *:after
+  {
+box-sizing: inherit;
+  }
+  body
+  {
+margin:0;
+background: #fafafa;
+  }
+
+  
+
+  
+
+ 
+ 
+
+window.onload = function() {
+  // Begin Swagger UI call region
+  const ui = SwaggerUIBundle({
+url: "{{ openapi_spec_url }}",
+{% if urls is defined %}
+urls: {{ urls|tojson|safe }},
+{% endif %}
+validatorUrl: {{ validatorUrl | default('null') }},
+{% if configUrl is defined %}
+configUrl: "{{ configUrl }}",
+{% endif %}
+dom_id: '#swagger-ui',
+deepLinking: true,
+presets: [
+  SwaggerUIBundle.p

[airflow] branch v2-5-test updated: Fix UIAlert should_show when AUTH_ROLE_PUBLIC set (#28781)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new aac37ec52a Fix UIAlert should_show when AUTH_ROLE_PUBLIC set (#28781)
aac37ec52a is described below

commit aac37ec52a03db132a62af57e3a034808b9d2ece
Author: Victor Chiapaikeo 
AuthorDate: Tue Jan 10 00:51:53 2023 -0500

Fix UIAlert should_show when AUTH_ROLE_PUBLIC set (#28781)

* Fix UIAlert should_show when AUTH_ROLE_PUBLIC set

Co-authored-by: Tzu-ping Chung 
(cherry picked from commit f17e2ba48b59525655a92e04684db664a672918f)
---
 airflow/www/utils.py   | 22 +++---
 tests/test_utils/www.py|  7 +++
 tests/www/views/conftest.py|  7 ++-
 tests/www/views/test_views_home.py |  5 +
 4 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/airflow/www/utils.py b/airflow/www/utils.py
index b1985a3c8a..5381709e35 100644
--- a/airflow/www/utils.py
+++ b/airflow/www/utils.py
@@ -55,6 +55,8 @@ if TYPE_CHECKING:
     from sqlalchemy.orm.query import Query
     from sqlalchemy.sql.operators import ColumnOperators
 
+    from airflow.www.fab_security.sqla.manager import SecurityManager
+
 
 def datetime_to_string(value: DateTime | None) -> str | None:
     if value is None:
@@ -812,10 +814,24 @@ class UIAlert:
         self.html = html
         self.message = Markup(message) if html else message
 
-    def should_show(self, securitymanager) -> bool:
-        """Determine if the user should see the message based on their role membership"""
+    def should_show(self, securitymanager: SecurityManager) -> bool:
+        """Determine if the user should see the message.
+
+        The decision is based on the user's role. If ``AUTH_ROLE_PUBLIC`` is
+        set in ``webserver_config.py``, An anonymous user would have the
+        ``AUTH_ROLE_PUBLIC`` role.
+        """
         if self.roles:
-            user_roles = {r.name for r in securitymanager.current_user.roles}
+            current_user = securitymanager.current_user
+            if current_user is not None:
+                user_roles = {r.name for r in securitymanager.current_user.roles}
+            elif "AUTH_ROLE_PUBLIC" in securitymanager.appbuilder.get_app.config:
+                # If the current_user is anonymous, assign AUTH_ROLE_PUBLIC role (if it exists) to them
+                user_roles = {securitymanager.appbuilder.get_app.config["AUTH_ROLE_PUBLIC"]}
+            else:
+                # Unable to obtain user role - default to not showing
+                return False
+
             if not user_roles.intersection(set(self.roles)):
                 return False
         return True
diff --git a/tests/test_utils/www.py b/tests/test_utils/www.py
index 8491d54094..55699c7051 100644
--- a/tests/test_utils/www.py
+++ b/tests/test_utils/www.py
@@ -32,6 +32,13 @@ def client_with_login(app, **kwargs):
     return client
 
 
+def client_without_login(app):
+    # Anonymous users can only view if AUTH_ROLE_PUBLIC is set to non-Public
+    app.config["AUTH_ROLE_PUBLIC"] = "Viewer"
+    client = app.test_client()
+    return client
+
+
 def check_content_in_response(text, resp, resp_code=200):
     resp_html = resp.data.decode("utf-8")
     assert resp_code == resp.status_code
diff --git a/tests/www/views/conftest.py b/tests/www/views/conftest.py
index 4166b1c961..e04b5ec96e 100644
--- a/tests/www/views/conftest.py
+++ b/tests/www/views/conftest.py
@@ -29,7 +29,7 @@ from airflow.models import DagBag
 from airflow.www.app import create_app
 from tests.test_utils.api_connexion_utils import delete_user
 from tests.test_utils.decorators import dont_initialize_flask_app_submodules
-from tests.test_utils.www import client_with_login
+from tests.test_utils.www import client_with_login, client_without_login
 
 
 @pytest.fixture(autouse=True, scope="module")
@@ -123,6 +123,11 @@ def user_client(app):
     return client_with_login(app, username="test_user", password="test_user")
 
 
+@pytest.fixture()
+def anonymous_client(app):
+    return client_without_login(app)
+
+
 class _TemplateWithContext(NamedTuple):
     template: jinja2.environment.Template
     context: dict[str, Any]
diff --git a/tests/www/views/test_views_home.py b/tests/www/views/test_views_home.py
index 1041a71117..4f6e354a7b 100644
--- a/tests/www/views/test_views_home.py
+++ b/tests/www/views/test_views_home.py
@@ -203,6 +203,11 @@ def test_home_robots_header_in_response(user_client):
 @pytest.mark.parametrize(
     "client, flash_message, expected",
     [
+        ("anonymous_client", UIAlert("hello world"), True),
+        ("anonymous_client", UIAlert("hello world", roles=["Viewer"]), True),
+        ("anonymous_client", UIAlert("hello world", roles=["User"]), False),
+        ("anonymous_client", UIAlert("hello world", roles=["Viewer", "User"

[GitHub] [airflow] boring-cyborg[bot] commented on issue #28868: Use dynamic task mapping in TriggerDagRunOperator may generate the same run_id

2023-01-11 Thread GitBox


boring-cyborg[bot] commented on issue #28868:
URL: https://github.com/apache/airflow/issues/28868#issuecomment-1379652983

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] yenchenLiu opened a new issue, #28868: Use dynamic task mapping in TriggerDagRunOperator may generate the same run_id

2023-01-11 Thread GitBox


yenchenLiu opened a new issue, #28868:
URL: https://github.com/apache/airflow/issues/28868

   ### Apache Airflow version
   
   2.5.0
   
   ### What happened
   
   I use TriggerDagRunOperator to generate a large number of dag runs (>= 100), 
and it sometimes causes this bug:
   ``` log
   Failed to execute job 7159001 for task dynamic_dags 
((psycopg2.errors.UniqueViolation) duplicate key value violates unique 
constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(dynamic_dags, 
manual__2023-01-11T22:28:03.286419+00:00) already exists.
   ```
   
   ``` python
   TriggerDagRunOperator.partial(task_id='dynamic_dags', 
trigger_dag_id='trigger_dag', wait_for_completion=True).expand(conf=[{"id": 1}, 
{"id": 2}, {"id": 3}, ..])
   ```
   
   ### What you think should happen instead
   
   I expect the `run_id` to always be unique when I use dynamic task mapping to 
generate dag runs.
   
   ### How to reproduce
   
   1. Create a dag called `trigger_dag`.
   2. Create a dag called `test_dag`, which runs TriggerDagRunOperator with 
dynamic task mapping to trigger a large number of `trigger_dag` runs.
   3. Sometimes, you will see the task fail due to a duplicate `run_id`.
   
   * I use the local executor with parallelism = 16, max_active_tasks_per_dag = 
12, and max_active_runs_per_dag = 12
   
   ### Operating System
   
   Ubuntu 20.04.4 LTS
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] 04/04: Only patch single label when adopting pod (#28776)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 2fa9474cc061152d7705a93ede5fe4bcfa49f1e1
Author: Jed Cunningham <66968678+jedcunning...@users.noreply.github.com>
AuthorDate: Mon Jan 9 18:07:56 2023 -0600

Only patch single label when adopting pod (#28776)

When KubernetesExecutor adopts pods, it was patching the pod with the
pod it retrieved from the k8s api, while just updating a single label.
Normally this works just fine, but there are cases where the pod you
pull from the k8s api can't be used as-is when patching - it results
in a 422 `Forbidden: pod updates may not change fields other than ...`.

Instead we now just pass the single label we need to update to patch,
allowing us to avoid accidentally "updating" other fields.

Closes #24015

(cherry picked from commit 9922953bcd9e11a1412a3528aef938444d62f7fe)
---
 airflow/executors/kubernetes_executor.py| 17 +++---
 tests/executors/test_kubernetes_executor.py | 80 +
 2 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/airflow/executors/kubernetes_executor.py b/airflow/executors/kubernetes_executor.py
index 65e463a948..28f720f35e 100644
--- a/airflow/executors/kubernetes_executor.py
+++ b/airflow/executors/kubernetes_executor.py
@@ -636,7 +636,6 @@ class KubernetesExecutor(BaseExecutor):
                 )
                 self.fail(task[0], e)
             except ApiException as e:
-
                 # These codes indicate something is wrong with pod definition; otherwise we assume pod
                 # definition is ok, and that retrying may work
                 if e.status in (400, 422):
@@ -748,27 +747,28 @@ class KubernetesExecutor(BaseExecutor):
             assert self.scheduler_job_id
 
         self.log.info("attempting to adopt pod %s", pod.metadata.name)
-        pod.metadata.labels["airflow-worker"] = pod_generator.make_safe_label_value(self.scheduler_job_id)
         pod_id = annotations_to_key(pod.metadata.annotations)
         if pod_id not in pod_ids:
             self.log.error("attempting to adopt taskinstance which was not specified by database: %s", pod_id)
             return
 
+        new_worker_id_label = pod_generator.make_safe_label_value(self.scheduler_job_id)
         try:
             kube_client.patch_namespaced_pod(
                 name=pod.metadata.name,
                 namespace=pod.metadata.namespace,
-                body=PodGenerator.serialize_pod(pod),
+                body={"metadata": {"labels": {"airflow-worker": new_worker_id_label}}},
             )
-            pod_ids.pop(pod_id)
-            self.running.add(pod_id)
         except ApiException as e:
             self.log.info("Failed to adopt pod %s. Reason: %s", pod.metadata.name, e)
+            return
+
+        del pod_ids[pod_id]
+        self.running.add(pod_id)
 
     def _adopt_completed_pods(self, kube_client: client.CoreV1Api) -> None:
         """
-
-        Patch completed pod so that the KubernetesJobWatcher can delete it.
+        Patch completed pods so that the KubernetesJobWatcher can delete them.
 
         :param kube_client: kubernetes client for speaking to kube API
         """
@@ -783,12 +783,11 @@ class KubernetesExecutor(BaseExecutor):
         pod_list = kube_client.list_namespaced_pod(namespace=self.kube_config.kube_namespace, **kwargs)
         for pod in pod_list.items:
             self.log.info("Attempting to adopt pod %s", pod.metadata.name)
-            pod.metadata.labels["airflow-worker"] = new_worker_id_label
             try:
                 kube_client.patch_namespaced_pod(
                     name=pod.metadata.name,
                     namespace=pod.metadata.namespace,
-                    body=PodGenerator.serialize_pod(pod),
+                    body={"metadata": {"labels": {"airflow-worker": new_worker_id_label}}},
                 )
             except ApiException as e:
                 self.log.info("Failed to adopt pod %s. Reason: %s", pod.metadata.name, e)
diff --git a/tests/executors/test_kubernetes_executor.py b/tests/executors/test_kubernetes_executor.py
index 367f1cb2c4..97619225e6 100644
--- a/tests/executors/test_kubernetes_executor.py
+++ b/tests/executors/test_kubernetes_executor.py
@@ -654,20 +654,78 @@ class TestKubernetesExecutor:
         pod_ids = {ti_key: {}}
 
         executor.adopt_launched_task(mock_kube_client, pod=pod, pod_ids=pod_ids)
-        assert mock_kube_client.patch_namespaced_pod.call_args[1] == {
-            "body": {
-                "metadata": {
-                    "labels": {"airflow-worker": "modified"},
-                    "annotations": annotations,
-                    "name": "foo",
-                }
-            },
-            "name": "foo",
-            "namespace": None,
-        }
+

[airflow] 01/04: Clarify about docker compose (#28729)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit a00f089f45e096c5b833af088063e543f3c19e9d
Author: eladkal <45845474+elad...@users.noreply.github.com>
AuthorDate: Wed Jan 4 18:05:58 2023 +0200

Clarify about docker compose (#28729)

We got several requests to update the syntax:
https://github.com/apache/airflow/pull/28728
https://github.com/apache/airflow/pull/27792
https://github.com/apache/airflow/pull/28194
Let's clarify that this is not a mistake.
(cherry picked from commit df0e4c9ad447377073af1ed60fb0dfad731be059)
---
 docs/apache-airflow/howto/docker-compose/index.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/apache-airflow/howto/docker-compose/index.rst 
b/docs/apache-airflow/howto/docker-compose/index.rst
index 33e06e4b96..cc7a67e297 100644
--- a/docs/apache-airflow/howto/docker-compose/index.rst
+++ b/docs/apache-airflow/howto/docker-compose/index.rst
@@ -162,6 +162,9 @@ Now you can start all services:
 
 docker compose up
 
+.. note::
+  docker-compose is old syntax. Please check `Stackoverflow 
`__.
+
 In a second terminal you can check the condition of the containers and make 
sure that no containers are in an unhealthy condition:
 
 .. code-block:: text



[airflow] 02/04: Update CSRF token to expire with session (#28730)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 8202a4cc9f873d496c5f84b94fbca2c38fffda4f
Author: Max Ho 
AuthorDate: Wed Jan 11 07:25:29 2023 +0800

Update CSRF token to expire with session (#28730)

(cherry picked from commit 543e9a592e6b9dc81467c55169725e192fe95e89)
---
 airflow/config_templates/default_webserver_config.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/airflow/config_templates/default_webserver_config.py 
b/airflow/config_templates/default_webserver_config.py
index ac999a0dea..aa22b125fa 100644
--- a/airflow/config_templates/default_webserver_config.py
+++ b/airflow/config_templates/default_webserver_config.py
@@ -32,6 +32,7 @@ basedir = os.path.abspath(os.path.dirname(__file__))
 
 # Flask-WTF flag for CSRF
 WTF_CSRF_ENABLED = True
+WTF_CSRF_TIME_LIMIT = None
 
 # 
 # AUTHENTICATION CONFIG



[airflow] branch v2-5-test updated (e420172865 -> 2fa9474cc0)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


from e420172865 Limit SQLAlchemy to below 2.0 (#28725)
 new a00f089f45 Clarify about docker compose (#28729)
 new 8202a4cc9f Update CSRF token to expire with session (#28730)
 new 228961c9b9 Clarify that versioned constraints are fixed at release 
time (#28762)
 new 2fa9474cc0 Only patch single label when adopting pod (#28776)

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../config_templates/default_webserver_config.py   |  1 +
 airflow/executors/kubernetes_executor.py   | 17 +++--
 docs/apache-airflow/howto/docker-compose/index.rst |  3 +
 .../installation/installing-from-pypi.rst  | 34 -
 docs/docker-stack/index.rst| 35 ++
 tests/executors/test_kubernetes_executor.py| 80 +++---
 6 files changed, 149 insertions(+), 21 deletions(-)



[airflow] 03/04: Clarify that versioned constraints are fixed at release time (#28762)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 228961c9b9fe209fd179f3122b2b93cdf455cd79
Author: Jarek Potiuk 
AuthorDate: Fri Jan 6 10:29:28 2023 +0100

Clarify that versioned constraints are fixed at release time (#28762)

We received a number of requests to upgrade individual dependencies in
the constraint files (mostly due to those dependencies releasing versions
with vulnerabilities fixed). This is not how our constraints work; their
main purpose is to provide a "consistent installation" mechanism for
anyone who installs Airflow from scratch. We are not going to keep
such released versions up-to-date with versions of dependencies released
after the release.

This PR provides additional explanation about that in both constraint
files as well as in reference container images which follow similar
patterns.

(cherry picked from commit 8290ade26deba02ca6cf3d8254981b31cf89ee5b)
---
 .../installation/installing-from-pypi.rst  | 34 -
 docs/docker-stack/index.rst| 35 ++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/docs/apache-airflow/installation/installing-from-pypi.rst 
b/docs/apache-airflow/installation/installing-from-pypi.rst
index 3ec492b0ed..e6cf162bc3 100644
--- a/docs/apache-airflow/installation/installing-from-pypi.rst
+++ b/docs/apache-airflow/installation/installing-from-pypi.rst
@@ -53,7 +53,7 @@ and both at the same time. We decided to keep our 
dependencies as open as possib
 version of libraries if needed. This means that from time to time plain ``pip 
install apache-airflow`` will
 not work or will produce an unusable Airflow installation.
 
-In order to have a repeatable installation, we also keep a set of 
"known-to-be-working" constraint files in the
+In order to have a repeatable installation (and only for that reason), we also 
keep a set of "known-to-be-working" constraint files in the
 ``constraints-main``, ``constraints-2-0``, ``constraints-2-1`` etc. orphan 
branches and then we create a tag
 for each released version e.g. :subst-code:`constraints-|version|`. This way, 
we keep a tested and working set of dependencies.
 
@@ -88,6 +88,38 @@ constraints always points to the "latest" released Airflow 
version constraints:
 
   
https://raw.githubusercontent.com/apache/airflow/constraints-latest/constraints-3.7.txt
 
+
+Fixing Constraint files at release time
+'''
+
+The released "versioned" constraints are mostly ``fixed`` when we release 
Airflow version and we only
+update them in exceptional circumstances. For example when we find out that 
the released constraints might prevent
+Airflow from being installed consistently from the scratch. In normal 
circumstances, the constraint files
+are not going to change if new version of Airflow dependencies are released - 
not even when those
+versions contain critical security fixes. The process of Airflow releases is 
designed around upgrading
+dependencies automatically where applicable but only when we release a new 
version of Airflow,
+not for already released versions.
+
+If you want to make sure that Airflow dependencies are upgraded to the latest 
released versions containing
+latest security fixes, you should implement your own process to upgrade those 
yourself when
+you detect the need for that. Airflow usually does not upper-bound versions of 
its dependencies via
+requirements, so you should be able to upgrade them to the latest versions - 
usually without any problems.
+
+Obviously - since we have no control over what gets released in new versions 
of the dependencies, we
+cannot give any guarantees that tests and functionality of those dependencies 
will be compatible with
+Airflow after you upgrade them - testing if Airflow still works with those is 
in your hands,
+and in case of any problems, you should raise issue with the authors of the 
dependencies that are problematic.
+You can also - in such cases - look at the `Airflow issues 
`_
+`Airflow Pull Requests `_ and
+`Airflow Discussions `_, 
searching for similar
+problems to see if there are any fixes or workarounds found in the ``main`` 
version of Airflow and apply them
+to your deployment.
+
+The easiest way to keep-up with the latest released dependencies is however, 
to upgrade to the latest released
+Airflow version. Whenever we release a new version of Airflow, we upgrade all 
dependencies to the latest
+applicable versions and test them together, so if you want to keep up with 
those tests - staying up-to-date
+with latest version of Airflow is the easiest way to update those dependencies.
+
 Installation and upgra

[GitHub] [airflow] potiuk commented on issue #27507: Making logging for HttpHook optional

2023-01-11 Thread GitBox


potiuk commented on issue #27507:
URL: https://github.com/apache/airflow/issues/27507#issuecomment-1379644585

   Feel free.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] 03/03: Limit SQLAlchemy to below 2.0 (#28725)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit e420172865a4568b328b0cbf6a9135d8c3b24205
Author: Jarek Potiuk 
AuthorDate: Wed Jan 4 13:39:50 2023 +0100

Limit SQLAlchemy to below 2.0 (#28725)

SQLAlchemy is about to release its 2.0 version, and in version 1.4.46 it
started to warn about deprecated features that are used. This
(nicely) started to fail our builds - so our canary tests caught
it early and gave us a chance to prepare for the 2.0 release and
limit Airflow's dependencies beforehand.

This PR adds the deprecation as "known" and limits SQLAlchemy to
be <2.0 (and links to appropriate issues and documentation)

(cherry picked from commit 93fed0cf5eeed5dbea9f261370149206232fca98)
---
 scripts/in_container/verify_providers.py | 7 ---
 setup.cfg| 6 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/scripts/in_container/verify_providers.py 
b/scripts/in_container/verify_providers.py
index 87d79ee6ca..d46e32caa3 100755
--- a/scripts/in_container/verify_providers.py
+++ b/scripts/in_container/verify_providers.py
@@ -231,6 +231,7 @@ KNOWN_COMMON_DEPRECATED_MESSAGES: set[str] = {
 "Please use `schedule` instead. ",
 "'urllib3.contrib.pyopenssl' module is deprecated and will be removed in a 
future "
 "release of urllib3 2.x. Read more in this issue: 
https://github.com/urllib3/urllib3/issues/2680";,
+"Deprecated API features detected! These feature(s) are not compatible 
with SQLAlchemy 2.0",
 }
 
 # The set of warning messages generated by direct importing of some deprecated 
modules. We should only
@@ -260,7 +261,7 @@ KNOWN_DEPRECATED_DIRECT_IMPORTS: set[str] = {
 def filter_known_warnings(warn: warnings.WarningMessage) -> bool:
 msg_string = str(warn.message).replace("\n", " ")
 for message, origin in KNOWN_DEPRECATED_MESSAGES:
-if msg_string == message and warn.filename.find(f"/{origin}/") != -1:
+if message in msg_string and warn.filename.find(f"/{origin}/") != -1:
 return False
 return True
 
@@ -268,7 +269,7 @@ def filter_known_warnings(warn: warnings.WarningMessage) -> 
bool:
 def filter_direct_importlib_warning(warn: warnings.WarningMessage) -> bool:
 msg_string = str(warn.message).replace("\n", " ")
 for m in KNOWN_DEPRECATED_DIRECT_IMPORTS:
-if msg_string == m and warn.filename.find("/importlib/") != -1:
+if m in msg_string and warn.filename.find("/importlib/") != -1:
 return False
 return True
 
@@ -276,7 +277,7 @@ def filter_direct_importlib_warning(warn: 
warnings.WarningMessage) -> bool:
 def filter_known_common_deprecated_messages(warn: warnings.WarningMessage) -> 
bool:
 msg_string = str(warn.message).replace("\n", " ")
 for m in KNOWN_COMMON_DEPRECATED_MESSAGES:
-if msg_string == m:
+if m in msg_string:
 return False
 return True
 
diff --git a/setup.cfg b/setup.cfg
index 5cc2a1342b..868065c398 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -125,7 +125,11 @@ install_requires =
 python-slugify>=5.0
 rich>=12.4.4
 setproctitle>=1.1.8
-sqlalchemy>=1.4
+# We use some deprecated features of sqlalchemy 2.0 and we should replace 
them before we can upgrade
+# See https://sqlalche.me/e/b8d9 for details of deprecated features
+# you can set environment variable SQLALCHEMY_WARN_20=1 to show all 
deprecation warnings.
+# The issue tracking it is https://github.com/apache/airflow/issues/28723
+sqlalchemy>=1.4,<2.0
 sqlalchemy_jsonfield>=1.0
 tabulate>=0.7.5
 tenacity>=6.2.0
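A minimal sketch (assuming SQLAlchemy 1.4.x) of how the deprecation warnings
referenced in the setup.cfg comment above can be surfaced; note that
SQLALCHEMY_WARN_20 is read when sqlalchemy is imported, so it must be set first:

    # Sketch: enable SQLAlchemy's 2.0 deprecation warnings before importing it.
    import os

    os.environ["SQLALCHEMY_WARN_20"] = "1"

    import warnings

    import sqlalchemy as sa

    warnings.simplefilter("always")  # surface every RemovedIn20Warning
    engine = sa.create_engine("sqlite://")
    with engine.connect() as conn:
        conn.execute("SELECT 1")  # plain-string SQL is deprecated in 2.0 and warns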



[airflow] 02/03: Ignore Blackification commit from Blame (#28719)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9687cd7fdd58b5c5f0409230c7bd71acd00c9fea
Author: Tzu-ping Chung 
AuthorDate: Wed Jan 4 15:41:47 2023 +0800

Ignore Blackification commit from Blame (#28719)

(cherry picked from commit 8cb69bb05417075adebef19cd28b2409dbba3f33)
---
 .git-blame-ignore-revs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.git-blame-ignore-revs b/.git-blame-ignore-revs
index 3cbfd2db16..639153b3c0 100644
--- a/.git-blame-ignore-revs
+++ b/.git-blame-ignore-revs
@@ -8,3 +8,4 @@ d67ac5932dabbf06ae733fc57b48491a8029b8c2
 # Mass converting string literals to use double quotes.
 2a34dc9e8470285b0ed2db71109ef4265e29688b
 bfcae349b88fd959e32bfacd027a5be976fe2132
+01a819a42daa7990c30ab9776208b3dcb9f3a28b
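For the ignore file to take effect in a local clone, git has to be pointed at
it. A small sketch (equivalent to running the one-line git command directly):

    # Sketch: make local `git blame` skip the revisions listed in the ignore file.
    import subprocess

    subprocess.check_call(
        ["git", "config", "blame.ignoreRevsFile", ".git-blame-ignore-revs"]
    )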



[airflow] branch v2-5-test updated (df84c2d1d3 -> e420172865)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


from df84c2d1d3 Add Niko to committers (#28712)
 new dbdafc5ab7 Bump json5 from 1.0.1 to 1.0.2 in /airflow/www (#28715)
 new 9687cd7fdd Ignore Blackification commit from Blame (#28719)
 new e420172865 Limit SQLAlchemy to below 2.0 (#28725)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .git-blame-ignore-revs   |  1 +
 airflow/www/yarn.lock| 13 -
 scripts/in_container/verify_providers.py |  7 ---
 setup.cfg|  6 +-
 4 files changed, 14 insertions(+), 13 deletions(-)



[airflow] 01/03: Bump json5 from 1.0.1 to 1.0.2 in /airflow/www (#28715)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit dbdafc5ab790ada51928055cb183d1042ef74bde
Author: Dependabot [bot] <49699333+dependabot[bot]@users.noreply.github.com>
AuthorDate: Wed Jan 4 09:32:52 2023 -0500

Bump json5 from 1.0.1 to 1.0.2 in /airflow/www (#28715)

Bumps [json5](https://github.com/json5/json5) from 1.0.1 to 1.0.2.
- [Release notes](https://github.com/json5/json5/releases)
- [Changelog](https://github.com/json5/json5/blob/main/CHANGELOG.md)
- [Commits](https://github.com/json5/json5/compare/v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: json5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] 

Signed-off-by: dependabot[bot] 
Co-authored-by: dependabot[bot] 
<49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit e4bc5e54b1f41c991542850045bcfd060bac7395)
---
 airflow/www/yarn.lock | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/airflow/www/yarn.lock b/airflow/www/yarn.lock
index 5431b00d52..17125472c3 100644
--- a/airflow/www/yarn.lock
+++ b/airflow/www/yarn.lock
@@ -7235,9 +7235,9 @@ json-stringify-safe@^5.0.1:
   integrity 
sha512-ZClg6AaYvamvYEE82d3Iyd3vSSIjQ+odgjaTzRuO3s7toCdFKczob2i0zCh7JE8kWn17yvAWhUVxvqGwUalsRA==
 
 json5@^1.0.1:
-  version "1.0.1"
-  resolved 
"https://registry.yarnpkg.com/json5/-/json5-1.0.1.tgz#779fb0018604fa854eacbf6252180d83543e3dbe";
-  integrity 
sha512-aKS4WQjPenRxiQsC93MNfjx+nbF4PAdYzmd/1JIj8HYzqfbu86beTuNgXDzPknWk0n0uARlyewZo4s++ES36Ow==
+  version "1.0.2"
+  resolved 
"https://registry.yarnpkg.com/json5/-/json5-1.0.2.tgz#63d98d60f21b313b77c4d6da18bfa69d80e1d593";
+  integrity 
sha512-g1MWMLBiz8FKi1e4w0UyVL3w+iJceWAFBAaBnnGKOpNa5f8TLktkbre1+s6oICydWAm+HRUGTmI+//xv2hvXYA==
   dependencies:
 minimist "^1.2.0"
 
@@ -7684,16 +7684,11 @@ minimist-options@4.1.0:
 is-plain-obj "^1.1.0"
 kind-of "^6.0.3"
 
-minimist@^1.2.0:
+minimist@^1.2.0, minimist@^1.2.5, minimist@^1.2.6:
   version "1.2.7"
   resolved 
"https://registry.yarnpkg.com/minimist/-/minimist-1.2.7.tgz#daa1c4d91f507390437c6a8bc01078e7000c4d18";
   integrity 
sha512-bzfL1YUZsP41gmu/qjrEk0Q6i2ix/cVeAhbCbqH9u3zYutS1cLg00qhrD0M2MVdCcx4Sc0UpP2eBWo9rotpq6g==
 
-minimist@^1.2.5, minimist@^1.2.6:
-  version "1.2.6"
-  resolved 
"https://registry.yarnpkg.com/minimist/-/minimist-1.2.6.tgz#8637a5b759ea0d6e98702cfb3a9283323c93af44";
-  integrity 
sha512-Jsjnk4bw3YJqYzbdyBiNsPWHPfO++UGG749Cxs6peCu5Xg4nrena6OVxOYxrQTqww0Jmwt+Ref8rggumkTLz9Q==
-
 minipass-collect@^1.0.2:
   version "1.0.2"
   resolved 
"https://registry.yarnpkg.com/minipass-collect/-/minipass-collect-1.0.2.tgz#22b813bf745dc6edba2576b940022ad6edc8c617";



[GitHub] [airflow] github-actions[bot] closed pull request #26800: add commands explanation

2023-01-11 Thread GitBox


github-actions[bot] closed pull request #26800: add commands explanation
URL: https://github.com/apache/airflow/pull/26800


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] github-actions[bot] closed pull request #27541: Remove default owner

2023-01-11 Thread GitBox


github-actions[bot] closed pull request #27541: Remove default owner
URL: https://github.com/apache/airflow/pull/27541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch v2-5-test updated: Add Niko to committers (#28712)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new df84c2d1d3 Add Niko to committers (#28712)
df84c2d1d3 is described below

commit df84c2d1d39b5b955b7763311a719f55ef889845
Author: Jed Cunningham <66968678+jedcunning...@users.noreply.github.com>
AuthorDate: Tue Jan 3 18:23:55 2023 -0600

Add Niko to committers (#28712)

(cherry picked from commit 56fb1f1b8cd73b4328df5b6fc6d232788b1f7d13)
---
 .github/workflows/ci.yml| 1 +
 docs/apache-airflow/project.rst | 1 +
 2 files changed, 2 insertions(+)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 961fad71e5..6814d69f2c 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -102,6 +102,7 @@ jobs:
 "milton0825",
 "mistercrunch",
 "msumit",
+"o-nikolas",
 "pierrejeambrun",
 "pingzh",
 "potiuk",
diff --git a/docs/apache-airflow/project.rst b/docs/apache-airflow/project.rst
index 9c6fe09e25..a89779f2b5 100644
--- a/docs/apache-airflow/project.rst
+++ b/docs/apache-airflow/project.rst
@@ -71,6 +71,7 @@ Committers
 - Leah Cole (@leahecole)
 - Malthe Borch (@malthe)
 - Maxime "Max" Beauchemin (@mistercrunch)
+- Niko Oliveira (@o-nikolas)
 - Patrick Leo Tardif (@pltardif)
 - Pierre Jeambrun (@pierrejeambrun)
 - Ping Zhang (@pingzh)



[airflow] branch v2-5-test updated: Fix some docs on using sensors with taskflow (#28708)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 2301e8b831 Fix some docs on using sensors with taskflow (#28708)
2301e8b831 is described below

commit 2301e8b83132f4b49fd23b65af437f4d388414c7
Author: Charles Machalow 
AuthorDate: Wed Jan 4 03:43:15 2023 -0800

Fix some docs on using sensors with taskflow (#28708)

Also add tests to ensure that returning bool from taskflow sensors 
works as expected

(cherry picked from commit 12a065a38d19f4b5698962db67f5fe9ab50d420a)
---
 airflow/decorators/sensor.py  |  2 +-
 airflow/sensors/python.py |  2 +-
 docs/apache-airflow/concepts/taskflow.rst |  9 
 docs/apache-airflow/tutorial/taskflow.rst |  6 ++-
 tests/decorators/test_sensor.py   | 74 +++
 5 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/airflow/decorators/sensor.py b/airflow/decorators/sensor.py
index 291c412988..2033968620 100644
--- a/airflow/decorators/sensor.py
+++ b/airflow/decorators/sensor.py
@@ -56,7 +56,7 @@ class DecoratedSensorOperator(PythonSensor):
 kwargs["task_id"] = get_unique_task_id(task_id, kwargs.get("dag"), 
kwargs.get("task_group"))
 super().__init__(**kwargs)
 
-def poke(self, context: Context) -> PokeReturnValue:
+def poke(self, context: Context) -> PokeReturnValue | bool:
 return self.python_callable(*self.op_args, **self.op_kwargs)
 
 
diff --git a/airflow/sensors/python.py b/airflow/sensors/python.py
index 374df243d7..615e4e20ee 100644
--- a/airflow/sensors/python.py
+++ b/airflow/sensors/python.py
@@ -65,7 +65,7 @@ class PythonSensor(BaseSensorOperator):
 self.op_kwargs = op_kwargs or {}
 self.templates_dict = templates_dict
 
-def poke(self, context: Context) -> PokeReturnValue:
+def poke(self, context: Context) -> PokeReturnValue | bool:
 context_merge(context, self.op_kwargs, 
templates_dict=self.templates_dict)
 self.op_kwargs = determine_kwargs(self.python_callable, self.op_args, 
context)
 
diff --git a/docs/apache-airflow/concepts/taskflow.rst 
b/docs/apache-airflow/concepts/taskflow.rst
index 11554dabf4..97847455ab 100644
--- a/docs/apache-airflow/concepts/taskflow.rst
+++ b/docs/apache-airflow/concepts/taskflow.rst
@@ -182,6 +182,15 @@ for deserialization ensure that ``deserialize(data: dict, 
version: int)`` is spe
 
   Note: Typing of ``version`` is required and needs to be ``ClassVar[int]``
 
+
+Sensors and the TaskFlow API
+--
+
+.. versionadded:: 2.5.0
+
+For an example of writing a Sensor using the TaskFlow API, see
+:ref:`Using the TaskFlow API with Sensor operators 
`.
+
 History
 ---
 
diff --git a/docs/apache-airflow/tutorial/taskflow.rst 
b/docs/apache-airflow/tutorial/taskflow.rst
index 9db581200e..66255936f5 100644
--- a/docs/apache-airflow/tutorial/taskflow.rst
+++ b/docs/apache-airflow/tutorial/taskflow.rst
@@ -365,7 +365,11 @@ You can apply the ``@task.sensor`` decorator to convert a 
regular Python functio
 BaseSensorOperator class. The Python function implements the poke logic and 
returns an instance of
 the ``PokeReturnValue`` class as the ``poke()`` method in the 
BaseSensorOperator does. The ``PokeReturnValue`` is
 a new feature in Airflow 2.3 that allows a sensor operator to push an XCom 
value as described in
-section "Having sensors return XOM values" of 
:doc:`apache-airflow-providers:howto/create-update-providers`.
+section "Having sensors return XCOM values" of 
:doc:`apache-airflow-providers:howto/create-update-providers`.
+
+Alternatively, in cases where the sensor doesn't need to push XCOM values, both
+``poke()`` and the wrapped function can return a boolean-like value, where ``True``
+designates the sensor's operation as complete and ``False`` designates the
+sensor's operation as incomplete.
 
 .. _taskflow/task_sensor_example:
 
diff --git a/tests/decorators/test_sensor.py b/tests/decorators/test_sensor.py
index d58fb486aa..a6dd9106cf 100644
--- a/tests/decorators/test_sensor.py
+++ b/tests/decorators/test_sensor.py
@@ -63,6 +63,30 @@ class TestSensorDecorator:
 )
 assert actual_xcom_value == sensor_xcom_value
 
+def test_basic_sensor_success_returns_bool(self, dag_maker):
+@task.sensor
+def sensor_f():
+return True
+
+@task
+def dummy_f():
+pass
+
+with dag_maker():
+sf = sensor_f()
+df = dummy_f()
+sf >> df
+
+dr = dag_maker.create_dagrun()
+sf.operator.run(start_date=dr.execution_date, 
end_date=dr.execution_date, ignore_ti_state=True)
+tis = dr.get_task_instances()
+assert len(tis) == 2
+for ti in tis:
+if ti.task_id == "sensor_f":
+  
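A minimal sketch of the documented behavior (the DAG and task names are
invented for illustration): with ``@task.sensor``, the wrapped function may
return a plain bool, where ``True`` marks the sensor as done:

    # Sketch: a TaskFlow sensor returning bool instead of a PokeReturnValue.
    import os

    import pendulum

    from airflow.decorators import dag, task


    @dag(start_date=pendulum.datetime(2023, 1, 1), schedule=None, catchup=False)
    def sensor_bool_example():
        @task.sensor(poke_interval=30, timeout=300, mode="poke")
        def wait_for_file() -> bool:
            # True -> sensing complete; False -> poke again after poke_interval
            return os.path.exists("/tmp/trigger_file")

        @task
        def downstream():
            print("file arrived")

        wait_for_file() >> downstream()


    sensor_bool_example()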

[airflow] branch v2-5-test updated: Fix "airflow tasks render" cli command for mapped task instances (#28698)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new b36f24753f Fix "airflow tasks render" cli command for mapped task 
instances (#28698)
b36f24753f is described below

commit b36f24753fc3584648d505a7d13fbc6e6958c1c3
Author: Ephraim Anierobi 
AuthorDate: Wed Jan 4 21:43:20 2023 +0100

Fix "airflow tasks render" cli command for mapped task instances (#28698)

The fix was to use the 'template_fields' attr directly since both mapped 
and unmapped
tasks now have that attribute.
I also had to use ti.task instead of the task from dag.get_task due to this 
error:
`AttributeError: 'DecoratedMappedOperator' object has no attribute 
'templates_dict'` and
I wonder if this is a bug

(cherry picked from commit 1da17be37627385fed7fc06584d72e0abda6a1b5)
---
 airflow/cli/commands/task_command.py|  9 ++---
 tests/cli/commands/test_task_command.py | 62 +
 2 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/airflow/cli/commands/task_command.py 
b/airflow/cli/commands/task_command.py
index 93e7c81146..2f37579c35 100644
--- a/airflow/cli/commands/task_command.py
+++ b/airflow/cli/commands/task_command.py
@@ -591,21 +591,22 @@ def task_test(args, dag=None):
 
 @cli_utils.action_cli(check_db=False)
 @suppress_logs_and_warning
-def task_render(args):
+def task_render(args, dag=None):
 """Renders and displays templated fields for a given task."""
-dag = get_dag(args.subdir, args.dag_id)
+if not dag:
+dag = get_dag(args.subdir, args.dag_id)
 task = dag.get_task(task_id=args.task_id)
 ti, _ = _get_ti(
 task, args.map_index, 
exec_date_or_run_id=args.execution_date_or_run_id, create_if_necessary="memory"
 )
 ti.render_templates()
-for attr in task.__class__.template_fields:
+for attr in task.template_fields:
 print(
 textwrap.dedent(
 f"""# 
--
 # property: {attr}
 # --
-{getattr(task, attr)}
+{getattr(ti.task, attr)}
 """
 )
 )
diff --git a/tests/cli/commands/test_task_command.py 
b/tests/cli/commands/test_task_command.py
index a062f31d4f..864ea10408 100644
--- a/tests/cli/commands/test_task_command.py
+++ b/tests/cli/commands/test_task_command.py
@@ -41,6 +41,7 @@ from airflow.configuration import conf
 from airflow.exceptions import AirflowException, DagRunNotFound
 from airflow.models import DagBag, DagRun, Pool, TaskInstance
 from airflow.models.serialized_dag import SerializedDagModel
+from airflow.operators.bash import BashOperator
 from airflow.utils import timezone
 from airflow.utils.session import create_session
 from airflow.utils.state import State
@@ -389,6 +390,67 @@ class TestCliTasks:
 assert 'echo "2016-01-01"' in output
 assert 'echo "2016-01-08"' in output
 
+def test_mapped_task_render(self):
+"""
+tasks render should render and displays templated fields for a given 
mapping task
+"""
+with redirect_stdout(io.StringIO()) as stdout:
+task_command.task_render(
+self.parser.parse_args(
+[
+"tasks",
+"render",
+"test_mapped_classic",
+"consumer_literal",
+"2022-01-01",
+"--map-index",
+"0",
+]
+)
+)
+# the dag test_mapped_classic has op_args=[[1], [2], [3]], so the 
first mapping task should have
+# op_args=[1]
+output = stdout.getvalue()
+assert "[1]" in output
+assert "[2]" not in output
+assert "[3]" not in output
+assert "property: op_args" in output
+
+def test_mapped_task_render_with_template(self, dag_maker):
+"""
+tasks render should render and displays templated fields for a given 
mapping task
+"""
+with dag_maker() as dag:
+templated_command = """
+{% for i in range(5) %}
+echo "{{ ds }}"
+echo "{{ macros.ds_add(ds, 7)}}"
+{% endfor %}
+"""
+commands = [templated_command, "echo 1"]
+
+
BashOperator.partial(task_id="some_command").expand(bash_command=commands)
+
+with redirect_stdout(io.StringIO()) as stdout:
+task_command.task_render(
+self.parser.parse_args(
+[
+"tasks",
+"render",
+"test_dag",
+"so

[airflow] branch v2-5-test updated: Allow XComArgs for external_task_ids of ExternalTaskSensor (#28692)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new b967fc97d1 Allow XComArgs for external_task_ids of ExternalTaskSensor 
(#28692)
b967fc97d1 is described below

commit b967fc97d19f18ceeff38945fedb627382061a2c
Author: Victor Chiapaikeo 
AuthorDate: Wed Jan 4 06:39:53 2023 -0500

Allow XComArgs for external_task_ids of ExternalTaskSensor (#28692)

(cherry picked from commit 7f18fa96e434c64288d801904caf1fcde18e2cbf)
---
 airflow/sensors/external_task.py   |  6 ++-
 tests/sensors/test_external_task_sensor.py | 72 ++
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/airflow/sensors/external_task.py b/airflow/sensors/external_task.py
index e9573a0671..967bb5a276 100644
--- a/airflow/sensors/external_task.py
+++ b/airflow/sensors/external_task.py
@@ -162,8 +162,7 @@ class ExternalTaskSensor(BaseSensorOperator):
 f"when `external_task_id` or `external_task_ids` or 
`external_task_group_id` "
 f"is not `None`: {State.task_states}"
 )
-if external_task_ids and len(external_task_ids) > 
len(set(external_task_ids)):
-raise ValueError("Duplicate task_ids passed in 
external_task_ids parameter")
+
 elif not total_states <= set(State.dag_states):
 raise ValueError(
 f"Valid values for `allowed_states` and `failed_states` "
@@ -196,6 +195,9 @@ class ExternalTaskSensor(BaseSensorOperator):
 
 @provide_session
 def poke(self, context, session=None):
+if self.external_task_ids and len(self.external_task_ids) > 
len(set(self.external_task_ids)):
+raise ValueError("Duplicate task_ids passed in external_task_ids 
parameter")
+
 dttm_filter = self._get_dttm_filter(context)
 serialized_dttm_filter = ",".join(dt.isoformat() for dt in dttm_filter)
 
diff --git a/tests/sensors/test_external_task_sensor.py 
b/tests/sensors/test_external_task_sensor.py
index 80f538e868..b594210b13 100644
--- a/tests/sensors/test_external_task_sensor.py
+++ b/tests/sensors/test_external_task_sensor.py
@@ -33,8 +33,10 @@ from airflow.exceptions import AirflowException, 
AirflowSensorTimeout
 from airflow.models import DagBag, DagRun, TaskInstance
 from airflow.models.dag import DAG
 from airflow.models.serialized_dag import SerializedDagModel
+from airflow.models.xcom_arg import XComArg
 from airflow.operators.bash import BashOperator
 from airflow.operators.empty import EmptyOperator
+from airflow.operators.python import PythonOperator
 from airflow.sensors.external_task import ExternalTaskMarker, 
ExternalTaskSensor, ExternalTaskSensorLink
 from airflow.sensors.time_sensor import TimeSensor
 from airflow.serialization.serialized_objects import SerializedBaseOperator
@@ -45,6 +47,7 @@ from airflow.utils.timezone import datetime
 from airflow.utils.types import DagRunType
 from tests.models import TEST_DAGS_FOLDER
 from tests.test_utils.db import clear_db_runs
+from tests.test_utils.mock_operators import MockOperator
 
 DEFAULT_DATE = datetime(2015, 1, 1)
 TEST_DAG_ID = "unit_test_dag"
@@ -579,17 +582,70 @@ exit 0
 dag=self.dag,
 )
 
+def test_external_task_sensor_with_xcom_arg_does_not_fail_on_init(self):
+self.add_time_sensor()
+op1 = MockOperator(task_id="op1", dag=self.dag)
+op2 = ExternalTaskSensor(
+
task_id="test_external_task_sensor_with_xcom_arg_does_not_fail_on_init",
+external_dag_id=TEST_DAG_ID,
+external_task_ids=XComArg(op1),
+allowed_states=["success"],
+dag=self.dag,
+)
+assert isinstance(op2.external_task_ids, XComArg)
+
 def test_catch_duplicate_task_ids(self):
 self.add_time_sensor()
-# Test By passing same task_id multiple times
+op1 = ExternalTaskSensor(
+task_id="test_external_task_duplicate_task_ids",
+external_dag_id=TEST_DAG_ID,
+external_task_ids=[TEST_TASK_ID, TEST_TASK_ID],
+allowed_states=["success"],
+dag=self.dag,
+)
 with pytest.raises(ValueError):
-ExternalTaskSensor(
-task_id="test_external_task_duplicate_task_ids",
-external_dag_id=TEST_DAG_ID,
-external_task_ids=[TEST_TASK_ID, TEST_TASK_ID],
-allowed_states=["success"],
-dag=self.dag,
-)
+op1.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE)
+
+def test_catch_duplicate_task_ids_with_xcom_arg(self):
+self.add_time_sensor()
+op1 = PythonOperator(
+python_callable=lambda: ["dupe_value", "dupe_value"],
+task_id="op1",
+do_xcom_push=Tr
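An illustrative sketch of what the change enables (DAG and task ids are
invented): ``external_task_ids`` can now be an ``XComArg`` resolved at poke
time, with the duplicate check deferred accordingly:

    # Sketch: feed ExternalTaskSensor's external_task_ids from an upstream XCom.
    import pendulum

    from airflow.models.dag import DAG
    from airflow.models.xcom_arg import XComArg
    from airflow.operators.python import PythonOperator
    from airflow.sensors.external_task import ExternalTaskSensor

    with DAG(
        dag_id="xcomarg_sensor_example",
        start_date=pendulum.datetime(2023, 1, 1),
        schedule=None,
    ):
        producer = PythonOperator(
            task_id="produce_task_ids",
            python_callable=lambda: ["extract", "load"],  # pushed as return value
        )
        ExternalTaskSensor(
            task_id="wait_for_upstream_tasks",
            external_dag_id="upstream_dag",
            external_task_ids=XComArg(producer),  # resolved when poke() runs
            allowed_states=["success"],
        )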

[airflow] branch v2-5-test updated: Row-lock TIs to be removed during mapped task expansion (#28689)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 6285c4e71b Row-lock TIs to be removed during mapped task expansion 
(#28689)
6285c4e71b is described below

commit 6285c4e71b79b8994da0cf9b8b7e8942ec0a2110
Author: Ephraim Anierobi 
AuthorDate: Wed Jan 4 07:21:44 2023 +0100

Row-lock TIs to be removed during mapped task expansion (#28689)

Instead of a bulk query update, we row-lock the TIs to apply the update.
This protects against updating rows that have been updated by another 
process.

(cherry picked from commit a055d8fd9b42ae662e0c696e29066926b5346f6a)
---
 airflow/models/abstractoperator.py | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/airflow/models/abstractoperator.py 
b/airflow/models/abstractoperator.py
index ba0a8954ae..d693f8bfc9 100644
--- a/airflow/models/abstractoperator.py
+++ b/airflow/models/abstractoperator.py
@@ -31,6 +31,7 @@ from airflow.utils.helpers import render_template_as_native, 
render_template_to_
 from airflow.utils.log.logging_mixin import LoggingMixin
 from airflow.utils.mixins import ResolveMixin
 from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import skip_locked, with_row_locks
 from airflow.utils.state import State, TaskInstanceState
 from airflow.utils.task_group import MappedTaskGroup
 from airflow.utils.trigger_rule import TriggerRule
@@ -548,13 +549,15 @@ class AbstractOperator(LoggingMixin, DAGNode):
 
 # Any (old) task instances with inapplicable indexes (>= the total
 # number we need) are set to "REMOVED".
-session.query(TaskInstance).filter(
+query = session.query(TaskInstance).filter(
 TaskInstance.dag_id == self.dag_id,
 TaskInstance.task_id == self.task_id,
 TaskInstance.run_id == run_id,
 TaskInstance.map_index >= total_expanded_ti_count,
-).update({TaskInstance.state: TaskInstanceState.REMOVED})
-
+)
+to_update = with_row_locks(query, of=TaskInstance, session=session, 
**skip_locked(session=session))
+for ti in to_update:
+ti.state = TaskInstanceState.REMOVED
 session.flush()
 return all_expanded_tis, total_expanded_ti_count - 1
 



[airflow] branch v2-5-test updated: Fixed typo (#28687)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new f7c473033a Fixed typo (#28687)
f7c473033a is described below

commit f7c473033a4befc382f5ca226d103b1d3f80f4e9
Author: Adylzhan Khashtamov 
AuthorDate: Tue Jan 3 12:11:34 2023 +0300

Fixed typo (#28687)

(cherry picked from commit e598a1b294956448928c82a444e081ff67c6aa47)
---
 docs/apache-airflow/concepts/scheduler.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/apache-airflow/concepts/scheduler.rst 
b/docs/apache-airflow/concepts/scheduler.rst
index d9ba22be82..10057c386d 100644
--- a/docs/apache-airflow/concepts/scheduler.rst
+++ b/docs/apache-airflow/concepts/scheduler.rst
@@ -280,7 +280,7 @@ When you know what your resource usage is, the improvements 
that you can conside
   parsed continuously so optimizing that code might bring tremendous 
improvements, especially if you try
   to reach out to some external databases etc. while parsing DAGs (this should 
be avoided at all cost).
   The :ref:`best_practices/top_level_code` explains what are the best 
practices for writing your top-level
-  Python code. The :ref:`best_practices/reducing_dag_complexity` document 
provides some ares that you might
+  Python code. The :ref:`best_practices/reducing_dag_complexity` document 
provides some areas that you might
   look at when you want to reduce complexity of your code.
 * improve utilization of your resources. This is when you have a free capacity 
in your system that
   seems underutilized (again CPU, memory I/O, networking are the prime 
candidates) - you can take



[airflow] branch v2-5-test updated: Handle ConnectionReset exception in Executor cleanup (#28685)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 4cef77be70 Handle ConnectionReset exception in Executor cleanup 
(#28685)
4cef77be70 is described below

commit 4cef77be70ce96eb760d4f64425bbdb0cc8e7544
Author: Max Ho 
AuthorDate: Tue Jan 3 19:53:52 2023 +0800

Handle ConnectionReset exception in Executor cleanup (#28685)

(cherry picked from commit a3de721e2f084913e853aff39d04adc00f0b82ea)
---
 airflow/executors/kubernetes_executor.py | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/airflow/executors/kubernetes_executor.py 
b/airflow/executors/kubernetes_executor.py
index 16cf1b282f..65e463a948 100644
--- a/airflow/executors/kubernetes_executor.py
+++ b/airflow/executors/kubernetes_executor.py
@@ -843,13 +843,16 @@ class KubernetesExecutor(BaseExecutor):
 assert self.kube_scheduler
 
 self.log.info("Shutting down Kubernetes executor")
-self.log.debug("Flushing task_queue...")
-self._flush_task_queue()
-self.log.debug("Flushing result_queue...")
-self._flush_result_queue()
-# Both queues should be empty...
-self.task_queue.join()
-self.result_queue.join()
+try:
+self.log.debug("Flushing task_queue...")
+self._flush_task_queue()
+self.log.debug("Flushing result_queue...")
+self._flush_result_queue()
+# Both queues should be empty...
+self.task_queue.join()
+self.result_queue.join()
+except ConnectionResetError:
+self.log.exception("Connection Reset error while flushing 
task_queue and result_queue.")
 if self.kube_scheduler:
 self.kube_scheduler.terminate()
 self._manager.shutdown()



[airflow] branch v2-5-test updated: Fix description of output redirection for access_log for gunicorn (#28672)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 6b0f069224 Fix description of output redirection for access_log for 
gunicorn (#28672)
6b0f069224 is described below

commit 6b0f069224db764afd4e8db393f270de0b795804
Author: Jarek Potiuk 
AuthorDate: Tue Jan 3 10:05:09 2023 +0100

Fix description of output redirection for access_log for gunicorn (#28672)

As of gunicorn 19.7.0, the default for access_log is stdout, not stderr,
and our documentation had not been updated to reflect that. We are
already past that version (our minimum version of gunicorn is 20.1.0),
so our documentation of the access-log flag was wrong. Having the
access_log in stdout rather than stderr also allows redirecting
the access log to a separate log sink in deployments like K8S.

(cherry picked from commit 675af73ceb5bc8b03d46a7cd903a73f9b8faba6f)
---
 airflow/cli/cli_parser.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/airflow/cli/cli_parser.py b/airflow/cli/cli_parser.py
index a6ec776bd2..33e513b586 100644
--- a/airflow/cli/cli_parser.py
+++ b/airflow/cli/cli_parser.py
@@ -647,7 +647,7 @@ ARG_DEBUG = Arg(
 ARG_ACCESS_LOGFILE = Arg(
 ("-A", "--access-logfile"),
 default=conf.get("webserver", "ACCESS_LOGFILE"),
-help="The logfile to store the webserver access log. Use '-' to print to 
stderr",
+help="The logfile to store the webserver access log. Use '-' to print to 
stdout",
 )
 ARG_ERROR_LOGFILE = Arg(
 ("-E", "--error-logfile"),



[airflow] branch v2-5-test updated: Fix minor typo in taskflow.rst (#28656)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new e68eac88c9 Fix minor typo in taskflow.rst (#28656)
e68eac88c9 is described below

commit e68eac88c93c8d360b6f408d44e375ebcfe50eea
Author: Mark H <06.swivel-rob...@icloud.com>
AuthorDate: Fri Dec 30 12:05:25 2022 -1000

Fix minor typo in taskflow.rst (#28656)

Case change to match logging API. getlogger -> getLogger

(cherry picked from commit 068886231ac0759d3ae9dd13fc2b2727d87b2f60)
---
 docs/apache-airflow/concepts/taskflow.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/apache-airflow/concepts/taskflow.rst 
b/docs/apache-airflow/concepts/taskflow.rst
index 7efa38b843..11554dabf4 100644
--- a/docs/apache-airflow/concepts/taskflow.rst
+++ b/docs/apache-airflow/concepts/taskflow.rst
@@ -77,7 +77,7 @@ To use logging from your task functions, simply import and 
use Python's logging
 
 .. code-block:: python
 
-   logger = logging.getlogger("airflow.task")
+   logger = logging.getLogger("airflow.task")
 
 Every logging line created this way will be recorded in the task log.
 



[airflow] 02/04: Add back join to zombie query that was dropped in #28198 (#28544)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 6d65da72db04d0c68301ca265dd5da61097670f5
Author: Jed Cunningham <66968678+jedcunning...@users.noreply.github.com>
AuthorDate: Fri Dec 23 09:24:30 2022 -0600

Add back join to zombie query that was dropped in #28198 (#28544)

#28198 accidentally dropped a join in a query, leading to this:

airflow/jobs/scheduler_job.py:1547 SAWarning: SELECT statement has a
cartesian product between FROM element(s) "dag_run_1", "task_instance",
"job" and FROM element "dag". Apply join condition(s) between each 
element to resolve.

(cherry picked from commit a24d18a534ddbcefbcf0d8790d140ff496781f8b)
---
 airflow/jobs/scheduler_job.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/airflow/jobs/scheduler_job.py b/airflow/jobs/scheduler_job.py
index baeabdc2ec..b8b608efcd 100644
--- a/airflow/jobs/scheduler_job.py
+++ b/airflow/jobs/scheduler_job.py
@@ -1525,7 +1525,8 @@ class SchedulerJob(BaseJob):
 zombies: list[tuple[TI, str, str]] = (
 session.query(TI, DM.fileloc, DM.processor_subdir)
 .with_hint(TI, "USE INDEX (ti_state)", dialect_name="mysql")
-.join(LocalTaskJob, TaskInstance.job_id == LocalTaskJob.id)
+.join(LocalTaskJob, TI.job_id == LocalTaskJob.id)
+.join(DM, TI.dag_id == DM.dag_id)
 .filter(TI.state == TaskInstanceState.RUNNING)
 .filter(
 or_(



[airflow] 01/04: Improve provider validation pre-commit (#28516)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit b607a287b8f0e8da0e2af426486f3108095b3537
Author: Jarek Potiuk 
AuthorDate: Thu Dec 22 03:25:08 2022 +0100

Improve provider validation pre-commit (#28516)

(cherry picked from commit e47c472e632effbfe3ddc784788a956c4ca44122)
---
 .pre-commit-config.yaml|  21 +-
 STATIC_CODE_CHECKS.rst |   2 +-
 airflow/cli/commands/info_command.py   |   1 +
 .../pre_commit_check_provider_yaml_files.py| 417 ++---
 .../run_provider_yaml_files_check.py}  |  96 -
 5 files changed, 131 insertions(+), 406 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 5df89e4fc7..a6ed9b1f4d 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -627,19 +627,6 @@ repos:
 entry: 
./scripts/ci/pre_commit/pre_commit_check_providers_subpackages_all_have_init.py
 language: python
 require_serial: true
-  - id: check-provider-yaml-valid
-name: Validate providers.yaml files
-pass_filenames: false
-entry: ./scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
-language: python
-require_serial: true
-files: 
^docs/|provider\.yaml$|^scripts/ci/pre_commit/pre_commit_check_provider_yaml_files\.py$
-additional_dependencies:
-  - 'PyYAML==5.3.1'
-  - 'jsonschema>=3.2.0,<5.0.0'
-  - 'tabulate==0.8.8'
-  - 'jsonpath-ng==1.5.3'
-  - 'rich>=12.4.4'
   - id: check-pre-commit-information-consistent
 name: Update information re pre-commit hooks and verify ids and names
 entry: ./scripts/ci/pre_commit/pre_commit_check_pre_commit_hooks.py
@@ -888,6 +875,14 @@ repos:
 pass_filenames: true
 exclude: ^airflow/_vendor/
 additional_dependencies: ['rich>=12.4.4', 'inputimeout']
+  - id: check-provider-yaml-valid
+name: Validate provider.yaml files
+pass_filenames: false
+entry: ./scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
+language: python
+require_serial: true
+files: 
^docs/|provider\.yaml$|^scripts/ci/pre_commit/pre_commit_check_provider_yaml_files\.py$
+additional_dependencies: ['rich>=12.4.4', 'inputimeout', 
'markdown-it-py']
   - id: update-migration-references
 name: Update migration ref doc
 language: python
diff --git a/STATIC_CODE_CHECKS.rst b/STATIC_CODE_CHECKS.rst
index 7495044f3d..b2b6081b5f 100644
--- a/STATIC_CODE_CHECKS.rst
+++ b/STATIC_CODE_CHECKS.rst
@@ -195,7 +195,7 @@ require Breeze Docker image to be build locally.
 
++--+-+
 | check-provide-create-sessions-imports  | Check 
provide_session and create_session imports | |
 
++--+-+
-| check-provider-yaml-valid  | Validate 
providers.yaml files| |
+| check-provider-yaml-valid  | Validate 
provider.yaml files | *   |
 
++--+-+
 | check-providers-init-file-missing  | Provider init file 
is missing| |
 
++--+-+
diff --git a/airflow/cli/commands/info_command.py 
b/airflow/cli/commands/info_command.py
index 124271c8c9..a8a7c760ab 100644
--- a/airflow/cli/commands/info_command.py
+++ b/airflow/cli/commands/info_command.py
@@ -176,6 +176,7 @@ _MACHINE_TO_ARCHITECTURE = {
 "arm64": Architecture.ARM,
 "armv7": Architecture.ARM,
 "armv7l": Architecture.ARM,
+"aarch64": Architecture.ARM,
 }
 
 
diff --git a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py 
b/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
index 5622212f46..f188879200 100755
--- a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
+++ b/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
@@ -17,392 +17,47 @@
 # under the License.
 from __future__ import annotations
 
-import json
-import pathlib
+import os
 import sys
-import textwrap
-from collections import Counter
-from itertools import chain, product
-from typing import Any, Iterable
+from pathlib import Path
 
-import jsonschema
-import yaml
-from jsonpath_ng.ext imp

[airflow] branch v2-5-test updated (93a7e5fc18 -> 2f5060edef)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


 discard 93a7e5fc18 Change Architecture and OperatingSystem classes into Enums 
(#28627)
 discard d65fd66e46 Update pre-commit hooks (#28567)
 discard 0902436028 Add back join to zombie query that was dropped in #28198 
(#28544)
 discard 92c0b34571 Cleanup and do housekeeping with plugin examples (#28537)
 discard df9cd4c9b7 Improve provider validation pre-commit (#28516)
 discard 32a2bb67fd Remove extra H1 & improve formatting of Listeners docs page 
(#28450)
 new b607a287b8 Improve provider validation pre-commit (#28516)
 new 6d65da72db Add back join to zombie query that was dropped in #28198 
(#28544)
 new f56fd84b4c Update pre-commit hooks (#28567)
 new 2f5060edef Change Architecture and OperatingSystem classes into Enums 
(#28627)

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (93a7e5fc18)
\
 N -- N -- N   refs/heads/v2-5-test (2f5060edef)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .dockerignore  |   1 +
 .../airflow_breeze/utils/docker_command_utils.py   |   1 +
 dev/breeze/tests/test_commands.py  |   8 +-
 docs/apache-airflow/empty_plugin/empty_plugin.py   |  60 ---
 docs/apache-airflow/howto/custom-view-plugin.rst   |  72 ++--
 docs/apache-airflow/listeners.rst  |  54 +++---
 docs/apache-airflow/plugins.rst|  12 --
 .../empty_plugin => metastore_browser}/README.md   |   4 +-
 metastore_browser/hive_metastore.py| 199 +
 .../templates/metastore_browser/base.html  |  20 ++-
 .../templates/metastore_browser/db.html|  36 ++--
 .../templates/metastore_browser/dbs.html   |  11 +-
 .../templates/metastore_browser/table.html | 152 
 scripts/ci/docker-compose/local.yml|   3 +
 14 files changed, 495 insertions(+), 138 deletions(-)
 delete mode 100644 docs/apache-airflow/empty_plugin/empty_plugin.py
 rename {docs/apache-airflow/empty_plugin => metastore_browser}/README.md (90%)
 create mode 100644 metastore_browser/hive_metastore.py
 rename docs/apache-airflow/empty_plugin/templates/empty_plugin/index.html => 
metastore_browser/templates/metastore_browser/base.html (57%)
 copy airflow/www/templates/airflow/xcom.html => 
metastore_browser/templates/metastore_browser/db.html (57%)
 copy airflow/www/templates/airflow/noaccess.html => 
metastore_browser/templates/metastore_browser/dbs.html (83%)
 create mode 100644 metastore_browser/templates/metastore_browser/table.html



[airflow] 04/04: Change Architecture and OperatingSystem classes into Enums (#28627)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 2f5060edefa36338081ed1ffd7e12d669b02fe82
Author: Jarek Potiuk 
AuthorDate: Mon Jan 2 05:58:54 2023 +0100

Change Architecture and OperatingSystem classes into Enums (#28627)

Since they are objects already, there is very little overhead in
making them Enums, and it has the nice property of enabling type
hints for the returned values.

(cherry picked from commit 8a15557f6fe73feab0e49f97b295160820ad7cfd)
---
 airflow/cli/commands/info_command.py   | 22 +-
 .../in_container/run_provider_yaml_files_check.py  |  2 +-
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/airflow/cli/commands/info_command.py 
b/airflow/cli/commands/info_command.py
index a8a7c760ab..7261dfc484 100644
--- a/airflow/cli/commands/info_command.py
+++ b/airflow/cli/commands/info_command.py
@@ -23,6 +23,7 @@ import os
 import platform
 import subprocess
 import sys
+from enum import Enum
 from urllib.parse import urlsplit, urlunsplit
 
 import httpx
@@ -124,16 +125,17 @@ class PiiAnonymizer(Anonymizer):
 return urlunsplit((url_parts.scheme, netloc, url_parts.path, 
url_parts.query, url_parts.fragment))
 
 
-class OperatingSystem:
+class OperatingSystem(Enum):
 """Operating system."""
 
 WINDOWS = "Windows"
 LINUX = "Linux"
 MACOSX = "Mac OS"
 CYGWIN = "Cygwin"
+UNKNOWN = "Unknown"
 
 @staticmethod
-def get_current() -> str | None:
+def get_current() -> OperatingSystem:
 """Get current operating system."""
 if os.name == "nt":
 return OperatingSystem.WINDOWS
@@ -143,24 +145,26 @@ class OperatingSystem:
 return OperatingSystem.MACOSX
 elif "cygwin" in sys.platform:
 return OperatingSystem.CYGWIN
-return None
+return OperatingSystem.UNKNOWN
 
 
-class Architecture:
+class Architecture(Enum):
 """Compute architecture."""
 
 X86_64 = "x86_64"
 X86 = "x86"
 PPC = "ppc"
 ARM = "arm"
+UNKNOWN = "unknown"
 
 @staticmethod
-def get_current():
+def get_current() -> Architecture:
 """Get architecture."""
-return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+current_architecture = 
_MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+return current_architecture if current_architecture else 
Architecture.UNKNOWN
 
 
-_MACHINE_TO_ARCHITECTURE = {
+_MACHINE_TO_ARCHITECTURE: dict[str, Architecture] = {
 "amd64": Architecture.X86_64,
 "x86_64": Architecture.X86_64,
 "i686-64": Architecture.X86_64,
@@ -259,8 +263,8 @@ class AirflowInfo:
 python_version = sys.version.replace("\n", " ")
 
 return [
-("OS", operating_system or "NOT AVAILABLE"),
-("architecture", arch or "NOT AVAILABLE"),
+("OS", operating_system.value),
+("architecture", arch.value),
 ("uname", str(uname)),
 ("locale", str(_locale)),
 ("python_version", python_version),
diff --git a/scripts/in_container/run_provider_yaml_files_check.py 
b/scripts/in_container/run_provider_yaml_files_check.py
index cab365eb21..c2cfe565ae 100755
--- a/scripts/in_container/run_provider_yaml_files_check.py
+++ b/scripts/in_container/run_provider_yaml_files_check.py
@@ -444,7 +444,7 @@ def 
check_providers_have_all_documentation_files(yaml_files: dict[str, dict]):
 
 
 if __name__ == "__main__":
-architecture = Architecture().get_current()
+architecture = Architecture.get_current()
 console.print(f"Verifying packages on {architecture} architecture. 
Platform: {platform.machine()}.")
 provider_files_pattern = 
pathlib.Path(ROOT_DIR).glob("airflow/providers/**/provider.yaml")
 all_provider_files = sorted(str(path) for path in provider_files_pattern)
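A small usage sketch of the changed API (illustrative): callers now always
receive an enum member and read ``.value`` for display, instead of handling
``None``:

    # Sketch: get_current() never returns None any more; unknown platforms map
    # to the UNKNOWN member instead.
    from airflow.cli.commands.info_command import Architecture, OperatingSystem

    arch = Architecture.get_current()           # e.g. Architecture.X86_64
    current_os = OperatingSystem.get_current()  # e.g. OperatingSystem.LINUX
    print(f"OS: {current_os.value}, architecture: {arch.value}")
    assert isinstance(arch, Architecture)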



[airflow] 03/04: Update pre-commit hooks (#28567)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit f56fd84b4c6f1d2a3d5491458df869263b6b3989
Author: KarshVashi <41749592+karshva...@users.noreply.github.com>
AuthorDate: Sat Dec 24 01:03:59 2022 +

Update pre-commit hooks (#28567)

(cherry picked from commit 837e0fe2ea8859ae879d8382142c29a6416f02b9)
---
 .pre-commit-config.yaml  |  8 
 airflow/www/fab_security/manager.py  |  2 +-
 .../src/airflow_breeze/commands/testing_commands.py  |  2 +-
 dev/provider_packages/prepare_provider_packages.py   | 16 
 docs/exts/docs_build/docs_builder.py |  4 ++--
 docs/exts/extra_files_with_substitutions.py  |  2 +-
 docs/exts/provider_init_hack.py  |  2 +-
 kubernetes_tests/test_base.py|  2 +-
 tests/jobs/test_triggerer_job.py |  2 +-
 9 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index a6ed9b1f4d..577f0a1dda 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -148,7 +148,7 @@ repos:
   
\.cfg$|\.conf$|\.ini$|\.ldif$|\.properties$|\.readthedocs$|\.service$|\.tf$|Dockerfile.*$
  # Keep version of black in sync with blacken-docs and pre-commit-hook-names
   - repo: https://github.com/psf/black
-rev: 22.3.0
+rev: 22.12.0
 hooks:
   - id: black
 name: Run black (python formatter)
@@ -210,7 +210,7 @@ repos:
 pass_filenames: true
   # TODO: Bump to Python 3.8 when support for Python 3.7 is dropped in Airflow.
   - repo: https://github.com/asottile/pyupgrade
-rev: v2.32.1
+rev: v3.3.1
 hooks:
   - id: pyupgrade
 name: Upgrade Python code automatically
@@ -259,7 +259,7 @@ repos:
   ^airflow/_vendor/
 additional_dependencies: ['toml']
   - repo: https://github.com/asottile/yesqa
-rev: v1.3.0
+rev: v1.4.0
 hooks:
   - id: yesqa
 name: Remove unnecessary noqa statements
@@ -268,7 +268,7 @@ repos:
   ^airflow/_vendor/
 additional_dependencies: ['flake8>=4.0.1']
   - repo: https://github.com/ikamensh/flynt
-rev: '0.76'
+rev: '0.77'
 hooks:
   - id: flynt
 name: Run flynt string format converter for Python
diff --git a/airflow/www/fab_security/manager.py 
b/airflow/www/fab_security/manager.py
index 96649b046b..ea8918053c 100644
--- a/airflow/www/fab_security/manager.py
+++ b/airflow/www/fab_security/manager.py
@@ -1013,7 +1013,7 @@ class BaseSecurityManager:
 
 @staticmethod
 def ldap_extract(ldap_dict: dict[str, list[bytes]], field_name: str, 
fallback: str) -> str:
-raw_value = ldap_dict.get(field_name, [bytes()])
+raw_value = ldap_dict.get(field_name, [b""])
 # decode - if empty string, default to fallback, otherwise take first 
element
 return raw_value[0].decode("utf-8") or fallback
 
diff --git a/dev/breeze/src/airflow_breeze/commands/testing_commands.py 
b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
index 33781e3373..58d0b509a9 100644
--- a/dev/breeze/src/airflow_breeze/commands/testing_commands.py
+++ b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
@@ -181,7 +181,7 @@ def _run_test(
 for container_id in container_ids:
 dump_path = FILES_DIR / 
f"container_logs_{container_id}_{date_str}.log"
 get_console(output=output).print(f"[info]Dumping container 
{container_id} to {dump_path}")
-with open(dump_path, "wt") as outfile:
+with open(dump_path, "w") as outfile:
 run_command(["docker", "logs", container_id], check=False, 
stdout=outfile)
 finally:
 run_command(
diff --git a/dev/provider_packages/prepare_provider_packages.py 
b/dev/provider_packages/prepare_provider_packages.py
index 2ef0859c89..ed1afb1e8f 100755
--- a/dev/provider_packages/prepare_provider_packages.py
+++ b/dev/provider_packages/prepare_provider_packages.py
@@ -1110,7 +1110,7 @@ def prepare_readme_file(context):
 template_name="PROVIDER_README", context=context, extension=".rst"
 )
 readme_file_path = os.path.join(TARGET_PROVIDER_PACKAGES_PATH, 
"README.rst")
-with open(readme_file_path, "wt") as readme_file:
+with open(readme_file_path, "w") as readme_file:
 readme_file.write(readme_content)
 
 
@@ -1182,7 +1182,7 @@ def 
mark_latest_changes_as_documentation_only(provider_package_id: str, latest_c
 "as doc-only changes!"
 )
 with open(
-os.path.join(provider_details.source_provider_package_path, 
".latest-doc-only-change.txt"), "tw"
+os.path.join(provider_details.source_provider_package_path, 
".latest-doc-only-change.txt"), "w"
 ) as f:
 f.write(la

[airflow-site] branch gh-pages updated: Deploying to gh-pages from @ 4d56e84e39ee4fc1cdeb4b0e90a77bb78e3cddf2 🚀

2023-01-11 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/gh-pages by this push:
 new fbf3d6650c Deploying to gh-pages from  @ 
4d56e84e39ee4fc1cdeb4b0e90a77bb78e3cddf2 🚀
fbf3d6650c is described below

commit fbf3d6650c9dcba660e5f981583fb96d47c7ffce
Author: potiuk 
AuthorDate: Wed Jan 11 23:52:42 2023 +

Deploying to gh-pages from  @ 4d56e84e39ee4fc1cdeb4b0e90a77bb78e3cddf2 🚀
---
 blog/airflow-1.10.10/index.html|   4 +-
 blog/airflow-1.10.12/index.html|   4 +-
 blog/airflow-1.10.8-1.10.9/index.html  |   4 +-
 blog/airflow-2.2.0/index.html  |   4 +-
 blog/airflow-2.3.0/index.html  |   4 +-
 blog/airflow-2.4.0/index.html  |   4 +-
 blog/airflow-2.5.0/index.html  |   4 +-
 blog/airflow-survey-2020/index.html|   4 +-
 blog/airflow-survey-2022/index.html|   4 +-
 blog/airflow-survey/index.html |   4 +-
 blog/airflow-two-point-oh-is-here/index.html   |   4 +-
 blog/airflow_summit_2021/index.html|   4 +-
 blog/airflow_summit_2022/index.html|   4 +-
 blog/announcing-new-website/index.html |   4 +-
 blog/apache-airflow-for-newcomers/index.html   |   4 +-
 .../index.html |   4 +-
 .../index.html |   4 +-
 .../index.html |   4 +-
 .../index.html |   4 +-
 .../index.html |   4 +-
 .../index.html |   4 +-
 docs/apache-airflow/2.5.0/_static/globaltoc.js |  25 ++
 docs/apache-airflow/2.5.0/best-practices.html  |   2 +-
 .../2.5.0/cli-and-env-variables-ref.html   |   2 +-
 docs/apache-airflow/2.5.0/configurations-ref.html  |   2 +-
 docs/apache-airflow/2.5.0/dag-run.html |   2 +-
 docs/apache-airflow/2.5.0/dag-serialization.html   |   2 +-
 docs/apache-airflow/2.5.0/database-erd-ref.html|   2 +-
 .../2.5.0/deprecated-rest-api-ref.html |   2 +-
 docs/apache-airflow/2.5.0/extra-packages-ref.html  |   2 +-
 docs/apache-airflow/2.5.0/faq.html |   2 +-
 docs/apache-airflow/2.5.0/genindex.html|   2 +-
 docs/apache-airflow/2.5.0/index.html   |   2 +-
 docs/apache-airflow/2.5.0/integration.html |   2 +-
 docs/apache-airflow/2.5.0/kubernetes.html  |   2 +-
 docs/apache-airflow/2.5.0/license.html |   2 +-
 docs/apache-airflow/2.5.0/lineage.html |   2 +-
 docs/apache-airflow/2.5.0/listeners.html   |   2 +-
 docs/apache-airflow/2.5.0/modules_management.html  |   2 +-
 .../2.5.0/operators-and-hooks-ref.html |   2 +-
 docs/apache-airflow/2.5.0/plugins.html |   2 +-
 docs/apache-airflow/2.5.0/privacy_notice.html  |   2 +-
 .../2.5.0/production-deployment.html   |   2 +-
 docs/apache-airflow/2.5.0/project.html |   2 +-
 docs/apache-airflow/2.5.0/py-modindex.html |   2 +-
 docs/apache-airflow/2.5.0/release-process.html |   2 +-
 docs/apache-airflow/2.5.0/release_notes.html   |   2 +-
 docs/apache-airflow/2.5.0/templates-ref.html   |   2 +-
 docs/apache-airflow/2.5.0/timezone.html|   2 +-
 docs/apache-airflow/2.5.0/usage-cli.html   |   2 +-
 docs/apache-airflow/stable/_static/globaltoc.js|  25 ++
 docs/apache-airflow/stable/best-practices.html |   2 +-
 .../stable/cli-and-env-variables-ref.html  |   2 +-
 docs/apache-airflow/stable/configurations-ref.html |   2 +-
 docs/apache-airflow/stable/dag-run.html|   2 +-
 docs/apache-airflow/stable/dag-serialization.html  |   2 +-
 docs/apache-airflow/stable/database-erd-ref.html   |   2 +-
 .../stable/deprecated-rest-api-ref.html|   2 +-
 docs/apache-airflow/stable/extra-packages-ref.html |   2 +-
 docs/apache-airflow/stable/faq.html|   2 +-
 docs/apache-airflow/stable/genindex.html   |   2 +-
 docs/apache-airflow/stable/index.html  |   2 +-
 docs/apache-airflow/stable/integration.html|   2 +-
 docs/apache-airflow/stable/kubernetes.html |   2 +-
 docs/apache-airflow/stable/license.html|   2 +-
 docs/apache-airflow/stable/lineage.html|   2 +-
 docs/apache-airflow/stable/listeners.html  |   2 +-
 docs/apache-airflow/stable/modules_management.html |   2 +-
 .../stable/operators-and-hooks-ref.html|   2 +-
 docs/apache-airflow/stable/plugins.html|   2 +-
 docs/apache-airflow/stable/privacy_notice.html |   2 +-
 .../stable/production-deployment.html  |   2 +-
 docs/apache-airflow/stable/p

[airflow] branch v2-5-test updated: Change Architecture and OperatingSystem classes into Enums (#28627)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new 93a7e5fc18 Change Architecture and OperatingSystem classes into Enums 
(#28627)
93a7e5fc18 is described below

commit 93a7e5fc18d31792df26ff5f828bcf36bf3eb21f
Author: Jarek Potiuk 
AuthorDate: Mon Jan 2 05:58:54 2023 +0100

Change Architecture and OperatingSystem classes into Enums (#28627)

Since they are objects already, there is very little overhead in
making them Enums, and it has the nice property of enabling type
hints for the returned values.

(cherry picked from commit 8a15557f6fe73feab0e49f97b295160820ad7cfd)
---
 airflow/cli/commands/info_command.py   | 22 +-
 .../in_container/run_provider_yaml_files_check.py  |  2 +-
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/airflow/cli/commands/info_command.py 
b/airflow/cli/commands/info_command.py
index a8a7c760ab..7261dfc484 100644
--- a/airflow/cli/commands/info_command.py
+++ b/airflow/cli/commands/info_command.py
@@ -23,6 +23,7 @@ import os
 import platform
 import subprocess
 import sys
+from enum import Enum
 from urllib.parse import urlsplit, urlunsplit
 
 import httpx
@@ -124,16 +125,17 @@ class PiiAnonymizer(Anonymizer):
 return urlunsplit((url_parts.scheme, netloc, url_parts.path, 
url_parts.query, url_parts.fragment))
 
 
-class OperatingSystem:
+class OperatingSystem(Enum):
 """Operating system."""
 
 WINDOWS = "Windows"
 LINUX = "Linux"
 MACOSX = "Mac OS"
 CYGWIN = "Cygwin"
+UNKNOWN = "Unknown"
 
 @staticmethod
-def get_current() -> str | None:
+def get_current() -> OperatingSystem:
 """Get current operating system."""
 if os.name == "nt":
 return OperatingSystem.WINDOWS
@@ -143,24 +145,26 @@ class OperatingSystem:
 return OperatingSystem.MACOSX
 elif "cygwin" in sys.platform:
 return OperatingSystem.CYGWIN
-return None
+return OperatingSystem.UNKNOWN
 
 
-class Architecture:
+class Architecture(Enum):
 """Compute architecture."""
 
 X86_64 = "x86_64"
 X86 = "x86"
 PPC = "ppc"
 ARM = "arm"
+UNKNOWN = "unknown"
 
 @staticmethod
-def get_current():
+def get_current() -> Architecture:
 """Get architecture."""
-return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+current_architecture = _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+return current_architecture if current_architecture else Architecture.UNKNOWN
 
 
-_MACHINE_TO_ARCHITECTURE = {
+_MACHINE_TO_ARCHITECTURE: dict[str, Architecture] = {
 "amd64": Architecture.X86_64,
 "x86_64": Architecture.X86_64,
 "i686-64": Architecture.X86_64,
@@ -259,8 +263,8 @@ class AirflowInfo:
 python_version = sys.version.replace("\n", " ")
 
 return [
-("OS", operating_system or "NOT AVAILABLE"),
-("architecture", arch or "NOT AVAILABLE"),
+("OS", operating_system.value),
+("architecture", arch.value),
 ("uname", str(uname)),
 ("locale", str(_locale)),
 ("python_version", python_version),
diff --git a/scripts/in_container/run_provider_yaml_files_check.py b/scripts/in_container/run_provider_yaml_files_check.py
index cab365eb21..c2cfe565ae 100755
--- a/scripts/in_container/run_provider_yaml_files_check.py
+++ b/scripts/in_container/run_provider_yaml_files_check.py
@@ -444,7 +444,7 @@ def check_providers_have_all_documentation_files(yaml_files: dict[str, dict]):
 
 
 if __name__ == "__main__":
-architecture = Architecture().get_current()
+architecture = Architecture.get_current()
 console.print(f"Verifying packages on {architecture} architecture. Platform: {platform.machine()}.")
 provider_files_pattern = pathlib.Path(ROOT_DIR).glob("airflow/providers/**/provider.yaml")
 all_provider_files = sorted(str(path) for path in provider_files_pattern)
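
The practical payoff of the conversion is that callers now receive a
closed, typed set of values instead of "str | None". A minimal sketch of
calling code relying on this (the import path is taken from the diff
header above; the sketch itself is not part of the commit):

    from airflow.cli.commands.info_command import Architecture, OperatingSystem

    current_os = OperatingSystem.get_current()  # typed as OperatingSystem, never None
    if current_os is OperatingSystem.LINUX:
        print("Running on Linux")
    elif current_os is OperatingSystem.UNKNOWN:
        print("Could not detect the operating system")

    arch = Architecture.get_current()  # falls back to Architecture.UNKNOWN
    print(f"architecture: {arch.value}")  # e.g. "x86_64" or "unknown"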



[GitHub] [airflow] pierrejeambrun commented on pull request #28619: Fix code readability, add docstrings to json_client

2023-01-11 Thread GitBox


pierrejeambrun commented on PR #28619:
URL: https://github.com/apache/airflow/pull/28619#issuecomment-1379624678

   Needs to be applied on top of
`https://github.com/apache/airflow/pull/27640`, after which there would be
no conflicts. Marking for 2.6.0, as it would be easier to handle then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow-site] branch main updated: Adding correct reference to globaltoc.js (#715)

2023-01-11 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/main by this push:
 new 4d56e84e39 Adding correct reference to globaltoc.js (#715)
4d56e84e39 is described below

commit 4d56e84e39ee4fc1cdeb4b0e90a77bb78e3cddf2
Author: Amogh Desai 
AuthorDate: Thu Jan 12 05:13:14 2023 +0530

Adding correct reference to globaltoc.js (#715)

Co-authored-by: Amogh 
---
 .../apache-airflow/2.5.0/_static/globaltoc.js  | 25 ++
 .../apache-airflow/2.5.0/best-practices.html   |  2 +-
 .../2.5.0/cli-and-env-variables-ref.html   |  2 +-
 .../apache-airflow/2.5.0/configurations-ref.html   |  2 +-
 docs-archive/apache-airflow/2.5.0/dag-run.html |  2 +-
 .../apache-airflow/2.5.0/dag-serialization.html|  2 +-
 .../apache-airflow/2.5.0/database-erd-ref.html |  2 +-
 .../2.5.0/deprecated-rest-api-ref.html |  2 +-
 .../apache-airflow/2.5.0/extra-packages-ref.html   |  2 +-
 docs-archive/apache-airflow/2.5.0/faq.html |  2 +-
 docs-archive/apache-airflow/2.5.0/genindex.html|  2 +-
 docs-archive/apache-airflow/2.5.0/index.html   |  2 +-
 docs-archive/apache-airflow/2.5.0/integration.html |  2 +-
 docs-archive/apache-airflow/2.5.0/kubernetes.html  |  2 +-
 docs-archive/apache-airflow/2.5.0/license.html |  2 +-
 docs-archive/apache-airflow/2.5.0/lineage.html |  2 +-
 docs-archive/apache-airflow/2.5.0/listeners.html   |  2 +-
 .../apache-airflow/2.5.0/modules_management.html   |  2 +-
 .../2.5.0/operators-and-hooks-ref.html |  2 +-
 docs-archive/apache-airflow/2.5.0/plugins.html |  2 +-
 .../apache-airflow/2.5.0/privacy_notice.html   |  2 +-
 .../2.5.0/production-deployment.html   |  2 +-
 docs-archive/apache-airflow/2.5.0/project.html |  2 +-
 docs-archive/apache-airflow/2.5.0/py-modindex.html |  2 +-
 .../apache-airflow/2.5.0/release-process.html  |  2 +-
 .../apache-airflow/2.5.0/release_notes.html|  2 +-
 .../apache-airflow/2.5.0/templates-ref.html|  2 +-
 docs-archive/apache-airflow/2.5.0/timezone.html|  2 +-
 docs-archive/apache-airflow/2.5.0/usage-cli.html   |  2 +-
 29 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/docs-archive/apache-airflow/2.5.0/_static/globaltoc.js b/docs-archive/apache-airflow/2.5.0/_static/globaltoc.js
new file mode 100644
index 00..a9aa7ef1f5
--- /dev/null
+++ b/docs-archive/apache-airflow/2.5.0/_static/globaltoc.js
@@ -0,0 +1,25 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+// The script adds a 'click' eventListener to the 'span.toctree-expand' element that makes the active submenus collapsible.
+// To make the submenus collapsible, it is sufficient to add/remove the 'current' class on the active 'li.toctree-l1' element.
+$("span.toctree-expand").on("click", function() {
+$(this).closest("li").toggleClass("current");
+  });
+  
\ No newline at end of file
diff --git a/docs-archive/apache-airflow/2.5.0/best-practices.html b/docs-archive/apache-airflow/2.5.0/best-practices.html
index eaa1a322be..e7d025f727 100644
--- a/docs-archive/apache-airflow/2.5.0/best-practices.html
+++ b/docs-archive/apache-airflow/2.5.0/best-practices.html
@@ -1802,7 +1802,7 @@ the full lifecycle of a DAG - from parsing to execution.
 
 
 
-
+
 
 
 
\ No newline at end of file
diff --git a/docs-archive/apache-airflow/2.5.0/cli-and-env-variables-ref.html b/docs-archive/apache-airflow/2.5.0/cli-and-env-variables-ref.html
index e388d92b7e..7798069605 100644
--- a/docs-archive/apache-airflow/2.5.0/cli-and-env-variables-ref.html
+++ b/docs-archive/apache-airflow/2.5.0/cli-and-env-variables-ref.html
@@ -5801,7 +5801,7 @@ Replace the {KEY}
 
 
-
+
 
 
 
\ No newline at end of file
diff --git a/docs-archive/apache-airflow/2.5.0/configurations-ref.html b/docs-archive/apache-airflow/2.5.0/configurations-ref.html
index 5b4fc135ca..63e0816713 100644
--- a/docs-archive/apache-airflow/2.5.0/configurations-ref.html
+++ b/docs-archive/apa

[GitHub] [airflow-site] potiuk closed issue #713: Docs site making request to globaltoc.js but it doesn't exist

2023-01-11 Thread GitBox


potiuk closed issue #713: Docs site making request to globaltoc.js but it 
doesn't exist
URL: https://github.com/apache/airflow-site/issues/713





[GitHub] [airflow-site] potiuk merged pull request #715: Adding correct reference to globaltoc.js

2023-01-11 Thread GitBox


potiuk merged PR #715:
URL: https://github.com/apache/airflow-site/pull/715





[GitHub] [airflow] pierrejeambrun commented on pull request #28568: Update codespell and fix typos

2023-01-11 Thread GitBox


pierrejeambrun commented on PR #28568:
URL: https://github.com/apache/airflow/pull/28568#issuecomment-1379619168

   Needs to be applied on top of https://github.com/apache/airflow/pull/27822 
which will most certainly make it into 2.6.0 (improvement). Marking for 2.6.0 
for now and skipping.





[GitHub] [airflow] potiuk commented on pull request #28406: Add defer mode to GKECreateClusterOperator and GKEDeleteClusterOperator

2023-01-11 Thread GitBox


potiuk commented on PR #28406:
URL: https://github.com/apache/airflow/pull/28406#issuecomment-1379612860

   conflicts.





[GitHub] [airflow] potiuk commented on pull request #27833: Add deferrable mode for Big Query Transfer operator

2023-01-11 Thread GitBox


potiuk commented on PR #27833:
URL: https://github.com/apache/airflow/pull/27833#issuecomment-1379613388

   conflicts as well (and comments to answer).





[airflow] branch v2-5-test updated: Update pre-commit hooks (#28567)

2023-01-11 Thread pierrejeambrun
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v2-5-test by this push:
 new d65fd66e46 Update pre-commit hooks (#28567)
d65fd66e46 is described below

commit d65fd66e463ed8ce98d4060b7c8ae79626618277
Author: KarshVashi <41749592+karshva...@users.noreply.github.com>
AuthorDate: Sat Dec 24 01:03:59 2022 +

Update pre-commit hooks (#28567)

(cherry picked from commit 837e0fe2ea8859ae879d8382142c29a6416f02b9)
---
 .pre-commit-config.yaml  |  8 
 airflow/www/fab_security/manager.py  |  2 +-
 .../src/airflow_breeze/commands/testing_commands.py  |  2 +-
 dev/provider_packages/prepare_provider_packages.py   | 16 
 docs/exts/docs_build/docs_builder.py |  4 ++--
 docs/exts/extra_files_with_substitutions.py  |  2 +-
 docs/exts/provider_init_hack.py  |  2 +-
 kubernetes_tests/test_base.py|  2 +-
 tests/jobs/test_triggerer_job.py |  2 +-
 9 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index a6ed9b1f4d..577f0a1dda 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -148,7 +148,7 @@ repos:
   
\.cfg$|\.conf$|\.ini$|\.ldif$|\.properties$|\.readthedocs$|\.service$|\.tf$|Dockerfile.*$
  # Keep version of black in sync with blacken-docs and pre-commit-hook-names
   - repo: https://github.com/psf/black
-rev: 22.3.0
+rev: 22.12.0
 hooks:
   - id: black
 name: Run black (python formatter)
@@ -210,7 +210,7 @@ repos:
 pass_filenames: true
   # TODO: Bump to Python 3.8 when support for Python 3.7 is dropped in Airflow.
   - repo: https://github.com/asottile/pyupgrade
-rev: v2.32.1
+rev: v3.3.1
 hooks:
   - id: pyupgrade
 name: Upgrade Python code automatically
@@ -259,7 +259,7 @@ repos:
   ^airflow/_vendor/
 additional_dependencies: ['toml']
   - repo: https://github.com/asottile/yesqa
-rev: v1.3.0
+rev: v1.4.0
 hooks:
   - id: yesqa
 name: Remove unnecessary noqa statements
@@ -268,7 +268,7 @@ repos:
   ^airflow/_vendor/
 additional_dependencies: ['flake8>=4.0.1']
   - repo: https://github.com/ikamensh/flynt
-rev: '0.76'
+rev: '0.77'
 hooks:
   - id: flynt
 name: Run flynt string format converter for Python
diff --git a/airflow/www/fab_security/manager.py b/airflow/www/fab_security/manager.py
index 96649b046b..ea8918053c 100644
--- a/airflow/www/fab_security/manager.py
+++ b/airflow/www/fab_security/manager.py
@@ -1013,7 +1013,7 @@ class BaseSecurityManager:
 
 @staticmethod
def ldap_extract(ldap_dict: dict[str, list[bytes]], field_name: str, fallback: str) -> str:
-raw_value = ldap_dict.get(field_name, [bytes()])
+raw_value = ldap_dict.get(field_name, [b""])
 # decode - if empty string, default to fallback, otherwise take first element
 return raw_value[0].decode("utf-8") or fallback
 
diff --git a/dev/breeze/src/airflow_breeze/commands/testing_commands.py b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
index 33781e3373..58d0b509a9 100644
--- a/dev/breeze/src/airflow_breeze/commands/testing_commands.py
+++ b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
@@ -181,7 +181,7 @@ def _run_test(
 for container_id in container_ids:
 dump_path = FILES_DIR / f"container_logs_{container_id}_{date_str}.log"
 get_console(output=output).print(f"[info]Dumping container {container_id} to {dump_path}")
-with open(dump_path, "wt") as outfile:
+with open(dump_path, "w") as outfile:
 run_command(["docker", "logs", container_id], check=False, stdout=outfile)
 finally:
 run_command(
diff --git a/dev/provider_packages/prepare_provider_packages.py b/dev/provider_packages/prepare_provider_packages.py
index 2ef0859c89..ed1afb1e8f 100755
--- a/dev/provider_packages/prepare_provider_packages.py
+++ b/dev/provider_packages/prepare_provider_packages.py
@@ -1110,7 +1110,7 @@ def prepare_readme_file(context):
 template_name="PROVIDER_README", context=context, extension=".rst"
 )
 readme_file_path = os.path.join(TARGET_PROVIDER_PACKAGES_PATH, "README.rst")
-with open(readme_file_path, "wt") as readme_file:
+with open(readme_file_path, "w") as readme_file:
 readme_file.write(readme_content)
 
 
@@ -1182,7 +1182,7 @@ def mark_latest_changes_as_documentation_only(provider_package_id: str, latest_c
 "as doc-only changes!"
 )
 with open(
-os.path.join(provider_details.source_provider_package_path, ".latest-doc-only

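
The mechanical rewrites in the diff above ("wt" to "w", bytes() to b"")
are behaviour-preserving modernizations that come with the updated hooks
(for example pyupgrade). A minimal standalone sketch of the two
equivalences, using only the standard library (not part of the commit):

    import os
    import tempfile

    # "wt" and "w" are the same open mode: text mode ("t") is the default.
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    with open(path, "wt") as f_old:
        f_old.write("same mode")
    with open(path, "w") as f_new:
        f_new.write("same mode")

    # bytes() and b"" are both the empty bytes object, so the ldap_extract
    # default decodes to "" and the explicit fallback value still wins.
    assert bytes() == b""
    assert ({}.get("cn", [b""])[0].decode("utf-8") or "fallback") == "fallback"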