[GitHub] [airflow] potiuk commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


potiuk commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1438000722

   I am inclined to close it as "won't do" :)





[GitHub] [airflow] Taragolis commented on pull request #29616: Refactor docker-compose quick start test

2023-02-20 Thread via GitHub


Taragolis commented on PR #29616:
URL: https://github.com/apache/airflow/pull/29616#issuecomment-1437988917

   Something new 😮 
   
   ```console
airflow-scheduler_1  | 
airflow-scheduler_1  | BACKEND=redis
airflow-scheduler_1  | DB_HOST=redis
airflow-scheduler_1  | DB_PORT=6379
airflow-scheduler_1  | 
airflow-scheduler_1  | /home/airflow/.local/lib/python3.7/site-packages/airflow/models/base.py:49 MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings. Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
airflow-scheduler_1  | (Airflow ASCII-art banner)
airflow-scheduler_1  | [2023-02-21T07:28:42.954+] {executor_loader.py:114} INFO - Loaded executor: CeleryExecutor
airflow-scheduler_1  | [2023-02-21T07:28:43.045+] {scheduler_job.py:724} INFO - Starting the scheduler
airflow-scheduler_1  | [2023-02-21T07:28:43.054+] {scheduler_job.py:731} INFO - Processing each file at most -1 times
airflow-scheduler_1  | [2023-02-21T07:28:43.061+] {manager.py:164} INFO - Launched DagFileProcessorManager with pid: 32
airflow-scheduler_1  | [2023-02-21T07:28:43.071+] {scheduler_job.py:1437} INFO - Resetting orphaned tasks for active dag runs
airflow-scheduler_1  | [2023-02-21T07:28:43.080+] {settings.py:61} INFO - Configured default timezone Timezone('UTC')
airflow-scheduler_1  | [2023-02-21T07:28:46.099+] {scheduler_job.py:788} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
airflow-scheduler_1  | Traceback (most recent call last):
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 771, in _execute
airflow-scheduler_1  |     self._run_scheduler_loop()
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 899, in _run_scheduler_loop
airflow-scheduler_1  |     num_queued_tis = self._do_scheduling(session)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1006, in _do_scheduling
airflow-scheduler_1  |     num_queued_tis = self._critical_section_enqueue_task_instances(session=session)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 589, in _critical_section_enqueue_task_instances
airflow-scheduler_1  |     queued_tis = self._executable_task_instances_to_queued(max_tis, session=session)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 280, in _executable_task_instances_to_queued
airflow-scheduler_1  |     pools = Pool.slots_stats(lock_rows=True, session=session)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/session.py", line 73, in wrapper
airflow-scheduler_1  |     return func(*args, **kwargs)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/pool.py", line 174, in slots_stats
airflow-scheduler_1  |     .group_by(TaskInstance.pool, TaskInstance.state)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2773, in all
airflow-scheduler_1  |     return self._iter().all()
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1129, in all
airflow-scheduler_1  |     return self._allrows()
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 401, in _allrows
airflow-scheduler_1  |     rows = self._fetchall_impl()
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1813, in _fetchall_impl
airflow-scheduler_1  |     return list(self.iterator)
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 147, in chunks
airflow-scheduler_1  |     fetch = cursor._raw_all_rows()
airflow-scheduler_1  |   File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", l
   ```

[GitHub] [airflow] eladkal commented on issue #29607: Status of testing Providers that were prepared on February 18, 2023

2023-02-20 Thread via GitHub


eladkal commented on issue #29607:
URL: https://github.com/apache/airflow/issues/29607#issuecomment-1437961577

   > #29552 Tested and it works. From my perspective there are 2 issues:
   > 
   > * proper documentation on how to use the `drive_folder` and `folder_id` arguments together (as they can be confusing)
   > * the log doesn't print the exact path where the file will be located (I believe the log should either not print the path or print it correctly; is it worth making X calls to walk up the folder tree, since Google returns only the parent directory?)
   > 
   > I am happy to address them (in this or the next release); just let me know your thoughts.
   
   Feel free to raise a PR to address this. Since these are not regressions, it will not block this release.





[GitHub] [airflow] Lee-W closed pull request #29649: Livy operator async (fixing failed test in PR 29047)

2023-02-20 Thread via GitHub


Lee-W closed pull request #29649: Livy operator async (fixing failed test in PR 
29047)
URL: https://github.com/apache/airflow/pull/29649





[GitHub] [airflow] Lee-W commented on pull request #29649: Livy operator async (fixing failed test in PR 29047)

2023-02-20 Thread via GitHub


Lee-W commented on PR #29649:
URL: https://github.com/apache/airflow/pull/29649#issuecomment-1437959795

   Thanks @potiuk. I'll close this one and rebase on #29047.





[GitHub] [airflow] potiuk commented on pull request #29649: Livy operator async (fixing failed test in PR 29047)

2023-02-20 Thread via GitHub


potiuk commented on PR #29649:
URL: https://github.com/apache/airflow/pull/29649#issuecomment-1437955076

   Please rebase. I fixed a stuck "queue" for our CI builds, and you need to rebase to re-run the tests.





[GitHub] [airflow] potiuk commented on pull request #29644: Remove <2.0.0 limit on google-cloud-bigtable

2023-02-20 Thread via GitHub


potiuk commented on PR #29644:
URL: https://github.com/apache/airflow/pull/29644#issuecomment-1437952272

   > @uranusjr I think I got everything. The last test failed due to a 2-hour queue timeout.
   
   Yeah - we had a "blocked" build queue - I unblocked it now. Rebasing to re-run.





[GitHub] [airflow] aru-trackunit commented on issue #29607: Status of testing Providers that were prepared on February 18, 2023

2023-02-20 Thread via GitHub


aru-trackunit commented on issue #29607:
URL: https://github.com/apache/airflow/issues/29607#issuecomment-1437952596

   #29552 Tested and it works.
   From my perspective there are 2 issues:
   - proper documentation on how to use the `drive_folder` and `folder_id` arguments together (as they can be confusing)
   - the log doesn't print the exact path where the file will be located (I believe the log should either not print the path or print it correctly; is it worth making X calls to walk up the folder tree, since Google returns only the parent directory?)
   
   I am happy to address them (in this or the next release); just let me know your thoughts.





[GitHub] [airflow] r-richmond commented on pull request #29644: Remove <2.0.0 limit on google-cloud-bigtable

2023-02-20 Thread via GitHub


r-richmond commented on PR #29644:
URL: https://github.com/apache/airflow/pull/29644#issuecomment-1437945839

   @uranusjr I think I got everything. The last test failed due to a 2-hour queue timeout.





[airflow-ci-infra] branch main updated: Update docker runner to 302.1

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-ci-infra.git


The following commit(s) were added to refs/heads/main by this push:
 new 80680f8  Update docker runner to 302.1
80680f8 is described below

commit 80680f83dca6c63981025855142770518e23e714
Author: Jarek Potiuk 
AuthorDate: Tue Feb 21 07:39:13 2023 +0100

Update docker runner to 302.1
---
 github-runner-ami/packer/vars/variables.pkrvars.hcl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/github-runner-ami/packer/vars/variables.pkrvars.hcl 
b/github-runner-ami/packer/vars/variables.pkrvars.hcl
index 519b4ce..66547db 100644
--- a/github-runner-ami/packer/vars/variables.pkrvars.hcl
+++ b/github-runner-ami/packer/vars/variables.pkrvars.hcl
@@ -19,5 +19,5 @@ vpc_id = "vpc-d73487bd"
 ami_name = "airflow-runner-ami"
 aws_regions = ["eu-central-1", "us-east-2"]
 packer_role_arn = "arn:aws:iam::827901512104:role/packer-role"
-runner_version = "2.301.1-airflow1"
+runner_version = "2.302.1-airflow1"
 session_manager_instance_profile_name = "packer_ssm_instance_profile"



[GitHub] [airflow] potiuk closed issue #29621: Fix adding annotations for dag persistence PVC

2023-02-20 Thread via GitHub


potiuk closed issue #29621: Fix adding annotations for dag persistence PVC
URL: https://github.com/apache/airflow/issues/29621





[GitHub] [airflow] potiuk closed issue #28745: annotations in logs pvc

2023-02-20 Thread via GitHub


potiuk closed issue #28745: annotations in logs pvc
URL: https://github.com/apache/airflow/issues/28745





[GitHub] [airflow] potiuk commented on issue #25297: on_failure_callback is not called when task is terminated externally

2023-02-20 Thread via GitHub


potiuk commented on issue #25297:
URL: https://github.com/apache/airflow/issues/25297#issuecomment-1437923131

   @seub - please provide a reproducible case and logs if you would like to report it. It might be something specific to your case, and to be brutally honest, it's much easier for the reporter of one issue to provide the evidence than for maintainers, who look at hundreds of issues and PRs a day, to try to reproduce an issue that someone claims is happening without showing their evidence.
   
   Please be empathetic towards maintainers, who have a lot of things to handle. If you report "it does not work for me" on an issue that is already closed, you will nearly universally be asked to report the issue and provide new evidence in a new issue.
   
   





[GitHub] [airflow] potiuk commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


potiuk commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1437918635

   > As an alternative and maybe simpler feature it'd also suit my use case if we could make it so that we are optionally able to "reset" the execution timer when the task resumes from deferral. Does that sound more reasonable?
   
   Not really - because the task can be resumed from deferral for different reasons. Many of the deferred tasks will have several different resumable functions and might go through different stages. For example, they can resume a few times in function a, a different number of times in function b (both usually quick), then run for a long time in function c, and then resume several more times in function d (also usually quick).
   
   Each of those resumed runs has different expectations on how long it can run on its own - some of the checks will be quick, some might take longer. So if you would like to have a timeout separately for each resume, you would have to add a feature for configuring a different timeout for each resumed function. And that gets rather complex to configure, manage, and keep control of.
   
   I think the cumulative time is the only "reasonable" approach, but as @andrewgodwin confirmed - it's non-trivial and inconsistent with sensors.
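   To make that concrete, here is a minimal sketch of such a multi-stage deferrable operator (the operator, stage names, and trigger delays are hypothetical, invented only to illustrate the point), using Airflow's `self.defer()` API:
   
   ```python
# Hypothetical multi-stage deferrable operator: each defer() resumes in a
# different method, and each resumed segment has a very different expected
# runtime, so no single per-resume timeout would fit all of them.
from datetime import timedelta

from airflow.models.baseoperator import BaseOperator
from airflow.triggers.temporal import TimeDeltaTrigger


class MultiStageOperator(BaseOperator):
    def execute(self, context):
        # Stage a: quick submission, then defer until the job is registered.
        self.defer(trigger=TimeDeltaTrigger(timedelta(seconds=30)), method_name="stage_b")

    def stage_b(self, context, event=None):
        # Stage b: quick status probe, then defer for the long-running part.
        self.defer(trigger=TimeDeltaTrigger(timedelta(hours=2)), method_name="stage_c")

    def stage_c(self, context, event=None):
        # Stage c: resumed after the long wait; collect and return results.
        self.log.info("Job finished; collecting results")
   ```
   
   With an `execution_timeout` on such a task, the measured time is wall-clock time including the two-hour deferral, which is exactly the behavior discussed above.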





[GitHub] [airflow] andrewgodwin commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


andrewgodwin commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1437917495

   Yes, the changes required here are non-trivial - you'd need some sort of 
extra column to track cumulative runtime and then have the workers add values 
to that whenever they finished running a task section.
   
   Even if we could track such timing, I'd hesitate to then make timeouts use 
it, as that's breaking a core assumption about how those work - would we apply 
the same logic to rescheduling sensors, which act the same way as deferred 
operators? 
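   For a rough picture of the bookkeeping this would imply, a sketch (the field and helper names below are hypothetical, not Airflow's actual schema):
   
   ```python
# Hypothetical bookkeeping for cumulative active runtime; the field and
# helpers are illustrative only, not Airflow's actual model.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TaskRuntime:
    # Would live as an extra column on the task instance row.
    cumulative_runtime: float = 0.0  # seconds spent actually running

    def record_segment(self, started: datetime, finished: datetime) -> None:
        # Workers would add to this after each run segment (initial execution
        # or a resume from deferral); time spent deferred is never counted.
        self.cumulative_runtime += (finished - started).total_seconds()

    def timed_out(self, execution_timeout: timedelta) -> bool:
        # A deferral-aware timeout would compare against active time only.
        return self.cumulative_runtime > execution_timeout.total_seconds()
   ```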





[GitHub] [airflow] potiuk commented on pull request #29616: Refactor docker-compose quick start test

2023-02-20 Thread via GitHub


potiuk commented on PR #29616:
URL: https://github.com/apache/airflow/pull/29616#issuecomment-1437911170

   > That is more interesting. I've seen before that static checks sometimes failed with this particular Docker version, 20.10.23+azure-2, and I haven't seen this happen in Docker without the azure-X suffix.
   
   Those tests seem to fail only on public runners (that's where azure-* is added to the version) - so I think this is just a coincidence.
   
   > That's all; it seems like the scheduler just hangs but the service reports it as healthy. Is this a problem with the recent health-check changes in https://github.com/apache/airflow/pull/29408, or maybe a problem with the simple HTTP server in the scheduler?
   
   I agree. I think the problem was triggered by #29408 - but it would be great to get to the bottom of it.
   





[GitHub] [airflow] uranusjr commented on a diff in pull request #29616: Refactor docker-compose quick start test

2023-02-20 Thread via GitHub


uranusjr commented on code in PR #29616:
URL: https://github.com/apache/airflow/pull/29616#discussion_r1112580793


##
docker_tests/test_docker_compose_quick_start.py:
##
@@ -114,53 +104,60 @@ def wait_for_terminal_dag_state(dag_id, dag_run_id):
             break
 
 
-def test_trigger_dag_and_wait_for_result():
+def test_trigger_dag_and_wait_for_result(tmp_path_factory, monkeypatch):
+    """Simple test which reproduce setup docker-compose environment and trigger example dag."""
+    tmp_dir = tmp_path_factory.mktemp("airflow-quick-start")
+    monkeypatch.chdir(tmp_dir)
+    monkeypatch.setenv("AIRFLOW_IMAGE_NAME", docker_image)
+    monkeypatch.setenv("COMPOSE_PROJECT_NAME", "quick-start")
+
     compose_file_path = (
         SOURCE_ROOT / "docs" / "apache-airflow" / "howto" / "docker-compose" / "docker-compose.yaml"
     )
+    copyfile(compose_file_path, tmp_dir / "docker-compose.yaml")
 
-    with tempfile.TemporaryDirectory() as tmp_dir, tmp_chdir(tmp_dir), mock.patch.dict(
-        "os.environ", AIRFLOW_IMAGE_NAME=docker_image
-    ):
-        copyfile(str(compose_file_path), f"{tmp_dir}/docker-compose.yaml")
-        os.mkdir(f"{tmp_dir}/dags")
-        os.mkdir(f"{tmp_dir}/logs")
-        os.mkdir(f"{tmp_dir}/plugins")
-        (Path(tmp_dir) / ".env").write_text(f"AIRFLOW_UID={subprocess.check_output(['id', '-u']).decode()}\n")
-        print(".emv=", (Path(tmp_dir) / ".env").read_text())
-        copyfile(
-            str(SOURCE_ROOT / "airflow" / "example_dags" / "example_bash_operator.py"),
-            f"{tmp_dir}/dags/example_bash_operator.py",
-        )
+    # Create required directories for docker compose quick start howto
+    for subdir in ("dags", "logs", "plugins"):
+        (tmp_dir / subdir).mkdir()
+
+    dot_env_file = Path(tmp_dir) / ".env"
Review Comment:
   ```suggestion
   dot_env_file = Path(tmp_dir, ".env")
   ```






[GitHub] [airflow] potiuk commented on pull request #29616: Refactor docker-compose quick start test

2023-02-20 Thread via GitHub


potiuk commented on PR #29616:
URL: https://github.com/apache/airflow/pull/29616#issuecomment-1437905137

   Closed/Reopened to re-run.
   
   
   





[GitHub] [airflow] potiuk closed pull request #29616: Refactor docker-compose quick start test

2023-02-20 Thread via GitHub


potiuk closed pull request #29616: Refactor docker-compose quick start test
URL: https://github.com/apache/airflow/pull/29616





[GitHub] [airflow] rkarish closed pull request #29034: Add functionality for operators to template all eligible fields (apac…

2023-02-20 Thread via GitHub


rkarish closed pull request #29034: Add functionality for operators to template 
all eligible fields (apac…
URL: https://github.com/apache/airflow/pull/29034





[GitHub] [airflow] pshrivastava27 commented on issue #29432: Jinja templating doesn't work with container_resources when using dymanic task mapping with Kubernetes Pod Operator

2023-02-20 Thread via GitHub


pshrivastava27 commented on issue #29432:
URL: https://github.com/apache/airflow/issues/29432#issuecomment-1437887986

   > @jose-lpa using `Variable.get()` in the dag script is not recommended 
because the `DagFileProcessor` process the script each X minutes, and it loads 
this variable from the DB. Also this solution works with Variable but not all 
the other jinja templates.
   > 
   > @pshrivastava27 here is a solution for your need
   > 
   > ```python
   > import datetime
   > import os
   > 
   > from airflow import models
   > from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
   >     KubernetesPodOperator,
   > )
   > from kubernetes.client import models as k8s_models
   > 
   > dvt_image = os.environ.get("DVT_IMAGE", "dev")
   > 
   > default_dag_args = {"start_date": datetime.datetime(2022, 1, 1)}
   > 
   > 
   > class PatchedResourceRequirements(k8s_models.V1ResourceRequirements):
   >     template_fields = ("limits", "requests")
   > 
   > 
   > def pod_mem():
   >     return "4000M"
   > 
   > 
   > def pod_cpu():
   >     return "1000m"
   > 
   > 
   > with models.DAG(
   >     "sample_dag",
   >     schedule_interval=None,
   >     default_args=default_dag_args,
   >     render_template_as_native_obj=True,
   >     user_defined_macros={
   >         "pod_mem": pod_mem,
   >         "pod_cpu": pod_cpu,
   >     },
   > ) as dag:
   > 
   >     task_1 = KubernetesPodOperator(
   >         task_id="task_1",
   >         name="task_1",
   >         namespace="default",
   >         image=dvt_image,
   >         cmds=["bash", "-cx"],
   >         arguments=["echo hello"],
   >         service_account_name="sa-k8s",
   >         container_resources=PatchedResourceRequirements(
   >             limits={
   >                 "memory": "{{ pod_mem() }}",
   >                 "cpu": "{{ pod_cpu() }}",
   >             }
   >         ),
   >         startup_timeout_seconds=1800,
   >         get_logs=True,
   >         image_pull_policy="Always",
   >         config_file="/home/airflow/composer_kube_config",
   >         dag=dag,
   >     )
   > 
   >     task_2 = KubernetesPodOperator.partial(
   >         task_id="task_2",
   >         name="task_2",
   >         namespace="default",
   >         image=dvt_image,
   >         cmds=["bash", "-cx"],
   >         service_account_name="sa-k8s",
   >         container_resources=PatchedResourceRequirements(
   >             limits={
   >                 "memory": "{{ pod_mem() }}",
   >                 "cpu": "{{ pod_cpu() }}",
   >             }
   >         ),
   >         startup_timeout_seconds=1800,
   >         get_logs=True,
   >         image_pull_policy="Always",
   >         config_file="/home/airflow/composer_kube_config",
   >         dag=dag,
   >     ).expand(arguments=[["echo hello"]])
   > 
   >     task_1 >> task_2
   > ```
   > 
   > You can do the same for the other classes if needed.
   
   Thanks for the help! @hussein-awala 





[GitHub] [airflow] uranusjr commented on a diff in pull request #29625: Aggressively cache entry points in process

2023-02-20 Thread via GitHub


uranusjr commented on code in PR #29625:
URL: https://github.com/apache/airflow/pull/29625#discussion_r1112560468


##
airflow/utils/entry_points.py:
##
@@ -28,26 +30,33 @@
 
 log = logging.getLogger(__name__)
 
+EPnD = Tuple[metadata.EntryPoint, metadata.Distribution]
 
-def entry_points_with_dist(group: str) -> Iterator[tuple[metadata.EntryPoint, metadata.Distribution]]:
-    """Retrieve entry points of the given group.
-
-    This is like the ``entry_points()`` function from importlib.metadata,
-    except it also returns the distribution the entry_point was loaded from.
-
-    :param group: Filter results to only this entrypoint group
-    :return: Generator of (EntryPoint, Distribution) objects for the specified groups
-    """
+@functools.lru_cache(maxsize=None)
+def _get_grouped_entry_points() -> dict[str, list[EPnD]]:
     loaded: set[str] = set()
+    mapping: dict[str, list[EPnD]] = collections.defaultdict(list)
     for dist in metadata.distributions():
         try:
             key = canonicalize_name(dist.metadata["Name"])

Review Comment:
   I checked the callers and currently this is used by loading 
`airflow.plugins` and `apache_airflow_provider`. Both of these implement their 
own deduplication logic, so I think it is safe to remove this entirely. 
Although this would not actually help the latter case, which still accesses 
`metadata` anyway…
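   
   For context, the caching pattern the diff applies, as a standalone sketch (the function name here is illustrative, not Airflow's actual API):
   
   ```python
# Memoize an expensive one-time scan of installed distributions for the
# lifetime of the process; illustrative names, not Airflow's actual API.
import collections
import functools
from importlib import metadata


@functools.lru_cache(maxsize=None)
def grouped_entry_points() -> dict:
    mapping = collections.defaultdict(list)
    for dist in metadata.distributions():  # scans site-packages once
        for ep in dist.entry_points:
            mapping[ep.group].append((ep, dist))
    return mapping


# Repeated lookups hit the in-process cache instead of re-scanning the filesystem:
plugins = grouped_entry_points().get("airflow.plugins", [])
   ```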






[GitHub] [airflow] amoghrajesh commented on issue #29322: DAG list, sorting lost when switching page

2023-02-20 Thread via GitHub


amoghrajesh commented on issue #29322:
URL: https://github.com/apache/airflow/issues/29322#issuecomment-1437874203

   @bbovenzi I need some help understanding how to fix this issue. Please refer 
to my last comment





[GitHub] [airflow] amoghrajesh commented on issue #29621: Fix adding annotations for dag persistence PVC

2023-02-20 Thread via GitHub


amoghrajesh commented on issue #29621:
URL: https://github.com/apache/airflow/issues/29621#issuecomment-1437873701

   @potiuk we can close this issue. The PR is merged





[GitHub] [airflow] amoghrajesh commented on issue #28745: annotations in logs pvc

2023-02-20 Thread via GitHub


amoghrajesh commented on issue #28745:
URL: https://github.com/apache/airflow/issues/28745#issuecomment-1437873467

   @potiuk we can close this. It has been merged





[GitHub] [airflow] Lee-W opened a new pull request, #29649: Livy operator async (fixing failed test in PR 29047)

2023-02-20 Thread via GitHub


Lee-W opened a new pull request, #29649:
URL: https://github.com/apache/airflow/pull/29649

   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] seub commented on issue #25297: on_failure_callback is not called when task is terminated externally

2023-02-20 Thread via GitHub


seub commented on issue #25297:
URL: https://github.com/apache/airflow/issues/25297#issuecomment-1437822040

   @potiuk It's the same issue, and it's easy to reproduce: just interrupt a 
task manually (in the UI) and observe that `on_failure_callback` is not called. 
Please try and let me know if it's working for you or not.





[GitHub] [airflow] josh-fell commented on issue #29607: Status of testing Providers that were prepared on February 18, 2023

2023-02-20 Thread via GitHub


josh-fell commented on issue #29607:
URL: https://github.com/apache/airflow/issues/29607#issuecomment-1437817901

   #29565 looks good.
   
   Thanks for organizing @eladkal!





[GitHub] [airflow] joshuaghezzi commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


joshuaghezzi commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1437787575

   As an alternative and maybe simpler feature it'd also suit my use case if we 
could make it so that we are optionally able to "reset" the execution timer 
when the task resumes from deferral.  Does that sound more reasonable?





[GitHub] [airflow] hussein-awala commented on pull request #29498: add missing read for K8S config file from conn in deferred `KubernetesPodOperator`

2023-02-20 Thread via GitHub


hussein-awala commented on PR #29498:
URL: https://github.com/apache/airflow/pull/29498#issuecomment-1437703231

   > Hi! May I ask in which format you will pass the config file to the trigger? So it will be just a file passed as a parameter to the trigger? Or how?
   
   @VladaZakharova - Yes, I pass the file path and let the triggerer load it. Can you check my last commit?
   
   BTW, I am not sure if loading the config file from the env var `KUBECONFIG` is a good idea or not, because it's difficult to decide when we need to load it and when we don't.





[GitHub] [airflow] pgagnon commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


pgagnon commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112410526


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
         base_session = self.basic_session._session or botocore.session.get_session()
         client_creator = base_session.create_client
         federation = self.extra_config.get("assume_role_with_web_identity_federation")
-        if federation == "google":
-            web_identity_token_loader = self._get_google_identity_token_loader()
-        else:
-            raise AirflowException(
-                f'Unsupported federation: {federation}. Currently "google" only are supported.'
-            )
+
+        web_identity_token_loader = (
+            {
+                "file": self._get_file_token_loader,
+                "google": self._get_google_identity_token_loader,
+            }.get(federation)()
+            if type(federation) == str
+            else None
+        )

Review Comment:
   `botocore` is a low-level API, but it is not private. I'd be _very_ happy to 
work with you toward addressing your concerns about the implementation (really).






[GitHub] [airflow] s0neq commented on a diff in pull request #29635: YandexCloud provider: support Yandex SDK feature "endpoint"

2023-02-20 Thread via GitHub


s0neq commented on code in PR #29635:
URL: https://github.com/apache/airflow/pull/29635#discussion_r1112407914


##
airflow/providers/yandex/hooks/yandex.py:
##
@@ -122,7 +127,8 @@ def __init__(
         self.connection = self.get_connection(self.connection_id)
         self.extras = self.connection.extra_dejson
         credentials = self._get_credentials()
-        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), **credentials)
+        endpoint = self._get_field("endpoint", "api.cloud.yandex.net")
+        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), endpoint=endpoint, **credentials)

Review Comment:
   some documentation on authorization is in the README: https://github.com/yandex-cloud/python-sdk :)






[GitHub] [airflow] Taragolis commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


Taragolis commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112407307


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
         base_session = self.basic_session._session or botocore.session.get_session()
         client_creator = base_session.create_client
         federation = self.extra_config.get("assume_role_with_web_identity_federation")
-        if federation == "google":
-            web_identity_token_loader = self._get_google_identity_token_loader()
-        else:
-            raise AirflowException(
-                f'Unsupported federation: {federation}. Currently "google" only are supported.'
-            )
+
+        web_identity_token_loader = (
+            {
+                "file": self._get_file_token_loader,
+                "google": self._get_google_identity_token_loader,
+            }.get(federation)()
+            if type(federation) == str
+            else None
+        )

Review Comment:
   Using non-documented features of `botocore` and private components 🙄 What could possibly go wrong?
   
   I know at least `aiobotocore` and `PynamoDB` have had these problems in the past (or even nowadays), and I wonder why Airflow still doesn't.






[GitHub] [airflow] pgagnon commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


pgagnon commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112403111


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
         base_session = self.basic_session._session or botocore.session.get_session()
         client_creator = base_session.create_client
         federation = self.extra_config.get("assume_role_with_web_identity_federation")
-        if federation == "google":
-            web_identity_token_loader = self._get_google_identity_token_loader()
-        else:
-            raise AirflowException(
-                f'Unsupported federation: {federation}. Currently "google" only are supported.'
-            )
+
+        web_identity_token_loader = (
+            {
+                "file": self._get_file_token_loader,
+                "google": self._get_google_identity_token_loader,
+            }.get(federation)()
+            if type(federation) == str
+            else None
+        )

Review Comment:
   It still doesn't address the setting where you can't set external configs. 
The objective is to allow configuring connections that use 
`AssumeRoleWithWebIdentity` without relying on environment variables and AWS's 
configuration/shared credentials file.
   
   edit: (btw thank you for clarifying, I did misread your code snippet 
earlier.)






[GitHub] [airflow] Taragolis commented on a diff in pull request #29635: YandexCloud provider: support Yandex SDK feature "endpoint"

2023-02-20 Thread via GitHub


Taragolis commented on code in PR #29635:
URL: https://github.com/apache/airflow/pull/29635#discussion_r1112404383


##
airflow/providers/yandex/hooks/yandex.py:
##
@@ -122,7 +127,8 @@ def __init__(
         self.connection = self.get_connection(self.connection_id)
         self.extras = self.connection.extra_dejson
         credentials = self._get_credentials()
-        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), **credentials)
+        endpoint = self._get_field("endpoint", "api.cloud.yandex.net")
+        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), endpoint=endpoint, **credentials)

Review Comment:
   ```suggestion
        sdk_config = {}
        endpoint = self._get_field("endpoint", None)
        if endpoint:
            sdk_config["endpoint"] = endpoint
        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), **sdk_config, **credentials)
   ```
   
   I would avoid specifying a default value in the Airflow provider; if it changed in the Yandex Cloud SDK, it would break in the future.
   
   BTW, is that all of their [documentation](https://cloud.yandex.com/en/docs/functions/lang/python/sdk) for the Python SDK? 🙄 






[GitHub] [airflow] s0neq commented on a diff in pull request #29635: YandexCloud provider: support Yandex SDK feature "endpoint"

2023-02-20 Thread via GitHub


s0neq commented on code in PR #29635:
URL: https://github.com/apache/airflow/pull/29635#discussion_r1112403729


##
airflow/providers/yandex/hooks/yandex.py:
##
@@ -122,7 +127,8 @@ def __init__(
         self.connection = self.get_connection(self.connection_id)
         self.extras = self.connection.extra_dejson
         credentials = self._get_credentials()
-        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), **credentials)
+        endpoint = self._get_field("endpoint", False)
+        self.sdk = yandexcloud.SDK(user_agent=self.provider_user_agent(), endpoint=endpoint, **credentials)

Review Comment:
   > But it not missing in our case, we would provide `False` in this case and 
it would evaluate to this wrong value
   > 
   > ```python
   > kwargs_no_endpoint = {"foo": "bar"}
   > print(kwargs_no_endpoint.get("endpoint", "api.cloud.yandex.net"))
   > 
   > kwargs_none_endpoint = {"foo": "bar", "endpoint": None}
   > print(kwargs_none_endpoint.get("endpoint", "api.cloud.yandex.net"))
   > 
   > kwargs_false_endpoint = {"foo": "bar", "endpoint": False}
   > print(kwargs_false_endpoint.get("endpoint", "api.cloud.yandex.net"))
   > ```
   
   very true, thanks! Force-pushed the change; the default endpoint is now set in the provider.
   







[GitHub] [airflow] github-actions[bot] commented on issue #20267: can't delete airflow connection from CLI when it has bad fernet key

2023-02-20 Thread via GitHub


github-actions[bot] commented on issue #20267:
URL: https://github.com/apache/airflow/issues/20267#issuecomment-1437689864

   This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.





[GitHub] [airflow] github-actions[bot] commented on issue #20588: Health check path is not configured properly if AIRFLOW__WEBSERVER__BASE_URL is set via ENV variable

2023-02-20 Thread via GitHub


github-actions[bot] commented on issue #20588:
URL: https://github.com/apache/airflow/issues/20588#issuecomment-1437689838

   This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.





[GitHub] [airflow] github-actions[bot] commented on issue #20821: Airflow can not schedule task in new subDAG

2023-02-20 Thread via GitHub


github-actions[bot] commented on issue #20821:
URL: https://github.com/apache/airflow/issues/20821#issuecomment-1437689819

   This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.





[GitHub] [airflow] github-actions[bot] commented on issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

2023-02-20 Thread via GitHub


github-actions[bot] commented on issue #20949:
URL: https://github.com/apache/airflow/issues/20949#issuecomment-1437689787

   This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.





[GitHub] [airflow] Taragolis commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


Taragolis commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112397450


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
         base_session = self.basic_session._session or botocore.session.get_session()
         client_creator = base_session.create_client
         federation = self.extra_config.get("assume_role_with_web_identity_federation")
-        if federation == "google":
-            web_identity_token_loader = self._get_google_identity_token_loader()
-        else:
-            raise AirflowException(
-                f'Unsupported federation: {federation}. Currently "google" only are supported.'
-            )
+
+        web_identity_token_loader = (
+            {
+                "file": self._get_file_token_loader,
+                "google": self._get_google_identity_token_loader,
+            }.get(federation)()
+            if type(federation) == str
+            else None
+        )

Review Comment:
   > Unfortunately, this doesn't work out of the box for the vast majority of 
operators. Furthermore, this doesn't address the use case; there are many ways 
to obtain temporary credentials, but none currently allow configuring 
AssumeRoleWithWebIdentity without relying on external configs.
   
   It would work with all operators that require an AWS connection; the DAG I provided is a sample. In prod, all you need is:
   
   1. Set `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` in your environment.
   2. Allow `AWS_ROLE_ARN` to assume another role. This can be set up through AWS IAM, which does not require changing anything in the Airflow environment.
   3. Set up your connection with `{"role_arn": "your-required-role-here"}` in its extra (a sketch follows below).
   
   Repeat steps 2 and 3 for each new role you require.
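   A sketch of step 3 (the connection id and role ARN are placeholders; the JSON connection format shown requires Airflow 2.3+):
   
   ```python
# Sketch of step 3: an AWS connection whose extra carries the role to assume.
# The connection id and role ARN below are placeholders.
import json
import os

os.environ["AIRFLOW_CONN_AWS_ASSUMED"] = json.dumps(
    {
        "conn_type": "aws",
        "extra": {"role_arn": "arn:aws:iam::123456789012:role/your-required-role-here"},
    }
)
   ```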






[GitHub] [airflow] pgagnon commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


pgagnon commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112391709


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
         base_session = self.basic_session._session or botocore.session.get_session()
         client_creator = base_session.create_client
         federation = self.extra_config.get("assume_role_with_web_identity_federation")
-        if federation == "google":
-            web_identity_token_loader = self._get_google_identity_token_loader()
-        else:
-            raise AirflowException(
-                f'Unsupported federation: {federation}. Currently "google" only are supported.'
-            )
+
+        web_identity_token_loader = (
+            {
+                "file": self._get_file_token_loader,
+                "google": self._get_google_identity_token_loader,
+            }.get(federation)()
+            if type(federation) == str
+            else None
+        )

Review Comment:
   > Seems like it provides the ability to set a different web identity token file, doesn't it? 🤔
   
   Yes; I don't see a reason not to allow parameterization here. In fact, I 
think not doing so would be inconsistent and counter to the goal of allowing 
the feature to be fully configured through the connections subsystem.
   
   > I guess this option exists to allow creating your own AWS auth method without overwriting everything in the provider. If it is complicated, better to think about how to make it easier for end users.
   
   Custom classes will always only be used by our more sophisticated users. I 
do believe that they have their place in Airflow, but they shouldn't be 
considered a replacement for more accessible APIs, even if they functionally 
allow the same implementation.
   
   > Airflow performs [`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) in two steps
   > 
   > 1. Obtain an initial session either from direct credentials or via the `boto3` default strategy for obtaining credentials (environment variables, EC2 IAM profile, ECS execution role, etc.)
   > 2. Use the session obtained in step 1 to assume the role through the STS API [`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html).
   
   Unfortunately, this doesn't work out of the box for the vast majority of 
operators. Furthermore, this doesn't address the use case; there are many ways 
to obtain temporary credentials, but none currently allow configuring  
`AssumeRoleWithWebIdentity` without relying on external configs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] hussein-awala commented on issue #29621: Fix adding annotations for dag persistence PVC

2023-02-20 Thread via GitHub


hussein-awala commented on issue #29621:
URL: https://github.com/apache/airflow/issues/29621#issuecomment-1437660008

   @potiuk we can close it since #29622 is merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch revert-29408-docker-compose-change-example updated (9da3771744 -> 37aef76496)

2023-02-20 Thread taragolis
This is an automated email from the ASF dual-hosted git repository.

taragolis pushed a change to branch revert-29408-docker-compose-change-example
in repository https://gitbox.apache.org/repos/asf/airflow.git


 discard 9da3771744 Revert "Improve health checks in example docker-compose and 
clarify usage (#29408)"
 add 6ef5ba9104 Refactor Dataproc Trigger (#29364)
 add 5835b08e8b Adding possibility for annotations in logs pvc (#29270)
 add 901774718c Fix adding annotations for dag persistence PVC (#29622)
 add 3dbcf99d20 Update google cloud dlp package and adjust hook and 
operators (#29234)
 add f51742d20b Don't push secret in XCOM in 
BigQueryCreateDataTransferOperator (#29348)
 add 37aef76496 Revert "Improve health checks in example docker-compose and 
clarify usage (#29408)"

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (9da3771744)
\
 N -- N -- N   
refs/heads/revert-29408-docker-compose-change-example (37aef76496)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 airflow/providers/google/cloud/hooks/dlp.py| 373 ++---
 .../google/cloud/operators/bigquery_dts.py |   3 +
 .../providers/google/cloud/operators/dataproc.py   |   4 +-
 airflow/providers/google/cloud/operators/dlp.py|  63 ++--
 .../providers/google/cloud/triggers/dataproc.py| 191 ---
 airflow/providers/google/provider.yaml |   4 +-
 chart/templates/dags-persistent-volume-claim.yaml  |   6 +-
 chart/templates/logs-persistent-volume-claim.yaml  |   4 +
 chart/values.schema.json   |   8 +
 chart/values.yaml  |   4 +
 generated/provider_dependencies.json   |   2 +-
 tests/providers/google/cloud/hooks/test_dlp.py | 327 --
 .../google/cloud/operators/test_bigquery_dts.py|  10 +-
 tests/providers/google/cloud/operators/test_dlp.py |  41 ++-
 .../google/cloud/triggers/test_dataproc.py |   4 +-
 15 files changed, 638 insertions(+), 406 deletions(-)



[GitHub] [airflow] Taragolis commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


Taragolis commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112384223


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
 base_session = self.basic_session._session or 
botocore.session.get_session()
 client_creator = base_session.create_client
 federation = 
self.extra_config.get("assume_role_with_web_identity_federation")
-if federation == "google":
-web_identity_token_loader = 
self._get_google_identity_token_loader()
-else:
-raise AirflowException(
-f'Unsupported federation: {federation}. Currently "google" 
only are supported.'
-)
+
+web_identity_token_loader = (
+{
+"file": self._get_file_token_loader,
+"google": self._get_google_identity_token_loader,
+}.get(federation)()
+if type(federation) == str
+else None
+)

Review Comment:
   > The objective is more about using multiple role_arn than multiple tokens. 
It is widespread in practice for users not to have the ability to override 
config files (for instance, in corporate environments or on PaaS), so defining 
profiles in the AWS config or shared credentials file is not always practical.
   
   Seems like it provides the ability to set a different web identity token file, doesn't it? 🤔 
   
   ```python
   token_file = 
self.extra_config.get("assume_role_with_web_identity_token_file") or os.getenv(
   
AssumeRoleWithWebIdentityProvider._CONFIG_TO_ENV_VAR["web_identity_token_file"]
   )
   ```
   
   > This shares the same concern as point 1 regarding the inability to 
overwrite config files. Also, I think this option would be dismissed as too 
complex and burdensome by 99% of Airflow users.
   
   I guess this option exists to allow creating your own AWS auth method without overwriting everything in the provider. If it is complicated, better to think about how to make it easier for end users.
   
   > This does not permit configuration through the Airflow connections 
subsystem without relying on external configs, which are (1) not necessarily 
mutable and (2) more-or-less static.
   
   Airflow performs [`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) in two steps
   1. Obtain an initial session either from direct credentials or via the `boto3` default strategy for obtaining credentials (environment variables, EC2 IAM profile, ECS execution role, etc.)
   2. Use the session obtained in step 1 to assume the role through the STS API [`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html).
   
   So I can't see any problem with obtaining credentials via EKS IRSA or another SA (the `AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_SESSION_NAME` environment variables) and then assuming the required role.
   
   Here is a simple snippet DAG to check:
   
   ```python
   import os
   
   from airflow import DAG
   from airflow.decorators import task
   from airflow.models import Connection
   import pendulum
   
   with DAG(
   dag_id="test-irsa-role",
   start_date=pendulum.datetime(2023, 2, 1, tz="UTC"),
   schedule=None,
   catchup=False,
   tags=["aws", "assume-role", "29623"]
   ):
   @task
   def check_connection():
   os.environ["AWS_ROLE_ARN"] = 
"arn:aws:iam:::role/spam-egg-role"
   os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"] = 
"/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
   
   conn_id = "fake-conn-for-test"
   region_name = "eu-west-1"
   assumed_role = "arn:aws:iam:::role/foo-bar-role"
   
   assert assumed_role != os.environ["AWS_ROLE_ARN"]
   
   conn = Connection(
   conn_id=conn_id,
   conn_type="aws",
   extra={
   "role_arn": assumed_role,
   "region_name": region_name
   }
   )
   
   env_key = f"AIRFLOW_CONN_{conn.conn_id.upper()}"
   os.environ[env_key] = conn.get_uri()
   result, message = conn.test_connection()
   if result:
   return message
   
   raise PermissionError(message)
   
   check_connection()
   ```
   
   > This would be possible if boto3 allowed passing web_identity_token_file, 
role_arn, or role_session_name as parameters to clients or sessions, but sadly 
it does not at this point.
   
   This would be nice to raise with the [boto team](https://github.com/boto); maybe they could share why it is not implemented in their SDK. I guess some reason exists for that.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[airflow] branch main updated: Don't push secret in XCOM in BigQueryCreateDataTransferOperator (#29348)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new f51742d20b Don't push secret in XCOM in 
BigQueryCreateDataTransferOperator (#29348)
f51742d20b is described below

commit f51742d20b2e53bcd90a19db21e4e12d2a287677
Author: Pankaj Singh <98807258+pankajas...@users.noreply.github.com>
AuthorDate: Tue Feb 21 04:36:50 2023 +0530

Don't push secret in XCOM in BigQueryCreateDataTransferOperator (#29348)

* Don't push secret in xcom in BigQueryCreateDataTransferOperator
---
 airflow/providers/google/cloud/operators/bigquery_dts.py|  3 +++
 tests/providers/google/cloud/operators/test_bigquery_dts.py | 10 --
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/airflow/providers/google/cloud/operators/bigquery_dts.py 
b/airflow/providers/google/cloud/operators/bigquery_dts.py
index ee1a3b548b..d786c903e7 100644
--- a/airflow/providers/google/cloud/operators/bigquery_dts.py
+++ b/airflow/providers/google/cloud/operators/bigquery_dts.py
@@ -138,6 +138,9 @@ class BigQueryCreateDataTransferOperator(BaseOperator):
 result = TransferConfig.to_dict(response)
 self.log.info("Created DTS transfer config %s", get_object_id(result))
 self.xcom_push(context, key="transfer_config_id", 
value=get_object_id(result))
+# don't push AWS secret in XCOM
+result.get("params", {}).pop("secret_access_key", None)
+result.get("params", {}).pop("access_key_id", None)
 return result
 
 
diff --git a/tests/providers/google/cloud/operators/test_bigquery_dts.py 
b/tests/providers/google/cloud/operators/test_bigquery_dts.py
index 78c92d52ed..aa52169a77 100644
--- a/tests/providers/google/cloud/operators/test_bigquery_dts.py
+++ b/tests/providers/google/cloud/operators/test_bigquery_dts.py
@@ -46,12 +46,15 @@ TRANSFER_CONFIG_ID = "id1234"
 
 TRANSFER_CONFIG_NAME = "projects/123abc/locations/321cba/transferConfig/1a2b3c"
 RUN_NAME = "projects/123abc/locations/321cba/transferConfig/1a2b3c/runs/123"
+transfer_config = TransferConfig(
+name=TRANSFER_CONFIG_NAME, params={"secret_access_key": "AIRFLOW_KEY", 
"access_key_id": "AIRFLOW_KEY_ID"}
+)
 
 
 class BigQueryCreateDataTransferOperatorTestCase(unittest.TestCase):
 @mock.patch(
 
"airflow.providers.google.cloud.operators.bigquery_dts.BiqQueryDataTransferServiceHook",
-**{"return_value.create_transfer_config.return_value": 
TransferConfig(name=TRANSFER_CONFIG_NAME)},
+**{"return_value.create_transfer_config.return_value": 
transfer_config},
 )
 def test_execute(self, mock_hook):
 op = BigQueryCreateDataTransferOperator(
@@ -59,7 +62,7 @@ class 
BigQueryCreateDataTransferOperatorTestCase(unittest.TestCase):
 )
 ti = mock.MagicMock()
 
-op.execute({"ti": ti})
+return_value = op.execute({"ti": ti})
 
 mock_hook.return_value.create_transfer_config.assert_called_once_with(
 authorization_code=None,
@@ -71,6 +74,9 @@ class 
BigQueryCreateDataTransferOperatorTestCase(unittest.TestCase):
 )
 ti.xcom_push.assert_called_with(execution_date=None, 
key="transfer_config_id", value="1a2b3c")
 
+assert "secret_access_key" not in return_value.get("params", {})
+assert "access_key_id" not in return_value.get("params", {})
+
 
 class BigQueryDeleteDataTransferConfigOperatorTestCase(unittest.TestCase):
 
@mock.patch("airflow.providers.google.cloud.operators.bigquery_dts.BiqQueryDataTransferServiceHook")



[GitHub] [airflow] potiuk merged pull request #29348: Don't push secret in XCOM in BigQueryCreateDataTransferOperator

2023-02-20 Thread via GitHub


potiuk merged PR #29348:
URL: https://github.com/apache/airflow/pull/29348


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk closed issue #29209: BigQueryCreateDataTransferOperator will log AWS credentials when transferring from S3

2023-02-20 Thread via GitHub


potiuk closed issue #29209: BigQueryCreateDataTransferOperator will log AWS 
credentials when transferring from S3 
URL: https://github.com/apache/airflow/issues/29209


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


potiuk commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1437654137

   Maybe, but this would not be accurate - there are lots of nuances around when the task starts running and when it is already deferred (both bringing back and deferring tasks take time). Plus it would require DB changes to keep the total time of "execution". CC: @andrewgodwin WDYT?
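   
   For illustration, a minimal sketch of what "pausing" the clock would mean (the list of deferral intervals is hypothetical - Airflow does not record these transitions today, hence the DB changes):
   
   ```python
   import datetime
   
   
   def effective_runtime(start, end, deferred_intervals):
       """Wall-clock runtime minus the time spent deferred.
   
       deferred_intervals: hypothetical list of (deferred_at, resumed_at)
       pairs; no such record exists in the Airflow DB today.
       """
       paused = sum(
           (resumed - deferred for deferred, resumed in deferred_intervals),
           datetime.timedelta(),
       )
       return (end - start) - paused
   ```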


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated: Update google cloud dlp package and adjust hook and operators (#29234)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 3dbcf99d20 Update google cloud dlp package and adjust hook and 
operators (#29234)
3dbcf99d20 is described below

commit 3dbcf99d20d47cde0debdd5faf9bd9b2ebde1718
Author: Łukasz Wyszomirski 
AuthorDate: Tue Feb 21 00:01:13 2023 +0100

Update google cloud dlp package and adjust hook and operators (#29234)
---
 airflow/providers/google/cloud/hooks/dlp.py| 373 ++---
 airflow/providers/google/cloud/operators/dlp.py|  63 ++--
 airflow/providers/google/provider.yaml |   4 +-
 generated/provider_dependencies.json   |   2 +-
 tests/providers/google/cloud/hooks/test_dlp.py | 327 --
 tests/providers/google/cloud/operators/test_dlp.py |  41 ++-
 6 files changed, 534 insertions(+), 276 deletions(-)

diff --git a/airflow/providers/google/cloud/hooks/dlp.py 
b/airflow/providers/google/cloud/hooks/dlp.py
index cf3883516e..b7a71f5dd1 100644
--- a/airflow/providers/google/cloud/hooks/dlp.py
+++ b/airflow/providers/google/cloud/hooks/dlp.py
@@ -33,7 +33,7 @@ from typing import Sequence
 
 from google.api_core.gapic_v1.method import DEFAULT, _MethodDefault
 from google.api_core.retry import Retry
-from google.cloud.dlp_v2 import DlpServiceClient
+from google.cloud.dlp import DlpServiceClient
 from google.cloud.dlp_v2.types import (
 ByteContentItem,
 ContentItem,
@@ -41,7 +41,6 @@ from google.cloud.dlp_v2.types import (
 DeidentifyContentResponse,
 DeidentifyTemplate,
 DlpJob,
-FieldMask,
 InspectConfig,
 InspectContentResponse,
 InspectJobConfig,
@@ -55,6 +54,7 @@ from google.cloud.dlp_v2.types import (
 StoredInfoType,
 StoredInfoTypeConfig,
 )
+from google.protobuf.field_mask_pb2 import FieldMask
 
 from airflow.exceptions import AirflowException
 from airflow.providers.google.common.consts import CLIENT_INFO
@@ -101,7 +101,7 @@ class CloudDLPHook(GoogleBaseHook):
 delegate_to=delegate_to,
 impersonation_chain=impersonation_chain,
 )
-self._client = None
+self._client: DlpServiceClient | None = None
 
 def get_conn(self) -> DlpServiceClient:
 """
@@ -113,6 +113,15 @@ class CloudDLPHook(GoogleBaseHook):
 self._client = 
DlpServiceClient(credentials=self.get_credentials(), client_info=CLIENT_INFO)
 return self._client
 
+def _project_deidentify_template_path(self, project_id, template_id):
+return 
f"{DlpServiceClient.common_project_path(project_id)}/deidentifyTemplates/{template_id}"
+
+def _project_stored_info_type_path(self, project_id, info_type_id):
+return 
f"{DlpServiceClient.common_project_path(project_id)}/storedInfoTypes/{info_type_id}"
+
+def _project_inspect_template_path(self, project_id, inspect_template_id):
+return 
f"{DlpServiceClient.common_project_path(project_id)}/inspectTemplates/{inspect_template_id}"
+
 @GoogleBaseHook.fallback_to_default_project_id
 def cancel_dlp_job(
 self,
@@ -142,7 +151,14 @@ class CloudDLPHook(GoogleBaseHook):
 raise AirflowException("Please provide the ID of the DLP job 
resource to be cancelled.")
 
 name = DlpServiceClient.dlp_job_path(project_id, dlp_job_id)
-client.cancel_dlp_job(name=name, retry=retry, timeout=timeout, 
metadata=metadata)
+client.cancel_dlp_job(
+request=dict(
+name=name,
+),
+retry=retry,
+timeout=timeout,
+metadata=metadata,
+)
 
 def create_deidentify_template(
 self,
@@ -177,16 +193,18 @@ class CloudDLPHook(GoogleBaseHook):
 project_id = project_id or self.project_id
 
 if organization_id:
-parent = DlpServiceClient.organization_path(organization_id)
+parent = DlpServiceClient.common_organization_path(organization_id)
 elif project_id:
-parent = DlpServiceClient.project_path(project_id)
+parent = DlpServiceClient.common_project_path(project_id)
 else:
 raise AirflowException("Please provide either organization_id or 
project_id.")
 
 return client.create_deidentify_template(
-parent=parent,
-deidentify_template=deidentify_template,
-template_id=template_id,
+request=dict(
+parent=parent,
+deidentify_template=deidentify_template,
+template_id=template_id,
+),
 retry=retry,
 timeout=timeout,
 metadata=metadata,
@@ -227,12 +245,14 @@ class CloudDLPHook(GoogleBaseHook):
 """
 client = self.get_conn()
 
-parent = DlpServiceClient.project_path(project_id)
+parent = Dlp

[GitHub] [airflow] potiuk merged pull request #29234: Update google cloud dlp package and adjust hook and operators

2023-02-20 Thread via GitHub


potiuk merged PR #29234:
URL: https://github.com/apache/airflow/pull/29234


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29643: Add google-cloud-run python client to GCP requirements

2023-02-20 Thread via GitHub


potiuk commented on issue #29643:
URL: https://github.com/apache/airflow/issues/29643#issuecomment-1437649040

   Feel free to attempt it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (5835b08e8b -> 901774718c)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 5835b08e8b Adding possibility for annotations in logs pvc (#29270)
 add 901774718c Fix adding annotations for dag persistence PVC (#29622)

No new revisions were added by this update.

Summary of changes:
 chart/templates/dags-persistent-volume-claim.yaml | 6 +++---
 chart/values.yaml | 2 ++
 2 files changed, 5 insertions(+), 3 deletions(-)



[GitHub] [airflow] potiuk merged pull request #29622: Fix adding annotations for dag persistence PVC

2023-02-20 Thread via GitHub


potiuk merged PR #29622:
URL: https://github.com/apache/airflow/pull/29622


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated: Adding possibility for annotations in logs pvc (#29270)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 5835b08e8b Adding possibility for annotations in logs pvc (#29270)
5835b08e8b is described below

commit 5835b08e8bc3e11f4f98745266d10bbae510b258
Author: Amogh Desai 
AuthorDate: Tue Feb 21 04:27:35 2023 +0530

Adding possibility for annotations in logs pvc (#29270)

* Adding possibility for annotations in logs pvc


Co-authored-by: Amogh 
---
 chart/templates/logs-persistent-volume-claim.yaml | 4 
 chart/values.schema.json  | 8 
 chart/values.yaml | 2 ++
 3 files changed, 14 insertions(+)

diff --git a/chart/templates/logs-persistent-volume-claim.yaml 
b/chart/templates/logs-persistent-volume-claim.yaml
index deeaa2ef97..07374c1fd5 100644
--- a/chart/templates/logs-persistent-volume-claim.yaml
+++ b/chart/templates/logs-persistent-volume-claim.yaml
@@ -29,6 +29,10 @@ metadata:
 {{- with .Values.labels }}
 {{- toYaml . | nindent 4 }}
 {{- end }}
+  {{- with .Values.logs.persistence.annotations }}
+  annotations:
+{{- toYaml . | nindent 4 }}
+  {{- end }}
 spec:
   accessModes: ["ReadWriteMany"]
   resources:
diff --git a/chart/values.schema.json b/chart/values.schema.json
index db6d3b3fa9..12ab8a99b0 100644
--- a/chart/values.schema.json
+++ b/chart/values.schema.json
@@ -5491,6 +5491,14 @@
 ],
 "default": null
 },
+"annotations": {
+"description": "Annotations to add to logs PVC",
+"type": "object",
+"default": {},
+"additionalProperties": {
+"type": "string"
+}
+},
 "existingClaim": {
 "description": "The name of an existing PVC to 
use.",
 "type": [
diff --git a/chart/values.yaml b/chart/values.yaml
index ef59660773..b0c06bc086 100644
--- a/chart/values.yaml
+++ b/chart/values.yaml
@@ -1944,6 +1944,8 @@ logs:
 enabled: false
 # Volume size for logs
 size: 100Gi
+# Annotations for the logs PVC
+annotations: {}
 # If using a custom storageClass, pass name here
 storageClassName:
 ## the name of an existing PVC to use



[GitHub] [airflow] potiuk merged pull request #29270: Adding possibility for annotations in logs pvc

2023-02-20 Thread via GitHub


potiuk merged PR #29270:
URL: https://github.com/apache/airflow/pull/29270


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #28647: Intermitent log on deferrable operator

2023-02-20 Thread via GitHub


potiuk commented on issue #28647:
URL: https://github.com/apache/airflow/issues/28647#issuecomment-1437642318

   cc: @andrewgodwin - maybe that rings a bell for you?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #29094: Add `max_active_tis_per_dagrun` for Dynamic Task Mapping

2023-02-20 Thread via GitHub


potiuk commented on PR #29094:
URL: https://github.com/apache/airflow/pull/29094#issuecomment-1437640955

   I think this one is great - though I do not understand all the details of mapped tasks and concurrency there. @uranusjr @ashb - maybe you can take a look; sounds like a great feature of Dynamic Task Mapping.
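   
   For readers following along, a minimal sketch of what the PR proposes (the parameter name is taken from the PR title; treat it as pending until merged):
   
   ```python
   import pendulum
   
   from airflow import DAG
   from airflow.decorators import task
   
   with DAG(dag_id="mapped_limit_demo", start_date=pendulum.datetime(2023, 1, 1), schedule=None):
   
       @task(max_active_tis_per_dagrun=2)  # proposed cap on parallel mapped instances per dag run
       def double(x):
           return x * 2
   
       double.expand(x=[1, 2, 3, 4])
   ```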


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (ee0a56a2ca -> 6ef5ba9104)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from ee0a56a2ca Trigger Class migration to internal API (#29099)
 add 6ef5ba9104 Refactor Dataproc Trigger (#29364)

No new revisions were added by this update.

Summary of changes:
 .../providers/google/cloud/operators/dataproc.py   |   4 +-
 .../providers/google/cloud/triggers/dataproc.py| 191 -
 .../google/cloud/triggers/test_dataproc.py |   4 +-
 3 files changed, 74 insertions(+), 125 deletions(-)



[GitHub] [airflow] potiuk merged pull request #29364: Refactor Dataproc Trigger

2023-02-20 Thread via GitHub


potiuk merged PR #29364:
URL: https://github.com/apache/airflow/pull/29364


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #25060: Airflow scheduler crashes due to 'duplicate key value violates unique constraint "task_instance_pkey"'

2023-02-20 Thread via GitHub


potiuk commented on issue #25060:
URL: https://github.com/apache/airflow/issues/25060#issuecomment-1437625953

   @uranusjr - wasn't the "null" mapped task instance issue fixed since 2.4.3? I cannot find it easily.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] pankajastro commented on pull request #29348: Don't push secret in XCOM in BigQueryCreateDataTransferOperator

2023-02-20 Thread via GitHub


pankajastro commented on PR #29348:
URL: https://github.com/apache/airflow/pull/29348#issuecomment-1437626057

   > > And I guess we need to add tests for that, to avoid a situation where it somehow returns in the future.
   > 
   > Agree
   
   Added here 4eade3237e25f92095d74b76cd1f16f99945c842 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


boring-cyborg[bot] commented on issue #29648:
URL: https://github.com/apache/airflow/issues/29648#issuecomment-1437605958

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] joshuaghezzi opened a new issue, #29648: Optionally exclude task deferral time from overall runtime of the task

2023-02-20 Thread via GitHub


joshuaghezzi opened a new issue, #29648:
URL: https://github.com/apache/airflow/issues/29648

   ### Description
   
   Ability to optionally exclude the time that the task was deferred from the 
overall execution time of the task.
   
   ### Use case/motivation
   
   I've created a custom operator which defers execution for a specified amount 
of time when a specific exception occurs. The task that uses the operator has a 
timeout on it which is less than the duration of the deferral, so once the deferral resumes the task will time out. Instead of raising the timeout to account for the time the task could be deferred, which changes the intent of the timeout, I feel it'd be useful to optionally exclude the time the task was deferred. The execution time would essentially be paused while the task is deferred and would resume from where it was paused once the task resumes.
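   
   For context, a rough sketch of such an operator (`RetryLaterError` and `_do_work` are made-up names); today `execution_timeout` keeps counting while the task sits deferred:
   
   ```python
   import datetime
   
   from airflow.models.baseoperator import BaseOperator
   from airflow.triggers.temporal import TimeDeltaTrigger
   
   
   class RetryLaterError(Exception):
       """Hypothetical signal that the work should be retried later."""
   
   
   class DeferOnErrorOperator(BaseOperator):
       def execute(self, context):
           try:
               return self._do_work()
           except RetryLaterError:
               # Frees the worker slot, but execution_timeout keeps ticking
               # while the task is deferred -- the behaviour this issue targets.
               self.defer(
                   trigger=TimeDeltaTrigger(datetime.timedelta(hours=1)),
                   method_name="execute_complete",
               )
   
       def execute_complete(self, context, event=None):
           return self._do_work()
   
       def _do_work(self):
           ...  # placeholder for the real work
   ```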
   
   ### Related issues
   
   #19382 - brought expected behaviour in line with documentation.
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] pgagnon commented on a diff in pull request #29623: Implement file credentials provider for AWS hook AssumeRoleWithWebIdentity

2023-02-20 Thread via GitHub


pgagnon commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112360505


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
 base_session = self.basic_session._session or 
botocore.session.get_session()
 client_creator = base_session.create_client
 federation = 
self.extra_config.get("assume_role_with_web_identity_federation")
-if federation == "google":
-web_identity_token_loader = 
self._get_google_identity_token_loader()
-else:
-raise AirflowException(
-f'Unsupported federation: {federation}. Currently "google" 
only are supported.'
-)
+
+web_identity_token_loader = (
+{
+"file": self._get_file_token_loader,
+"google": self._get_google_identity_token_loader,
+}.get(federation)()
+if type(federation) == str
+else None
+)

Review Comment:
   @Taragolis I think two areas are being raised here. On the implementation, I 
am well aware that it can be improved, and I fully expected to go through a few 
rounds to get this in shape. If you have specific review comments, I would 
happily work with you on this. 🙂
   
   Regarding the alternatives that you have outlined, I have considered them 
previously, and I do not think they address the objective. Specifically:
   
   1. The objective is more about using multiple `role_arn` than multiple 
tokens. It is widespread in practice for users not to have the ability to 
override config files (for instance, in corporate environments or on PaaS), so 
defining profiles in the AWS config or shared credentials file is not always 
practical.
   2. This shares the same concern as point 1 regarding the inability to 
overwrite config files. Also, I think this option would be dismissed as too 
complex and burdensome by 99% of Airflow users.
   3. This does not permit configuration through the Airflow connections 
subsystem without relying on external configs, which are (1) not necessarily 
mutable and (2) more-or-less static. This would be possible if `boto3` allowed 
passing `web_identity_token_file`, `role_arn`, or `role_session_name` as 
parameters to clients or sessions, but sadly it does not at this point.
   
   This being said, I share your concern about the lack of system tests for 
connections, and this is something worthwhile to look at but probably out of 
the scope of this discussion.
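   
   For reference, the bare STS call that a connection-level configuration would have to wrap looks roughly like this (the ARN and token path are placeholders):
   
   ```python
   import boto3
   
   # The token lives in a file; boto3 exposes no session- or client-level
   # parameter to point an existing session at it, which is the gap above.
   with open("/var/run/secrets/eks.amazonaws.com/serviceaccount/token") as f:
       token = f.read()
   
   sts = boto3.client("sts")
   resp = sts.assume_role_with_web_identity(
       RoleArn="arn:aws:iam::123456789012:role/example-role",  # placeholder
       RoleSessionName="airflow",
       WebIdentityToken=token,
   )
   creds = resp["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken
   ```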



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29432: Jinja templating doesn't work with container_resources when using dymanic task mapping with Kubernetes Pod Operator

2023-02-20 Thread via GitHub


potiuk commented on issue #29432:
URL: https://github.com/apache/airflow/issues/29432#issuecomment-1437596856

   Done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29621: Fix adding annotations for dag persistence PVC

2023-02-20 Thread via GitHub


potiuk commented on issue #29621:
URL: https://github.com/apache/airflow/issues/29621#issuecomment-1437596195

   Feel free to fix them @amoghrajesh. Assigned you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29393: S3TaskHandler continuously returns "*** Falling back to local log" even if log_pos is provided when log not in s3

2023-02-20 Thread via GitHub


potiuk commented on issue #29393:
URL: https://github.com/apache/airflow/issues/29393#issuecomment-1437595099

   cc: @dstandish - maybe you can comment on how it changes in 2.6 (and whether 
the problem is going to be fixed there). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #29580: Allow to specify which connection, variable or config are being looked up in the backend using *_lookup_pattern parameters

2023-02-20 Thread via GitHub


potiuk commented on PR #29580:
URL: https://github.com/apache/airflow/pull/29580#issuecomment-1437592624

   Yep. I thought about it, and this seems like a non-obvious but great way of solving the "cost" problem connected with secret backends.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #29406: Add Fail Fast feature for DAGs

2023-02-20 Thread via GitHub


potiuk commented on PR #29406:
URL: https://github.com/apache/airflow/pull/29406#issuecomment-1437591152

   @ashb @uranusjr - can you think of any hidden side-effects? Seems like this implementation of "fail-fast" is simple and might just work - unless I missed something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on a diff in pull request #29195: Fixing Task Duration view in case of manual DAG runs only (#22015)

2023-02-20 Thread via GitHub


Taragolis commented on code in PR #29195:
URL: https://github.com/apache/airflow/pull/29195#discussion_r1112342301


##
airflow/models/dag.py:
##
@@ -112,6 +113,8 @@
 
 log = logging.getLogger(__name__)
 
+_T = TypeVar("_T")
+

Review Comment:
   ```suggestion
   ```



##
airflow/models/dag.py:
##
@@ -1534,6 +1540,8 @@ def get_task_instances(
 start_date = (timezone.utcnow() - timedelta(30)).replace(
 hour=0, minute=0, second=0, microsecond=0
 )
+
+print("=S=", start_date)

Review Comment:
   ```suggestion
   ```



##
airflow/models/dag.py:
##
@@ -43,6 +43,7 @@
 Iterator,
 List,
 Sequence,
+TypeVar,

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29406: Add Fail Fast feature for DAGs

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29406:
URL: https://github.com/apache/airflow/pull/29406#discussion_r1112342672


##
airflow/models/taskinstance.py:
##
@@ -172,6 +172,19 @@ def set_current_context(context: Context) -> 
Generator[Context, None, None]:
 )
 
 
+def stop_all_tasks_in_dag(tis: list[TaskInstance], session: Session, 
task_id_to_ignore: int):
+for ti in tis:
+if ti.task_id == task_id_to_ignore or ti.state in (
+TaskInstanceState.SUCCESS,
+TaskInstanceState.FAILED,
+):
+continue
+if ti.state == TaskInstanceState.RUNNING:
+ti.error(session)

Review Comment:
   Otherwise it will be quite magical



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29406: Add Fail Fast feature for DAGs

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29406:
URL: https://github.com/apache/airflow/pull/29406#discussion_r1112342504


##
airflow/models/taskinstance.py:
##
@@ -172,6 +172,19 @@ def set_current_context(context: Context) -> 
Generator[Context, None, None]:
 )
 
 
+def stop_all_tasks_in_dag(tis: list[TaskInstance], session: Session, 
task_id_to_ignore: int):
+for ti in tis:
+if ti.task_id == task_id_to_ignore or ti.state in (
+TaskInstanceState.SUCCESS,
+TaskInstanceState.FAILED,
+):
+continue
+if ti.state == TaskInstanceState.RUNNING:
+ti.error(session)

Review Comment:
   We should add some logging saying that we are doing it. 
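   
   Something like this, reusing the names from the diff (a sketch, not final wording):
   
   ```python
   from __future__ import annotations
   
   import logging
   
   from sqlalchemy.orm import Session
   
   from airflow.models.taskinstance import TaskInstance
   from airflow.utils.state import TaskInstanceState
   
   log = logging.getLogger(__name__)
   
   
   def stop_all_tasks_in_dag(tis: list[TaskInstance], session: Session, task_id_to_ignore: int):
       for ti in tis:
           if ti.task_id == task_id_to_ignore or ti.state in (
               TaskInstanceState.SUCCESS,
               TaskInstanceState.FAILED,
           ):
               continue
           if ti.state == TaskInstanceState.RUNNING:
               # Make the forced failure visible in the scheduler logs.
               log.info("Forcing task %s to fail because the dag is set to fail fast", ti.task_id)
               ti.error(session)
   ```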



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #29518: Google Providers - Fix _MethodDefault deepcopy failure

2023-02-20 Thread via GitHub


potiuk commented on PR #29518:
URL: https://github.com/apache/airflow/pull/29518#issuecomment-1437581881

   @IKholopov - can you please split this one into two PRs:
   
   1) introduce base class
   2) adding deepcopy
   
   That will be much better when it comes to looking at the changes historically. It's quite hard to find the deepcopy fix among all the changed base operators.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29518: Google Providers - Fix _MethodDefault deepcopy failure

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29518:
URL: https://github.com/apache/airflow/pull/29518#discussion_r1112340761


##
airflow/providers/google/cloud/operators/cloud_base.py:
##
@@ -0,0 +1,34 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains a Google API base operator."""
+from __future__ import annotations
+
+from google.api_core.gapic_v1.method import DEFAULT
+
+from airflow.models import BaseOperator
+
+
+class GoogleCloudBaseOperator(BaseOperator):
+"""
+Abstract base class that takes care of common specifics of the operators 
built
+on top of Google API client libraries.
+"""
+
+def __deepcopy__(self, memo):
+memo[id(DEFAULT)] = DEFAULT

Review Comment:
   (Link to some description/issue explaining it, because otherwise it is pretty magical.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29518: Google Providers - Fix _MethodDefault deepcopy failure

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29518:
URL: https://github.com/apache/airflow/pull/29518#discussion_r1112340285


##
airflow/providers/google/cloud/operators/cloud_base.py:
##
@@ -0,0 +1,34 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains a Google API base operator."""
+from __future__ import annotations
+
+from google.api_core.gapic_v1.method import DEFAULT
+
+from airflow.models import BaseOperator
+
+
+class GoogleCloudBaseOperator(BaseOperator):
+"""
+Abstract base class that takes care of common specifics of the operators 
built
+on top of Google API client libraries.
+"""
+
+def __deepcopy__(self, memo):
+memo[id(DEFAULT)] = DEFAULT

Review Comment:
   Would be nice to add an explanation of why we are doing it.
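   
   Something along these lines, for example - the trick relies on standard `copy.deepcopy` memo semantics:
   
   ```python
   import copy
   
   from google.api_core.gapic_v1.method import DEFAULT
   
   kwargs = {"retry": DEFAULT, "timeout": 30.0}
   
   # deepcopy consults memo[id(obj)] before copying anything, so pre-seeding
   # the memo guarantees the copy still holds the very same DEFAULT sentinel,
   # and identity comparisons inside the Google client keep working.
   copied = copy.deepcopy(kwargs, {id(DEFAULT): DEFAULT})
   assert copied["retry"] is DEFAULT
   ```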



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] EKami commented on issue #13311: pendulum.tz.zoneinfo.exceptions.InvalidTimezone

2023-02-20 Thread via GitHub


EKami commented on issue #13311:
URL: https://github.com/apache/airflow/issues/13311#issuecomment-1437579039

   Nope, sorry guys, I gave up on Airflow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis opened a new pull request, #29647: Revert "Improve health checks in example docker-compose and clarify usage"

2023-02-20 Thread via GitHub


Taragolis opened a new pull request, #29647:
URL: https://github.com/apache/airflow/pull/29647

   Just in case, to check whether the same error happens as described in #29616.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] 01/01: Revert "Improve health checks in example docker-compose and clarify usage (#29408)"

2023-02-20 Thread taragolis
This is an automated email from the ASF dual-hosted git repository.

taragolis pushed a commit to branch revert-29408-docker-compose-change-example
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9da37717449f5267b4ef8c4962cb89f6fa5f102c
Author: Andrey Anshin 
AuthorDate: Tue Feb 21 01:33:42 2023 +0400

Revert "Improve health checks in example docker-compose and clarify usage 
(#29408)"

This reverts commit 08bec93e5cb651c5180af13bad1c43d763c7583d.
---
 .../howto/docker-compose/docker-compose.yaml   | 40 ++
 1 file changed, 11 insertions(+), 29 deletions(-)

diff --git a/docs/apache-airflow/howto/docker-compose/docker-compose.yaml 
b/docs/apache-airflow/howto/docker-compose/docker-compose.yaml
index 719fffe8ab..38e8e8b471 100644
--- a/docs/apache-airflow/howto/docker-compose/docker-compose.yaml
+++ b/docs/apache-airflow/howto/docker-compose/docker-compose.yaml
@@ -36,15 +36,11 @@
 # _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if 
requested).
 #Default: airflow
 # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when 
starting all containers.
-#Use this option ONLY for quick checks. 
Installing requirements at container
-#startup is done EVERY TIME the service is 
started.
-#A better way is to build a custom image or 
extend the official image
-#as described in 
https://airflow.apache.org/docs/docker-stack/build.html.
 #Default: ''
 #
 # Feel free to modify this file to suit your needs.
 ---
-version: '3.8'
+version: '3'
 x-airflow-common:
   &airflow-common
   # In order to add custom dependencies or upgrade provider packages you can 
use your extended image.
@@ -64,13 +60,6 @@ x-airflow-common:
 AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
 AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
 AIRFLOW__API__AUTH_BACKENDS: 
'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
-# yamllint disable rule:line-length
-# Use simple http server on scheduler for health checks
-# See 
https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
-# yamllint enable rule:line-length
-AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
-# WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
-# for other purpose (development, test and especially production usage) 
build/extend Airflow image.
 _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
   volumes:
 - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
@@ -95,9 +84,8 @@ services:
   - postgres-db-volume:/var/lib/postgresql/data
 healthcheck:
   test: ["CMD", "pg_isready", "-U", "airflow"]
-  interval: 10s
+  interval: 5s
   retries: 5
-  start_period: 5s
 restart: always
 
   redis:
@@ -106,23 +94,21 @@ services:
   - 6379
 healthcheck:
   test: ["CMD", "redis-cli", "ping"]
-  interval: 10s
+  interval: 5s
   timeout: 30s
   retries: 50
-  start_period: 30s
 restart: always
 
   airflow-webserver:
 <<: *airflow-common
 command: webserver
 ports:
-  - "8080:8080"
+  - 8080:8080
 healthcheck:
   test: ["CMD", "curl", "--fail", "http://localhost:8080/health";]
-  interval: 30s
+  interval: 10s
   timeout: 10s
   retries: 5
-  start_period: 30s
 restart: always
 depends_on:
   <<: *airflow-common-depends-on
@@ -133,11 +119,10 @@ services:
 <<: *airflow-common
 command: scheduler
 healthcheck:
-  test: ["CMD", "curl", "--fail", "http://localhost:8974/health";]
-  interval: 30s
+  test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob 
--hostname "$${HOSTNAME}"']
+  interval: 10s
   timeout: 10s
   retries: 5
-  start_period: 30s
 restart: always
 depends_on:
   <<: *airflow-common-depends-on
@@ -151,10 +136,9 @@ services:
   test:
 - "CMD-SHELL"
 - 'celery --app airflow.executors.celery_executor.app inspect ping -d 
"celery@$${HOSTNAME}"'
-  interval: 30s
+  interval: 10s
   timeout: 10s
   retries: 5
-  start_period: 30s
 environment:
   <<: *airflow-common-env
   # Required to handle warm shutdown of the celery workers properly
@@ -171,10 +155,9 @@ services:
 command: triggerer
 healthcheck:
   test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob 
--hostname "$${HOSTNAME}"']
-  interval: 30s
+  interval: 10s
   timeout: 10s
   retries: 5
-  start_period: 30s
 restart: always
 depends_on:
   <<: *airflow-common-depends-on
@@ -281,13 +264,12 @@ services:
 profiles:
   - flower
 ports:
-  - ":"
+  -

[airflow] branch revert-29408-docker-compose-change-example created (now 9da3771744)

2023-02-20 Thread taragolis
This is an automated email from the ASF dual-hosted git repository.

taragolis pushed a change to branch revert-29408-docker-compose-change-example
in repository https://gitbox.apache.org/repos/asf/airflow.git


  at 9da3771744 Revert "Improve health checks in example docker-compose and 
clarify usage (#29408)"

This branch includes the following new commits:

 new 9da3771744 Revert "Improve health checks in example docker-compose and 
clarify usage (#29408)"

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.




[GitHub] [airflow] raphaelauv commented on issue #29432: Jinja templating doesn't work with container_resources when using dymanic task mapping with Kubernetes Pod Operator

2023-02-20 Thread via GitHub


raphaelauv commented on issue #29432:
URL: https://github.com/apache/airflow/issues/29432#issuecomment-1437573499

   @potiuk could we add this one to 
https://github.com/apache/airflow/milestone/68 ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] hussein-awala commented on issue #29432: Jinja templating doesn't work with container_resources when using dymanic task mapping with Kubernetes Pod Operator

2023-02-20 Thread via GitHub


hussein-awala commented on issue #29432:
URL: https://github.com/apache/airflow/issues/29432#issuecomment-1437570393

   @jose-lpa using `Variable.get()` in the dag script is not recommended because the `DagFileProcessor` processes the script every X minutes, and each time it loads this variable from the DB. Also, that solution works with Variables but not with all the other Jinja templates.
   
   @pshrivastava27 here is a solution for your need:
   ```python
   import datetime
   import os
   
   from airflow import models
   from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
   KubernetesPodOperator,
   )
   from kubernetes.client import models as k8s_models
   
   dvt_image = os.environ.get("DVT_IMAGE", "dev")
   
   default_dag_args = {"start_date": datetime.datetime(2022, 1, 1)}
   
   
   class PatchedResourceRequirements(k8s_models.V1ResourceRequirements):
   template_fields = ("limits", "requests")
   
   
   def pod_mem():
   return "4000M"
   
   
   def pod_cpu():
   return "1000m"
   
   
   with models.DAG(
   "sample_dag",
   schedule_interval=None,
   default_args=default_dag_args,
   render_template_as_native_obj=True,
   user_defined_macros={
   "pod_mem": pod_mem,
   "pod_cpu": pod_cpu,
   },
   ) as dag:
   
   task_1 = KubernetesPodOperator(
   task_id="task_1",
   name="task_1",
   namespace="default",
   image=dvt_image,
   cmds=["bash", "-cx"],
   arguments=["echo hello"],
   service_account_name="sa-k8s",
   container_resources=PatchedResourceRequirements(
   limits={
   "memory": "{{ pod_mem() }}",
   "cpu": "{{ pod_cpu() }}",
   }
   ),
   startup_timeout_seconds=1800,
   get_logs=True,
   image_pull_policy="Always",
   config_file="/home/airflow/composer_kube_config",
   dag=dag,
   )
   
   task_2 = KubernetesPodOperator.partial(
   task_id="task_2",
   name="task_2",
   namespace="default",
   image=dvt_image,
   cmds=["bash", "-cx"],
   service_account_name="sa-k8s",
   container_resources=PatchedResourceRequirements(
   limits={
   "memory": "{{ pod_mem() }}",
   "cpu": "{{ pod_cpu() }}",
   }
   ),
   startup_timeout_seconds=1800,
   get_logs=True,
   image_pull_policy="Always",
   config_file="/home/airflow/composer_kube_config",
   dag=dag,
   ).expand(arguments=[["echo hello"]])
   
   task_1 >> task_2
   ``` 
   
   You can do the same for the other classes if needed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #29315: AIP-44 Migrate BaseJob.heartbeat to InternalAPI

2023-02-20 Thread via GitHub


potiuk commented on issue #29315:
URL: https://github.com/apache/airflow/issues/29315#issuecomment-1437565241

   Assigned you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (69babdcf74 -> ee0a56a2ca)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 69babdcf74 Use provided `USE_AIRFLOW_VERSION` env in verify providers 
step (#29081)
 add ee0a56a2ca Trigger Class migration to internal API (#29099)

No new revisions were added by this update.

Summary of changes:
 airflow/api_internal/endpoints/rpc_api_endpoint.py |  9 -
 airflow/models/trigger.py  | 10 +-
 2 files changed, 17 insertions(+), 2 deletions(-)



[GitHub] [airflow] potiuk closed issue #28613: AIP-44 Migrate Trigger class to Internal API

2023-02-20 Thread via GitHub


potiuk closed issue #28613: AIP-44 Migrate Trigger class to Internal API
URL: https://github.com/apache/airflow/issues/28613


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #29099: Trigger Class migration to internal API

2023-02-20 Thread via GitHub


potiuk merged PR #29099:
URL: https://github.com/apache/airflow/pull/29099


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #29348: Don't push secret in XCOM in BigQueryCreateDataTransferOperator

2023-02-20 Thread via GitHub


potiuk commented on PR #29348:
URL: https://github.com/apache/airflow/pull/29348#issuecomment-1437563814

   > And I guess we need to add tests for that, to avoid a situation where it 
somehow returns in the future.
   
   Agree


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (54c4d590ca -> 69babdcf74)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 54c4d590ca fix clear dag run openapi spec responses by adding 
additional return type (#29600)
 add 69babdcf74 Use provided `USE_AIRFLOW_VERSION` env in verify providers 
step (#29081)

No new revisions were added by this update.

Summary of changes:
 scripts/in_container/verify_providers.py | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)



[GitHub] [airflow] potiuk merged pull request #29081: Use provided `USE_AIRFLOW_VERSION` env in verify providers step

2023-02-20 Thread via GitHub


potiuk merged PR #29081:
URL: https://github.com/apache/airflow/pull/29081


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on issue #29640: NoBoundaryInMultipartDefect raised using S3Hook

2023-02-20 Thread via GitHub


Taragolis commented on issue #29640:
URL: https://github.com/apache/airflow/issues/29640#issuecomment-1437551366

   Just for confirmation: do you have the same problem with the same versions 
of `boto3` and `botocore` if you call 
[S3.Client.download_file](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_file)
 or 
[S3.Client.download_fileobj](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_fileobj)
 on this file?
   
   Could you try to run this instead of your code?
   
   ```python
   from airflow.providers.amazon.aws.hooks.s3 import S3Hook
   
   
   def download_from_s3_native(key: str, bucket_name: str, local_path: str) -> str:
       hook = S3Hook(aws_conn_id="s3_conn")
       s3_client = hook.conn  # the underlying boto3 S3 client
       with open(local_path, "wb") as data:
           # boto3 expects the arguments in (Bucket, Key, Fileobj) order.
           s3_client.download_fileobj(bucket_name, key, data)
       return local_path
   ```
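   
   As a side note - a sketch assuming the provider's `S3Hook.download_file` helper is available in your version - the hook can also perform the download itself:
   
   ```python
   from airflow.providers.amazon.aws.hooks.s3 import S3Hook
   
   
   def download_from_s3_hook(key: str, bucket_name: str) -> str:
       hook = S3Hook(aws_conn_id="s3_conn")
       # Downloads the object to a temporary file and returns its local path.
       return hook.download_file(key=key, bucket_name=bucket_name)
   ```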


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29495: Change permissions of config/password files created by airflow

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29495:
URL: https://github.com/apache/airflow/pull/29495#discussion_r1112322251


##
airflow/configuration.py:
##
@@ -1545,6 +1548,12 @@ def initialize_config() -> AirflowConfigParser:
 return local_conf
 
 
+def make_group_other_inaccessible(file_path: str):
+    with suppress(Exception):
+        permissions = os.stat(file_path)
+        os.chmod(file_path, permissions.st_mode & (stat.S_IRUSR | stat.S_IWUSR))

Review Comment:
   Changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #29495: Change permissions of config/password files created by airflow

2023-02-20 Thread via GitHub


potiuk commented on code in PR #29495:
URL: https://github.com/apache/airflow/pull/29495#discussion_r1112320830


##
airflow/configuration.py:
##
@@ -1545,6 +1548,12 @@ def initialize_config() -> AirflowConfigParser:
 return local_conf
 
 
+def make_group_other_inaccessible(file_path: str):
+    with suppress(Exception):
+        permissions = os.stat(file_path)
+        os.chmod(file_path, permissions.st_mode & (stat.S_IRUSR | stat.S_IWUSR))

Review Comment:
   I'd say log an error, not raise it. There might be various reasons why 
changing permissions is not possible - for example when the filesystem the 
config file gets generated on happens to be NFS. Failing hard is kinda strange 
at this point, as the file has already been generated (so the next time we run 
airflow it will succeed, because the whole branch is skipped in that case).
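   
   Something like this is what I have in mind (a hypothetical sketch, just to illustrate logging instead of raising or suppressing):
   
   ```python
   import logging
   import os
   import stat
   
   log = logging.getLogger(__name__)
   
   
   def make_group_other_inaccessible(file_path: str):
       try:
           permissions = os.stat(file_path)
           os.chmod(file_path, permissions.st_mode & (stat.S_IRUSR | stat.S_IWUSR))
       except OSError:
           # Best-effort: the file has already been generated, so keep going.
           log.exception("Could not restrict permissions of %s to owner-only", file_path)
   ```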



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on issue #29640: NoBoundaryInMultipartDefect raised using S3Hook

2023-02-20 Thread via GitHub


Taragolis commented on issue #29640:
URL: https://github.com/apache/airflow/issues/29640#issuecomment-1437543247

   > shouldn't be such an exception :)
   
   `¯\_(ツ)_/¯` https://github.com/boto/botocore/issues/2608


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] steren commented on a diff in pull request #28525: Add CloudRunExecuteJobOperator

2023-02-20 Thread via GitHub


steren commented on code in PR #28525:
URL: https://github.com/apache/airflow/pull/28525#discussion_r1112314074


##
airflow/providers/google/cloud/operators/cloud_run.py:
##
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains Google Cloud Run Jobs operators."""
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Sequence
+
+from airflow.models import BaseOperator
+from airflow.providers.google.cloud.hooks.cloud_run import CloudRunJobHook
+from airflow.providers.google.cloud.links.cloud_run import 
CloudRunJobExecutionLink
+
+if TYPE_CHECKING:
+from airflow.utils.context import Context
+
+
+class CloudRunExecuteJobOperator(BaseOperator):
+"""
+Executes an existing Cloud Run job.
+
+.. seealso::
+For more information on how to use this operator, take a look at the 
guide:
+:ref:`howto/operator:CloudRunExecuteJobOperator`
+
+:param job_name: The name of the cloud run job to execute
+:param region: The region of the Cloud Run job (for example europe-west1)
+:param project_id: The ID of the GCP project that owns the job.
+If set to ``None`` or missing, the default project_id
+from the GCP connection is used.
+:param gcp_conn_id: The connection ID to use to connect to Google Cloud.
+:param delegate_to: The account to impersonate, if any.
+For this to work, the service account making the request
+must have domain-wide delegation enabled.
+:param wait_until_finished: If True, wait for the end of job execution
+before exiting. If False (default), only submits job.
+:param impersonation_chain: Optional service account to impersonate
+using short-term credentials, or chained list of accounts required
+to get the access_token of the last account in the list,
+which will be impersonated in the request. If set as a string, the
+account must grant the originating account the Service Account 
Token
+Creator IAM role. If set as a sequence, the identities from the 
list
+must grant Service Account Token Creator IAM role to the directly
+preceding identity, with first account from the list granting this
+role to the originating account (templated).
+"""
+
+template_fields: Sequence[str] = ("job_name", "region", "project_id", 
"gcp_conn_id")
+operator_extra_links = (CloudRunJobExecutionLink(),)
+
+def __init__(
+self,
+job_name: str,

Review Comment:
   Maybe this could be made optional?
   I assume Airflow users don't necessarily want to name the created job and 
could be fine with an auto-generated job name.
   
   I'm not sure how Airflow works for other operators. It's up to you.



##
airflow/providers/google/cloud/hooks/cloud_run.py:
##
@@ -0,0 +1,245 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains a Google Cloud Run Hook."""
+from __future__ import annotations
+
+import json
+import time
+from typing import Any, Callable, Dict, List, Sequence, Union, cast
+
+from google.api_core.client_options import ClientOptions
+from googleapiclient.discovery import build
+
+from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+DEFAULT_CLOUD_RUN_REGION = "us-central1"
+
+
+class CloudRunJobSteps:
+"

[GitHub] [airflow] steren commented on a diff in pull request #28525: Add CloudRunExecuteJobOperator

2023-02-20 Thread via GitHub


steren commented on code in PR #28525:
URL: https://github.com/apache/airflow/pull/28525#discussion_r1112314074


##
airflow/providers/google/cloud/operators/cloud_run.py:
##
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains Google Cloud Run Jobs operators."""
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Sequence
+
+from airflow.models import BaseOperator
+from airflow.providers.google.cloud.hooks.cloud_run import CloudRunJobHook
+from airflow.providers.google.cloud.links.cloud_run import 
CloudRunJobExecutionLink
+
+if TYPE_CHECKING:
+from airflow.utils.context import Context
+
+
+class CloudRunExecuteJobOperator(BaseOperator):
+"""
+Executes an existing Cloud Run job.
+
+.. seealso::
+For more information on how to use this operator, take a look at the 
guide:
+:ref:`howto/operator:CloudRunExecuteJobOperator`
+
+:param job_name: The name of the cloud run job to execute
+:param region: The region of the Cloud Run job (for example europe-west1)
+:param project_id: The ID of the GCP project that owns the job.
+If set to ``None`` or missing, the default project_id
+from the GCP connection is used.
+:param gcp_conn_id: The connection ID to use to connect to Google Cloud.
+:param delegate_to: The account to impersonate, if any.
+For this to work, the service account making the request
+must have domain-wide delegation enabled.
+:param wait_until_finished: If True, wait for the end of job execution
+before exiting. If False (default), only submits job.
+:param impersonation_chain: Optional service account to impersonate
+using short-term credentials, or chained list of accounts required
+to get the access_token of the last account in the list,
+which will be impersonated in the request. If set as a string, the
+account must grant the originating account the Service Account 
Token
+Creator IAM role. If set as a sequence, the identities from the 
list
+must grant Service Account Token Creator IAM role to the directly
+preceding identity, with first account from the list granting this
+role to the originating account (templated).
+"""
+
+template_fields: Sequence[str] = ("job_name", "region", "project_id", 
"gcp_conn_id")
+operator_extra_links = (CloudRunJobExecutionLink(),)
+
+def __init__(
+self,
+job_name: str,

Review Comment:
   Maybe name could be optional? 
   
   I assume Airflow users don't necessarily want to name the created job and 
could be fine with an auto-generated job name.
   
   I'm not sure how Airflow works for other operators. It's up to you.
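   
   A hypothetical sketch of that suggestion (the `uuid`-based fallback is an assumption for illustration, not what the PR currently does):
   
   ```python
   from __future__ import annotations
   
   import uuid
   
   from airflow.models import BaseOperator
   
   
   class CloudRunExecuteJobOperator(BaseOperator):
       def __init__(self, *, job_name: str | None = None, **kwargs) -> None:
           super().__init__(**kwargs)
           # Fall back to a generated, reasonably unique job name when none is given.
           self.job_name = job_name or f"airflow-job-{uuid.uuid4().hex[:8]}"
   ```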



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on pull request #29489: Fix failing multiple-output-inference tests for Python 3.8+

2023-02-20 Thread via GitHub


Taragolis commented on PR #29489:
URL: https://github.com/apache/airflow/pull/29489#issuecomment-1437535260

   > Then we will fix it there :)
   
   That's why I mentioned this PR - it might save some time when the time 
comes for Python 3.11.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated: fix clear dag run openapi spec responses by adding additional return type (#29600)

2023-02-20 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 54c4d590ca fix clear dag run openapi spec responses by adding 
additional return type (#29600)
54c4d590ca is described below

commit 54c4d590caa9399f9fb331811531e0ea8d56aa41
Author: Alexander Liotta 
AuthorDate: Mon Feb 20 15:53:17 2023 -0500

fix clear dag run openapi spec responses by adding additional return type 
(#29600)
---
 airflow/api_connexion/openapi/v1.yaml        | 4 +++-
 airflow/www/static/js/types/api-generated.ts | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/airflow/api_connexion/openapi/v1.yaml 
b/airflow/api_connexion/openapi/v1.yaml
index f64cd7c8c8..505d2b6439 100644
--- a/airflow/api_connexion/openapi/v1.yaml
+++ b/airflow/api_connexion/openapi/v1.yaml
@@ -890,7 +890,9 @@ paths:
           content:
             application/json:
               schema:
-                $ref: '#/components/schemas/DAGRun'
+                anyOf:
+                  - $ref: '#/components/schemas/DAGRun'
+                  - $ref: '#/components/schemas/TaskInstanceCollection'
         '400':
           $ref: '#/components/responses/BadRequest'
 '401':
diff --git a/airflow/www/static/js/types/api-generated.ts 
b/airflow/www/static/js/types/api-generated.ts
index 7228d5fdf9..e206bc1ba4 100644
--- a/airflow/www/static/js/types/api-generated.ts
+++ b/airflow/www/static/js/types/api-generated.ts
@@ -3002,7 +3002,8 @@ export interface operations {
       /** Success. */
       200: {
         content: {
-          "application/json": components["schemas"]["DAGRun"];
+          "application/json": Partial<components["schemas"]["DAGRun"]> &
+            Partial<components["schemas"]["TaskInstanceCollection"]>;
         };
       };
       400: components["responses"]["BadRequest"];


