[GitHub] [airflow] uranusjr opened a new pull request, #28982: Resolve all variables in pickled XCom iterator
uranusjr opened a new pull request, #28982: URL: https://github.com/apache/airflow/pull/28982 This is needed for cross-process communication. See the discussion starting here: https://github.com/apache/airflow/pull/28191#issuecomment-1384931149 One thing I’m not sure about is the test; testing against a raw SQL string feels very brittle and can break from engine to engine, or across SQLAlchemy versions. But I can’t think of a better way.
[GitHub] [airflow] uranusjr commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
uranusjr commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384977262 Only when you want to pass results of a dynamic task into a subprocess (most commonly virtualenv, Docker, and Kubernetes, I think).
[GitHub] [airflow] potiuk commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
potiuk commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384945654 > Edit: Found it. We need literal_binds. https://docs.sqlalchemy.org/en/14/faq/sqlexpressions.html#rendering-bound-parameters-inline Yeah. That's cool. Just wondering - is the current form happening often? Or only when dynamic tasks are used in a specific way (it was not obvious to me when reading the description/error)? I think the answer to that question might determine whether 2.5.1rc1 is good to release or not.
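For context, a minimal sketch of the `literal_binds` approach from the linked FAQ entry; `query` and `session` are illustrative stand-ins for whatever statement and session the caller holds, and SQLAlchemy can only inline simple literal types this way:

```python
# Render bound parameters inline so the SQL string is self-contained.
compiled = query.statement.compile(
    dialect=session.get_bind().dialect,
    compile_kwargs={"literal_binds": True},
)
sql_with_values = str(compiled)  # placeholders replaced by literal values
```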
[GitHub] [airflow] uranusjr commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
uranusjr commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384938029 Yes it is. The SQLAlchemy docs say that should give us fully compiled SQL, but apparently it does not. I don’t know why (or _how_ we can produce fully compiled SQL).
[GitHub] [airflow] potiuk commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
potiuk commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384934338 Isn't that SQL automatically generated by `query.statement.compile` from an ORM mapping? At least that was my impression of what happens here.
[GitHub] [airflow] uranusjr commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
uranusjr commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384931149 Actually I think this is something simpler and unrelated. With SQLite I get

```
INFO - sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Incorrect number of bindings supplied. The current statement uses 4, and there are 0 supplied.
INFO - [SQL: SELECT xcom.value
INFO - FROM xcom JOIN dag_run ON xcom.dag_run_id = dag_run.id
INFO - WHERE xcom."key" = ? AND xcom.task_id = ? AND xcom.dag_id = ? AND xcom.run_id = ? ORDER BY xcom.map_index ASC]
```

Looks like a simple SQL error. Please open a new issue with a better description.
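The failure is reproducible outside Airflow; a minimal sketch with plain `sqlite3` showing what happens when SQL compiled with placeholders is executed without its parameter values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xcom (key TEXT, task_id TEXT)")

sql = 'SELECT * FROM xcom WHERE xcom."key" = ? AND xcom.task_id = ?'
conn.execute(sql, ("return_value", "t1"))  # fine: both bindings supplied
conn.execute(sql)  # sqlite3.ProgrammingError: Incorrect number of bindings supplied
```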
[GitHub] [airflow] potiuk commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
potiuk commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384921406 cc: @pierrejeambrun @ephraimbuddy -> potential blocker for 2.5.1
[GitHub] [airflow] potiuk commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
potiuk commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384921032 cc: @uranusjr -> seems that MySQL hits us back again. I am not 100% sure whether this is a blocker for 2.5.1 - but it might be if it turns out to affect all MySQL users with dynamic task mapping (it certainly has the potential to impact them).
[GitHub] [airflow] hterik commented on a diff in pull request #26639: Multi-threads support for processing diff queues in Kubernetes Executor
hterik commented on code in PR #26639: URL: https://github.com/apache/airflow/pull/26639#discussion_r1071805534

## airflow/executors/kubernetes_executor.py: ##

```
@@ -634,13 +690,37 @@ def sync(self) -> None:
 self.log.debug("self.queued: %s", self.queued_tasks)
 self.kube_scheduler.sync()
-last_resource_version: dict[str, str] = defaultdict(lambda: "0")
-while True:
+"""processing result queue"""
+self.last_resource_version = defaultdict(lambda: "0")
+multi_threads_queue_process(
```

Review Comment: multi_threads_queue_process will start and stop multiple threads on each sync() tick. I'm a bit out of touch on the real-world overhead of this, but I've always been taught that starting a thread comes with a lot of overhead. Can one use a fixed ThreadPool instead? This would also make the batching and queueing logic a lot easier, as dstandish suggested [above](https://github.com/apache/airflow/pull/26639/files#r1036345333)
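A sketch of the fixed-pool alternative described here, using only the standard library; `process_method` and the batch sizes are illustrative stand-ins, not the PR's actual code:

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Created once (e.g. in the executor's start()), reused for every sync() tick,
# so no threads are started or stopped per tick.
pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="kube-executor")

def sync_tick(process_method, sub_batch_sizes: list[int]) -> None:
    # Submit this tick's batches to the long-lived pool and wait for them.
    futures = [pool.submit(process_method, size) for size in sub_batch_sizes]
    wait(futures)
```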
[GitHub] [airflow] hterik commented on a diff in pull request #26639: Multi-threads support for processing diff queues in Kubernetes Executor
hterik commented on code in PR #26639: URL: https://github.com/apache/airflow/pull/26639#discussion_r1071796755

## airflow/executors/kubernetes_executor.py: ##

```
@@ -599,8 +655,31 @@ def sync(self) -> None:
 raise AirflowException(NOT_STARTED_MESSAGE)
 self.kube_scheduler.sync()
-last_resource_version = None
-while True:
+"""processing result queue"""
+multi_threads_queue_process(
+    queue_size=self.result_queue.qsize(),
+    queue_type='result',
+    process_method=self.process_result_queue,
```

Review Comment: Today they might only use the `[]` and `in` operators, which in isolation are thread-safe, but tomorrow someone might add a for-loop or other side effects, giving you a "changed size during iteration" error. It is a lot of complexity to keep in mind if this can be modified from several threads at the same time. I'm not 100% sure, but I believe this piece would potentially break, https://github.com/apache/airflow/blob/65010fda091242870a410c65478eae362899763b/airflow/executors/base_executor.py#L334-L336, if some thread modifies event_buffer while another thread is iterating the loop, because it takes a snapshot of the keys before starting to pop them off. My suggestion is to isolate the multithreaded operations to more pure functions, then return their results on a queue which is consumed in the same main thread that handles it today. Otherwise this could benefit from more documentation on which member properties must be protected by locks, or what type of operations are allowed in each function.
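A sketch of the isolation pattern suggested in the last paragraph: threads only compute and enqueue results, while a single main thread applies every mutation to shared state (all names below are illustrative):

```python
import queue
from concurrent.futures import ThreadPoolExecutor, wait

results: "queue.Queue[tuple[str, str]]" = queue.Queue()
event_buffer: dict[str, str] = {}

def process(item: str) -> tuple[str, str]:
    return item, "done"  # stand-in for the real per-item work

def worker(item: str) -> None:
    results.put(process(item))  # threads compute and enqueue only

def sync(items: list[str], pool: ThreadPoolExecutor) -> None:
    wait([pool.submit(worker, item) for item in items])
    while not results.empty():  # safe: only this thread consumes and mutates
        key, state = results.get()
        event_buffer[key] = state
```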
[GitHub] [airflow] potiuk commented on a diff in pull request #28979: Fix rendering parameters in PapermillOperator
potiuk commented on code in PR #28979: URL: https://github.com/apache/airflow/pull/28979#discussion_r1071795648

## airflow/providers/papermill/operators/papermill.py: ##

```
@@ -57,36 +59,43 @@ class PapermillOperator(BaseOperator):
 def __init__(
     self,
     *,
-    input_nb: str | None = None,
-    output_nb: str | None = None,
+    input_nb: str | NoteBook | None = None,
+    output_nb: str | NoteBook | None = None,
```

Review Comment: I think it's OK to check their existence in the constructor.
[airflow] 01/01: Read logs from all containers in KPO
This is an automated email from the ASF dual-hosted git repository. dstandish pushed a commit to branch mlnsharma/main in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 292c23283e3f97b8d281fbcd581c2f4d7abfa98e
Author: Narasimha Sharma
AuthorDate: Mon Jan 16 22:10:55 2023 -0800

    Read logs from all containers in KPO
---
 .../cncf/kubernetes/operators/kubernetes_pod.py    | 13 +++-
 .../providers/cncf/kubernetes/utils/pod_manager.py | 87 --
 kubernetes_tests/test_kubernetes_pod_operator.py   |  2 +-
 .../cncf/kubernetes/utils/test_pod_manager.py      | 23 ++
 4 files changed, 117 insertions(+), 8 deletions(-)

```
diff --git a/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py b/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
index c34fc5c02b..96fb98fb53 100644
--- a/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
+++ b/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
@@ -168,6 +168,9 @@ class KubernetesPodOperator(BaseOperator):
     :param labels: labels to apply to the Pod. (templated)
     :param startup_timeout_seconds: timeout in seconds to startup the pod.
     :param get_logs: get the stdout of the container as logs of the tasks.
+    :param log_containers: list of container names or bool value to collect logs.
+        If bool value is True, all container logs are collected,
+        if False, only 'base' container logs are collected.
     :param image_pull_policy: Specify a policy to cache or always pull an image.
     :param annotations: non-identifying metadata you can attach to the Pod.
         Can be a large range of data, and can include characters
@@ -248,6 +251,7 @@ class KubernetesPodOperator(BaseOperator):
         reattach_on_restart: bool = True,
         startup_timeout_seconds: int = 120,
         get_logs: bool = True,
+        log_containers: list[str] | bool = False,
         image_pull_policy: str | None = None,
         annotations: dict | None = None,
         container_resources: k8s.V1ResourceRequirements | None = None,
@@ -311,6 +315,7 @@ class KubernetesPodOperator(BaseOperator):
         self.cluster_context = cluster_context
         self.reattach_on_restart = reattach_on_restart
         self.get_logs = get_logs
+        self.log_containers = log_containers
         self.image_pull_policy = image_pull_policy
         self.node_selector = node_selector or {}
         self.annotations = annotations or {}
@@ -496,9 +501,13 @@ class KubernetesPodOperator(BaseOperator):
         self.await_pod_start(pod=self.pod)

         if self.get_logs:
-            self.pod_manager.fetch_container_logs(
+            # if log_containers is False, fetch logs from the base container; otherwise
+            # fetch logs for all containers or for the specified list of container names
+            self.pod_manager.fetch_input_container_logs(
                 pod=self.pod,
-                container_name=self.BASE_CONTAINER_NAME,
+                log_containers=(
+                    self.log_containers if self.log_containers else [self.BASE_CONTAINER_NAME]
+                ),
                 follow=True,
             )
         else:
diff --git a/airflow/providers/cncf/kubernetes/utils/pod_manager.py b/airflow/providers/cncf/kubernetes/utils/pod_manager.py
index 56ef95eef0..1d1b2f06a6 100644
--- a/airflow/providers/cncf/kubernetes/utils/pod_manager.py
+++ b/airflow/providers/cncf/kubernetes/utils/pod_manager.py
@@ -84,6 +84,20 @@ def container_is_running(pod: V1Pod, container_name: str) -> bool:
     return container_status.state.running is not None


+def container_is_completed(pod: V1Pod, container_name: str) -> bool:
+    """
+    Examines V1Pod ``pod`` to determine whether ``container_name`` has completed.
+    If that container is present and completed, returns True. Returns False otherwise.
+    """
+    container_statuses = pod.status.container_statuses if pod and pod.status else None
+    if not container_statuses:
+        return False
+    container_status = next((status for status in container_statuses if status.name == container_name), None)
+    if not container_status:
+        return False
+    return container_status.state.terminated is not None
+
+
 def get_container_termination_message(pod: V1Pod, container_name: str):
     with suppress(AttributeError, TypeError):
         container_statuses = pod.status.container_statuses
@@ -264,15 +278,68 @@ class PodManager(LoggingMixin):
             )
             time.sleep(1)

-    def await_container_completion(self, pod: V1Pod, container_name: str) -> None:
+    def fetch_input_container_logs(
+        self, pod: V1Pod, log_containers: list[str] | bool, follow=False
+    ) -> list[PodLoggingStatus]:
+        """
+        Follow the logs of containers in the pod specified by input parameter and stream to airflow logging.
+        Returns when all the contain
```
[airflow] branch mlnsharma/main created (now 292c23283e)
This is an automated email from the ASF dual-hosted git repository. dstandish pushed a change to branch mlnsharma/main in repository https://gitbox.apache.org/repos/asf/airflow.git at 292c23283e Read logs from all containers in KPO This branch includes the following new commits: new 292c23283e Read logs from all containers in KPO The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[GitHub] [airflow] mlnsharma opened a new pull request, #28981: Implemented log_containers option to read from all containers in KubernetesPodOperator
mlnsharma opened a new pull request, #28981: URL: https://github.com/apache/airflow/pull/28981 Added a new `log_containers` param to KubernetesPodOperator. This param can take a list of container names or a boolean True/False as input, with the default being boolean False. If False (the default), only 'base' container logs are displayed, which is backward compatible. If True, all container logs (except for the airflow-xcom-sidecar container) are displayed. If a list of container names is provided, only those containers' logs are displayed. related: #27282
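A hedged usage sketch of the parameter as proposed above (the PR is not merged, so the name and semantics may still change):

```python
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

all_logs = KubernetesPodOperator(
    task_id="all_container_logs",
    image="busybox",
    get_logs=True,
    log_containers=True,  # proposed: stream logs from every container
)
selected_logs = KubernetesPodOperator(
    task_id="selected_container_logs",
    image="busybox",
    get_logs=True,
    log_containers=["base", "istio-proxy"],  # proposed: only these containers
)
```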
[GitHub] [airflow] mlnsharma closed pull request #27802: Read logs from all containers in pod for KubernetesPodOperator
mlnsharma closed pull request #27802: Read logs from all containers in pod for KubernetesPodOperator URL: https://github.com/apache/airflow/pull/27802
[GitHub] [airflow] AndrewTsao commented on pull request #28191: Add custom pickling hooks to LazyXComAccess
AndrewTsao commented on PR #28191: URL: https://github.com/apache/airflow/pull/28191#issuecomment-1384749135 Hi, I tested #28146 and got another exception like this. My MySQL version is 5.7, Python 3.8, OS is CentOS 7.9. Python module list:

```
apache-airflow                            2.5.1rc1
apache-airflow-providers-celery           3.0.0
apache-airflow-providers-common-sql       1.0.0
apache-airflow-providers-ftp              3.0.0
apache-airflow-providers-http             3.0.0
apache-airflow-providers-imap             3.0.0
apache-airflow-providers-microsoft-mssql  3.1.0
apache-airflow-providers-microsoft-psrp   2.1.0
apache-airflow-providers-microsoft-winrm  3.0.0
apache-airflow-providers-mysql            3.0.0
apache-airflow-providers-redis            3.0.0
apache-airflow-providers-samba            4.0.0
apache-airflow-providers-sftp             3.0.0
apache-airflow-providers-sqlite           3.0.0
apache-airflow-providers-ssh              3.0.0
Flask-SQLAlchemy                          2.5.1
marshmallow-sqlalchemy                    0.26.1
mysql-connector-python                    8.0.29
mysqlclient                               2.1.1
pymssql                                   2.2.5
SQLAlchemy                                1.4.46
SQLAlchemy-JSONField                      1.0.0
SQLAlchemy-Utils                          0.38.3
```

```sh
[2023-01-17, 09:38:01 CST] {process_utils.py:179} INFO - Executing cmd: /home/andi/airflow/venv38/bin/python -m virtualenv /tmp/venvtb4apfe2 --system-site-packages
[2023-01-17, 09:38:01 CST] {process_utils.py:183} INFO - Output:
[2023-01-17, 09:38:02 CST] {process_utils.py:187} INFO - created virtual environment CPython3.8.0.final.0-64 in 236ms
[2023-01-17, 09:38:02 CST] {process_utils.py:187} INFO - creator CPython3Posix(dest=/tmp/venvtb4apfe2, clear=False, no_vcs_ignore=False, global=True)
[2023-01-17, 09:38:02 CST] {process_utils.py:187} INFO - seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/andi/.local/share/virtualenv)
[2023-01-17, 09:38:02 CST] {process_utils.py:187} INFO - added seed packages: pip==22.3.1, setuptools==65.6.3, wheel==0.38.4
[2023-01-17, 09:38:02 CST] {process_utils.py:187} INFO - activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
[2023-01-17, 09:38:02 CST] {process_utils.py:179} INFO - Executing cmd: /tmp/venvtb4apfe2/bin/pip install -r /tmp/venvtb4apfe2/requirements.txt
[2023-01-17, 09:38:02 CST] {process_utils.py:183} INFO - Output:
[2023-01-17, 09:38:03 CST] {process_utils.py:187} INFO - Looking in indexes: http://pypi:8081
[2023-01-17, 09:38:08 CST] {process_utils.py:179} INFO - Executing cmd: /tmp/venvtb4apfe2/bin/python /tmp/venvtb4apfe2/script.py /tmp/venvtb4apfe2/script.in /tmp/venvtb4apfe2/script.out /tmp/venvtb4apfe2/string_args.txt
[2023-01-17, 09:38:08 CST] {process_utils.py:183} INFO - Output:
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO - Traceback (most recent call last):
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -   File "/home/andi/airflow/venv38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -     self.dialect.do_execute(
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -   File "/home/andi/airflow/venv38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -     cursor.execute(statement, parameters)
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -   File "/home/andi/airflow/venv38/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -     res = self._query(query)
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -   File "/home/andi/airflow/venv38/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -     db.query(q)
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -   File "/home/andi/airflow/venv38/lib/python3.8/site-packages/MySQLdb/connections.py", line 254, in query
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO -     _mysql.connection.query(self, query)
[2023-01-17, 09:38:10 CST] {process_utils.py:187} INFO - MySQLdb.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%s AND xcom.task_id = %s AND xcom.dag_id = %s AND xcom.run_id = %s ORDER BY xcom'
```
[GitHub] [airflow] uranusjr commented on a diff in pull request #28979: Fix rendering parameters in PapermillOperator
uranusjr commented on code in PR #28979: URL: https://github.com/apache/airflow/pull/28979#discussion_r1071672038

## airflow/providers/papermill/operators/papermill.py: ##

```
@@ -33,6 +33,8 @@ class NoteBook(File):
     """Jupyter notebook"""

+    template_fields: ClassVar = {*File.template_fields, "parameters"}
```

Review Comment:
```suggestion
template_fields: ClassVar[str] = {*File.template_fields, "parameters"}
```
(I don’t think the annotation is actually needed at all though; it should inherit the type from `File`.)
[GitHub] [airflow] uranusjr opened a new pull request, #28980: Ensure non-JSON types go through encoder
uranusjr opened a new pull request, #28980: URL: https://github.com/apache/airflow/pull/28980 Additional encoding logic is added for frozenset and set so the output is JSON-compatible. Also removed some unnecessary error handling. The default encoder simply raises TypeError, so we don't need to catch TypeError from serialize(). Fix #28741.
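The general pattern (a sketch of the idea, not the PR's actual code): a custom `default` converts set-like values and otherwise defers to the base class, which raises `TypeError` on its own:

```python
import json

class SetFriendlyEncoder(json.JSONEncoder):  # illustrative name
    def default(self, o):
        if isinstance(o, (set, frozenset)):
            return sorted(o)  # sets have no JSON form; emit a stable list
        return super().default(o)  # base class raises TypeError itself

print(json.dumps({"ids": frozenset({2, 1})}, cls=SetFriendlyEncoder))  # {"ids": [1, 2]}
```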
[GitHub] [airflow] Taragolis commented on a diff in pull request #28979: Fix rendering parameters in PapermillOperator
Taragolis commented on code in PR #28979: URL: https://github.com/apache/airflow/pull/28979#discussion_r1071666872

## airflow/providers/papermill/operators/papermill.py: ##

```
@@ -57,36 +59,43 @@ class PapermillOperator(BaseOperator):
 def __init__(
     self,
     *,
-    input_nb: str | None = None,
-    output_nb: str | None = None,
+    input_nb: str | NoteBook | None = None,
+    output_nb: str | NoteBook | None = None,
```

Review Comment: Maybe we should also make these fields mandatory.
[GitHub] [airflow] Taragolis commented on pull request #28979: Fix rendering parameters in PapermillOperator
Taragolis commented on PR #28979: URL: https://github.com/apache/airflow/pull/28979#issuecomment-1384729239 cc: @TPapajCin @marvinfretly @nicnguyen3103
[GitHub] [airflow] Taragolis opened a new pull request, #28979: Fix rendering parameters in PapermillOperator
Taragolis opened a new pull request, #28979: URL: https://github.com/apache/airflow/pull/28979 Add missing "parameters" into `NoteBook`'s templated fields and remove redundant loop on inlets/outlets. related: #28977
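With `parameters` templated, Jinja values render before papermill executes the notebook, addressing the `{{ ds }}` breakage tracked in #26931; a usage sketch with illustrative paths:

```python
from airflow.providers.papermill.operators.papermill import PapermillOperator

run_report = PapermillOperator(
    task_id="run_report",
    input_nb="/opt/notebooks/report.ipynb",
    output_nb="/opt/notebooks/report-{{ ds }}.ipynb",
    parameters={"execution_date": "{{ ds }}"},  # rendered now that it is a templated field
)
```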
[GitHub] [airflow] uranusjr opened a new pull request, #28978: Minor improvements to serde helpers
uranusjr opened a new pull request, #28978: URL: https://github.com/apache/airflow/pull/28978 No functional changes, just some cleanups and docstring formatting. Extracted from #28742
[GitHub] [airflow] uranusjr closed pull request #28742: Fix serialize() usages in custom JSON encoders
uranusjr closed pull request #28742: Fix serialize() usages in custom JSON encoders URL: https://github.com/apache/airflow/pull/28742
[GitHub] [airflow] uranusjr commented on pull request #28742: Fix serialize() usages in custom JSON encoders
uranusjr commented on PR #28742: URL: https://github.com/apache/airflow/pull/28742#issuecomment-1384707382 Rethinking this, I _think_ this can simply be achieved by a better implemented `default`. I’ll split out that main change and other minor improvements here into two PRs so things are easier to manage.
[GitHub] [airflow] shyft-mike commented on a diff in pull request #28846: Updated app to support configuring the caching hash method for FIPS
shyft-mike commented on code in PR #28846: URL: https://github.com/apache/airflow/pull/28846#discussion_r1071642334

## airflow/www/app.py: ##

```
@@ -131,7 +132,29 @@ def create_app(config=None, testing=False):
     init_robots(flask_app)

+    # Configure caching
+    webserver_caching_hash_method = conf.get(section="webserver", key="CACHING_HASH_METHOD", fallback=None)
     cache_config = {"CACHE_TYPE": "flask_caching.backends.filesystem", "CACHE_DIR": gettempdir()}
```

Review Comment: Ah, great, thanks! Addressing the other comments now 👍
[GitHub] [airflow] github-actions[bot] commented on issue #28399: Dataproc Generator should allow creating spot instances
github-actions[bot] commented on issue #28399: URL: https://github.com/apache/airflow/issues/28399#issuecomment-1384687968 This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
[GitHub] [airflow] ferruzzi commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
ferruzzi commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071626275

## airflow/providers/amazon/aws/hooks/s3.py: ##

```
@@ -42,6 +42,7 @@
 from airflow.exceptions import AirflowException
 from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure
 from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.providers.amazon.aws.utils.aws_api import format_tags
```

Review Comment: I could get behind that.
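For reference, a hedged sketch of what such a helper could look like; the signature and module placement are exactly what this thread is debating, so treat the names as assumptions:

```python
from __future__ import annotations

def format_tags(source: dict | None, *, key_label: str = "Key", value_label: str = "Value") -> list[dict]:
    """Convert {"name": "value"} pairs into boto-style [{"Key": "name", "Value": "value"}]."""
    if source is None:
        return []
    return [{key_label: key, value_label: value} for key, value in source.items()]

assert format_tags({"env": "prod"}) == [{"Key": "env", "Value": "prod"}]
```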
[GitHub] [airflow] potiuk closed issue #26931: Templating (like {{ ds }} ) stopped working in papermill after upgrade from 2.3.x to 2.4.x
potiuk closed issue #26931: Templating (like {{ ds }} ) stopped working in papermill after upgrade from 2.3.x to 2.4.x URL: https://github.com/apache/airflow/issues/26931
[GitHub] [airflow] Dinghang commented on a diff in pull request #26639: Multi-threads support for processing diff queues in Kubernetes Executor
Dinghang commented on code in PR #26639: URL: https://github.com/apache/airflow/pull/26639#discussion_r1071622999

## airflow/executors/kubernetes_executor.py: ##

```
@@ -62,6 +64,50 @@
 KubernetesWatchType = Tuple[str, str, Optional[str], Dict[str, str], str]

+def multi_threads_queue_process(
+    queue_size: int,
+    queue_type: str,
+    process_method: Callable,
+    max_threads: int,
+    log: Logger,
+    batch_size: Optional[int] = None,
+) -> None:
+    """
+    Helper method to enable multi-threads for processing queues used with kubernetes executor
+    :param queue_size: the size of the queue getting processed
+    :param queue_type: the type of the queue
+    :param process_method: the real method processing the queue
+    :param max_threads: the max num of threads to be used
+    :param log: log
+    :param batch_size: the max num of items we want to process in this round.
+        If it's not set, the current queue size will be used.
+    """
+    if queue_size == 0:
+        log.info(f'There is no item to process in the {queue_type} queue.')
+        return
+
+    start_time = time.time()
+    log.info(f'Start processing {queue_type} queue with at most {max_threads} threads.')
+
+    batch_size = min(batch_size or queue_size, queue_size)
+    max_threads = min(max_threads, queue_size)
+
+    threads = []
+    quotient, remainder = divmod(batch_size, max_threads)
+    for i in range(max_threads):
+        sub_batch_size = quotient + 1 if i < remainder else quotient
+        t = Thread(target=process_method, args=[sub_batch_size])
+        threads.append(t)
+        t.start()
+    for t in threads:
+        t.join()
```

Review Comment: Hi @dstandish, thanks for the info. Most likely we would like to keep the current implementation. About "one thread may finish more quickly than the others": this could happen, but it's not a problem, since what we are focusing on improving is the extreme case with hundreds or thousands of items in the queue. Overall, threads make it more efficient.
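The `divmod` split distributes a batch as evenly as possible across the threads; a worked example of just that partitioning logic:

```python
batch_size, max_threads = 10, 3
quotient, remainder = divmod(batch_size, max_threads)  # 3, 1
sizes = [quotient + 1 if i < remainder else quotient for i in range(max_threads)]
assert sizes == [4, 3, 3] and sum(sizes) == batch_size  # every item is assigned
```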
[GitHub] [airflow] potiuk commented on pull request #28548: Move comment to the right position
potiuk commented on PR #28548: URL: https://github.com/apache/airflow/pull/28548#issuecomment-1384656059 Closing the change; it seems not justified.
[GitHub] [airflow] potiuk closed pull request #28548: Move comment to the right position
potiuk closed pull request #28548: Move comment to the right position URL: https://github.com/apache/airflow/pull/28548
[GitHub] [airflow] hussein-awala commented on a diff in pull request #28961: Emit DataDog statsd metrics with metadata tags
hussein-awala commented on code in PR #28961: URL: https://github.com/apache/airflow/pull/28961#discussion_r1071620930

## airflow/config_templates/config.yml: ##

```
@@ -851,6 +851,13 @@ metrics:
       type: string
       example: ~
       default: ""
+    statsd_datadog_metrics_tags:
+      description: |
+        If set to True, Airflow will add metadata as tags for some of the emitted metrics
+      version_added: 2.6.0
+      type: boolean
+      example: ~
+      default: "False"
```

Review Comment: Nothing needs to be changed, I'll change the default value to `True`
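Assuming the option ships as described, it would be toggled like any other Airflow config entry; a hedged sketch using the environment-variable form:

```python
import os

# Equivalent to setting statsd_datadog_metrics_tags = True under [metrics] in airflow.cfg.
os.environ["AIRFLOW__METRICS__STATSD_DATADOG_METRICS_TAGS"] = "True"
```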
[airflow] branch main updated (3adaa9b34e -> 65010fda09)
This is an automated email from the ASF dual-hosted git repository. pierrejeambrun pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

from 3adaa9b34e AWS system tests selective log purge (#28819)
add  65010fda09 Fix for LocalKubernetesExecutor scheduler is not serving logs (#28638)

No new revisions were added by this update.

Summary of changes:
 airflow/cli/commands/scheduler_command.py          | 2 +-
 airflow/executors/base_executor.py                 | 2 ++
 airflow/executors/celery_kubernetes_executor.py    | 2 ++
 airflow/executors/local_executor.py                | 2 ++
 airflow/executors/local_kubernetes_executor.py     | 2 ++
 airflow/executors/sequential_executor.py           | 2 ++
 tests/cli/commands/test_scheduler_command.py       | 1 +
 tests/executors/test_base_executor.py              | 4 ++++
 tests/executors/test_celery_kubernetes_executor.py | 3 +++
 tests/executors/test_local_executor.py             | 3 +++
 tests/executors/test_local_kubernetes_executor.py  | 3 +++
 tests/executors/test_sequential_executor.py        | 3 +++
 12 files changed, 28 insertions(+), 1 deletion(-)
[GitHub] [airflow] hussein-awala commented on a diff in pull request #28961: Emit DataDog statsd metrics with metadata tags
hussein-awala commented on code in PR #28961: URL: https://github.com/apache/airflow/pull/28961#discussion_r1071620409

## airflow/dag_processing/processor.py: ##

```
@@ -484,7 +486,13 @@ def manage_slas(self, dag: DAG, session: Session = None) -> None:
                 callback(dag, task_list, blocking_task_list, slas, blocking_tis)
                 notification_sent = True
             except Exception:
-                Stats.incr("sla_callback_notification_failure")
+                Stats.incr(
+                    "sla_callback_notification_failure",
+                    tags={
+                        "dag_id": dag.dag_id,
+                        "func_name": callback.func_name,  # type: ignore[attr-defined]
```

Review Comment: Actually this is the way to get the callback name in Python 2, but it was removed in Python 3 and replaced by `callback.__name__`, so we need to fix this. Would you prefer I fix it here or in a separate PR?
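A quick illustration of the Python 2 vs. Python 3 difference being discussed:

```python
def on_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    pass

print(on_sla_miss.__name__)  # 'on_sla_miss' on Python 3
# on_sla_miss.func_name      # Python 2 only; raises AttributeError on Python 3
```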
[GitHub] [airflow] pierrejeambrun merged pull request #28638: Fix for LocalKubernetesExecutor scheduler is not serving logs
pierrejeambrun merged PR #28638: URL: https://github.com/apache/airflow/pull/28638
[GitHub] [airflow] Dinghang commented on a diff in pull request #26639: Multi-threads support for processing diff queues in Kubernetes Executor
Dinghang commented on code in PR #26639: URL: https://github.com/apache/airflow/pull/26639#discussion_r1052930708

## airflow/executors/kubernetes_executor.py: ##

```
@@ -370,7 +416,17 @@ def sync(self) -> None:
     """
     self.log.debug("Syncing KubernetesExecutor")
     self._health_check_kube_watcher()
-    while True:
+
+    multi_threads_queue_process(
+        queue_size=self.watcher_queue.qsize(),
+        queue_type='watcher',
+        process_method=self.process_watcher_queue,
```

Review Comment: Hi @dstandish, I'm not sure what you meant by network calls here. process_watcher_queue is just a method to process items in the watcher queue.

## airflow/executors/kubernetes_executor.py: ##

```
@@ -599,8 +655,31 @@ def sync(self) -> None:
     raise AirflowException(NOT_STARTED_MESSAGE)
     self.kube_scheduler.sync()
-    last_resource_version = None
-    while True:
+    """processing result queue"""
+    multi_threads_queue_process(
+        queue_size=self.result_queue.qsize(),
+        queue_type='result',
+        process_method=self.process_result_queue,
```

Review Comment: Hi @hterik, that's a good point! Both self.running and self.event_buffer are doing some atomic operations. AFAIK, they are thread-safe.

## airflow/executors/kubernetes_executor.py: ## (same `@@ -370,7 +416,17 @@` hunk as above)

Review Comment: Hi @dstandish, sorry for the late reply. I was AFK due to some personal issues and just got back. I will address comments accordingly and ping you again once it's ready.
[GitHub] [airflow] ashb commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
ashb commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071617889

## airflow/providers/amazon/aws/hooks/s3.py: ##

```
@@ -42,6 +42,7 @@
 from airflow.exceptions import AirflowException
 from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure
 from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.providers.amazon.aws.utils.aws_api import format_tags
```

Review Comment: `airflow/providers/amazon/aws/utils/tags.py`?
[GitHub] [airflow] Dinghang commented on pull request #26639: Multi-threads support for processing diff queues in Kubernetes Executor
Dinghang commented on PR #26639: URL: https://github.com/apache/airflow/pull/26639#issuecomment-1384642241 Hi team, I'm going to spend some time addressing comments this week; expect a reply next week.
[GitHub] [airflow] snjypl commented on pull request #28638: Fix for LocalKubernetesExecutor scheduler is not serving logs
snjypl commented on PR #28638: URL: https://github.com/apache/airflow/pull/28638#issuecomment-1384637725 > Could use a rebase before merging, it's 20+ commits behind master Done.
[GitHub] [airflow] snjypl opened a new pull request, #28976: Migrate DagFileProcessorManager.clear_nonexistent_import_errors to internal API
snjypl opened a new pull request, #28976: URL: https://github.com/apache/airflow/pull/28976 Fixes: #28785
[GitHub] [airflow] potiuk closed pull request #28540: Using pypi proxy for speeding up development
potiuk closed pull request #28540: Using pypi proxy for speeding up development URL: https://github.com/apache/airflow/pull/28540
[GitHub] [airflow] potiuk commented on pull request #28540: Using pypi proxy for speeding up development
potiuk commented on PR #28540: URL: https://github.com/apache/airflow/pull/28540#issuecomment-1384632819 Now that I think of it, this is not really needed; it seems like a really niche case.
[GitHub] [airflow] JGoldman110 commented on issue #28947: Status of testing of Apache Airflow 2.5.1rc1
JGoldman110 commented on issue #28947: URL: https://github.com/apache/airflow/issues/28947#issuecomment-1384632106 #28788 - Swagger UI looks good
[GitHub] [airflow] potiuk commented on pull request #28457: Add deferrable mode to DataprocCreateBatchOperator
potiuk commented on PR #28457: URL: https://github.com/apache/airflow/pull/28457#issuecomment-1384630289 Needs conflict resolution.
[GitHub] [airflow] potiuk commented on pull request #28520: Set pip install opts in _PIP_ADDITIONAL_REQUIREMENTS
potiuk commented on PR #28520: URL: https://github.com/apache/airflow/pull/28520#issuecomment-1384614065 Closing this one, as it seems to be achievable via PIP_* variables.
[GitHub] [airflow] potiuk closed pull request #28520: Set pip install opts in _PIP_ADDITIONAL_REQUIREMENTS
potiuk closed pull request #28520: Set pip install opts in _PIP_ADDITIONAL_REQUIREMENTS URL: https://github.com/apache/airflow/pull/28520
[GitHub] [airflow] vincbeck commented on a diff in pull request #28964: Add transfer operator S3 to (generic) SQL
vincbeck commented on code in PR #28964: URL: https://github.com/apache/airflow/pull/28964#discussion_r1071602052

## airflow/providers/amazon/aws/transfers/s3_to_sql.py: ##

```
@@ -0,0 +1,158 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import os
+from typing import TYPE_CHECKING, Any, Sequence
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base import BaseHook
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.s3 import S3Hook
+from airflow.providers.common.sql.hooks.sql import DbApiHook
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+    from typing_extensions import Literal
+
+try:
+    import csv as csv
+except ImportError as e:
+    from airflow.exceptions import AirflowOptionalProviderFeatureException
+
+    raise AirflowOptionalProviderFeatureException from e
+
+
+class S3ToSqlOperator(BaseOperator):
+    """
+    Loads Data from S3 into a SQL Database.
+    Data should be readable as CSV.
+
+    This operator downloads a file from an S3, reads it via `csv.reader`
+    and inserts the data into a SQL database using `insert_rows` method.
+    All SQL hooks are supported, as long as it is of type DbApiHook
```

Review Comment: I don't think it makes sense as a default, but I do think that providing at least one parser in code or documentation makes sense. A CSV parser would be excellent!
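A hedged sketch of what a user-supplied parser could look like, assuming the operator ends up accepting a callable that takes a local file path and returns rows for `DbApiHook.insert_rows()` (the parameter name and contract are still under review here):

```python
import csv
from collections.abc import Iterator

def parse_csv(filepath: str) -> Iterator[list[str]]:
    """Yield one row per CSV line, suitable for DbApiHook.insert_rows()."""
    with open(filepath, newline="") as file:
        yield from csv.reader(file)
```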
[GitHub] [airflow] potiuk commented on issue #27012: External python operator logging
potiuk commented on issue #27012: URL: https://github.com/apache/airflow/issues/27012#issuecomment-1384609843 Are there any actions from that?
[GitHub] [airflow] potiuk commented on pull request #27802: Read logs from all containers in pod for KubernetesPodOperator
potiuk commented on PR #27802: URL: https://github.com/apache/airflow/pull/27802#issuecomment-1384606835 Can you resolve the conflicts please @mlnsharma, so others can take a look at it?
[GitHub] [airflow] potiuk commented on issue #28452: TaskInstances do not succeed when using enable_logging=True option in DockerSwarmOperator
potiuk commented on issue #28452: URL: https://github.com/apache/airflow/issues/28452#issuecomment-1384606025 Marked it as a good first issue; hopefully someone will take a look at this one.
[GitHub] [airflow] harishkrao commented on a diff in pull request #28950: Sensor for Databricks partition and table changes
harishkrao commented on code in PR #28950: URL: https://github.com/apache/airflow/pull/28950#discussion_r1071586211

## airflow/providers/databricks/sensors/databricks.py: ##

```
@@ -0,0 +1,231 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""This module contains Databricks sensors."""
+
+from __future__ import annotations
+
+from datetime import datetime, timedelta
+from typing import Any, Callable, Sequence
+
+from airflow.exceptions import AirflowException
+from airflow.providers.common.sql.hooks.sql import fetch_all_handler
+from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.context import Context
+
+
+class DatabricksSqlSensor(BaseSensorOperator):
+    """
+    Generic Databricks SQL sensor.
```

Review Comment: Thank you, good points about keeping the SQL sensor generic. I am working on modifying the branch accordingly and will push the changes soon.
[airflow] branch main updated (cd637c223f -> 3adaa9b34e)
This is an automated email from the ASF dual-hosted git repository. onikolas pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

from cd637c223f Fix label name for `reauth` field in Docker Connection (#28974)
add  3adaa9b34e AWS system tests selective log purge (#28819)

No new revisions were added by this update.

Summary of changes:
 tests/system/providers/amazon/aws/example_batch.py | 21 +--
 tests/system/providers/amazon/aws/example_glue.py  | 30 +++-
 .../system/providers/amazon/aws/example_lambda.py  | 22 ++--
 .../providers/amazon/aws/example_sagemaker.py      | 27 +++---
 .../amazon/aws/example_sagemaker_endpoint.py       | 32 +
 .../system/providers/amazon/aws/utils/__init__.py  | 42 --
 6 files changed, 103 insertions(+), 71 deletions(-)
[GitHub] [airflow] o-nikolas merged pull request #28819: AWS system tests selective log purge
o-nikolas merged PR #28819: URL: https://github.com/apache/airflow/pull/28819 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pierrejeambrun commented on pull request #28638: Fix for LocalKubernetesExecutor scheduler is not serving logs
pierrejeambrun commented on PR #28638: URL: https://github.com/apache/airflow/pull/28638#issuecomment-1384545009 Could use a rebase before merging, it's 20+ commits behind master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] ferruzzi commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
ferruzzi commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071576124 ## airflow/providers/amazon/aws/hooks/s3.py: ## @@ -42,6 +42,7 @@ from airflow.exceptions import AirflowException from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook +from airflow.providers.amazon.aws.utils.aws_api import format_tags Review Comment: Pretty sure Jarek had mentioned not liking stuff in the __init__ files, so I've been trying to avoid it when I can think of a better name. Sometimes it's easier than others. Naming is hard. In this case I think aws_api.py is a sound option. @eladkal - do you definitely want it changed or were you just making sure I was good with it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
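[Editor's note] For readers following this naming thread, here is a rough sketch of what a dictionary-to-boto-style conversion helper looks like. The `format_tags` name is taken from the import shown in the diff above; the exact signature and the `key_label`/`value_label` parameters are assumptions made for illustration, not necessarily the PR's final API:

```python
from __future__ import annotations

from typing import Any


def format_tags(source: dict[str, Any], *, key_label: str = "Key", value_label: str = "Value") -> list[dict[str, Any]]:
    """Convert a plain dict into the list-of-key/value-dicts shape many AWS APIs expect for tags."""
    return [{key_label: key, value_label: value} for key, value in source.items()]


# Example: {"env": "prod"} becomes [{"Key": "env", "Value": "prod"}]
print(format_tags({"env": "prod", "team": "data"}))
```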
[airflow] branch constraints-main updated: Updating constraints. Build id:
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-main by this push: new 66a94cc546 Updating constraints. Build id: 66a94cc546 is described below commit 66a94cc546000f646d09fe7bb6e04fe181cf2797 Author: Automated GitHub Actions commit AuthorDate: Mon Jan 16 20:55:37 2023 + Updating constraints. Build id: This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of '' in '' with commit sha . All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.10.txt | 9 - constraints-3.7.txt | 8 constraints-3.8.txt | 8 constraints-3.9.txt | 9 - constraints-no-providers-3.10.txt | 2 +- constraints-no-providers-3.7.txt | 2 +- constraints-no-providers-3.8.txt | 2 +- constraints-no-providers-3.9.txt | 2 +- constraints-source-providers-3.10.txt | 9 - constraints-source-providers-3.7.txt | 8 constraints-source-providers-3.8.txt | 8 constraints-source-providers-3.9.txt | 9 - 12 files changed, 36 insertions(+), 40 deletions(-) diff --git a/constraints-3.10.txt b/constraints-3.10.txt index 378869f182..191fb8b770 100644 --- a/constraints-3.10.txt +++ b/constraints-3.10.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2023-01-16T00:27:32Z +# This constraints file was automatically generated on 2023-01-16T20:54:58Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -323,7 +323,6 @@ idna==3.4 ijson==3.2.0.post0 imagesize==1.4.1 importlib-metadata==6.0.0 -importlib-resources==5.10.2 impyla==0.18.0 incremental==22.10.0 inflection==0.5.1 @@ -400,8 +399,8 @@ ntlm-auth==1.5.0 numpy==1.22.4 oauthlib==3.2.2 objsize==0.6.1 -openapi-schema-validator==0.3.4 -openapi-spec-validator==0.5.1 +openapi-schema-validator==0.4.0 +openapi-spec-validator==0.5.2 opsgenie-sdk==2.1.5 oracledb==1.2.1 orjson==3.8.5 @@ -513,7 +512,7 @@ rfc3986==1.5.0 rich-click==1.6.0 rich==13.1.0 rsa==4.9 -ruff==0.0.222 +ruff==0.0.223 s3transfer==0.6.0 sarif-om==1.0.4 sasl==0.3.1 diff --git a/constraints-3.7.txt b/constraints-3.7.txt index ea660f9657..d6eba00248 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2023-01-16T00:28:07Z +# This constraints file was automatically generated on 2023-01-16T20:55:34Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -400,8 +400,8 @@ ntlm-auth==1.5.0 numpy==1.21.1 oauthlib==3.2.2 objsize==0.6.1 -openapi-schema-validator==0.3.4 -openapi-spec-validator==0.5.1 +openapi-schema-validator==0.4.0 +openapi-spec-validator==0.5.2 opsgenie-sdk==2.1.5 oracledb==1.2.1 orjson==3.8.5 @@ -513,7 +513,7 @@ rfc3986==1.5.0 rich-click==1.6.0 rich==13.1.0 rsa==4.9 -ruff==0.0.222 +ruff==0.0.223 s3transfer==0.6.0 sarif-om==1.0.4 sasl==0.3.1 diff --git a/constraints-3.8.txt b/constraints-3.8.txt index 55d1ca9c01..a6ec001096 100644 --- a/constraints-3.8.txt +++ b/constraints-3.8.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2023-01-16T00:27:59Z +# This constraints file was automatically generated on 2023-01-16T20:55:28Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -401,8 +401,8 @@ ntlm-auth==1.5.0 numpy==1.22.4 oauthlib==3.2.2 objsize==0.6.1 -openapi-schema-validator==0.3.4 -openapi-spec-validator==0.5.1 +openapi-schema-validator==0.4.0 +openapi-spec-validator==0.5.2 opsgenie-sdk==2.1.5 oracledb==1.2.1 orjson==3.8.5 @@ -515,7 +515,7 @@ rfc3986==1.5.0 rich-click==1.6.0 rich==13.1.0 rsa==4.9 -ruff==0.0.222 +ruff==0.0.223 s3transfer==0.6.0 sarif-om==1.0.4 sasl==0.3.1 diff --git a/constraints-3.9.txt b/constraints-3.9.txt index 4de6daf953..e2bad7f3e6 100644 --- a/constraints-3.9.txt +++ b/constraints-3.9.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2023-01-16T00:27:57Z +# This
[GitHub] [airflow] ferruzzi commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
ferruzzi commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071472110 ## airflow/providers/amazon/aws/hooks/s3.py: ## @@ -42,6 +42,7 @@ from airflow.exceptions import AirflowException from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook +from airflow.providers.amazon.aws.utils.aws_api import format_tags Review Comment: I'm not sure what else you'd name the module here that would make more sense. It could be `boto_api` or `boto_utils` maybe, but I'm not sure either of those is any better, really. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] maggesssss commented on a diff in pull request #28964: Add transfer operator S3 to (generic) SQL
magges commented on code in PR #28964: URL: https://github.com/apache/airflow/pull/28964#discussion_r1071573009 ## airflow/providers/amazon/aws/transfers/s3_to_sql.py: ## @@ -0,0 +1,158 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from __future__ import annotations + +import os +from typing import TYPE_CHECKING, Any, Sequence + +from airflow.exceptions import AirflowException +from airflow.hooks.base import BaseHook +from airflow.models import BaseOperator +from airflow.providers.amazon.aws.hooks.s3 import S3Hook +from airflow.providers.common.sql.hooks.sql import DbApiHook + +if TYPE_CHECKING: +from airflow.utils.context import Context + +from typing_extensions import Literal + +try: +import csv as csv +except ImportError as e: +from airflow.exceptions import AirflowOptionalProviderFeatureException + +raise AirflowOptionalProviderFeatureException from e + + +class S3ToSqlOperator(BaseOperator): +""" +Loads Data from S3 into a SQL Database. +Data should be readable as CSV. + +This operator downloads a file from an S3, reads it via `csv.reader` +and inserts the data into a SQL database using `insert_rows` method. +All SQL hooks are supported, as long as it is of type DbApiHook Review Comment: @vincbeck I agree. I like the idea of using a `parser` parameter. Do you think it makes sense to use a default parser? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
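[Editor's note] To make the `parser` idea concrete, here is a minimal sketch of what a default parser callable might look like. The function name and the exact contract (a callable taking a local file path and yielding rows) are assumptions for illustration, not the final design of the PR:

```python
from __future__ import annotations

import csv
from typing import Iterator


def default_csv_parser(filepath: str) -> Iterator[list[str]]:
    """A possible default parser: stream rows out of a local CSV file."""
    with open(filepath, newline="") as file:
        yield from csv.reader(file)


# The operator would then call parser(local_path) and feed the resulting
# rows into DbApiHook.insert_rows().
```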
[GitHub] [airflow] ferruzzi commented on a diff in pull request #28819: AWS system tests selective log purge
ferruzzi commented on code in PR #28819: URL: https://github.com/apache/airflow/pull/28819#discussion_r1071571584 ## tests/system/providers/amazon/aws/utils/__init__.py: ## @@ -249,7 +251,43 @@ def set_env_id() -> str: return env_id -def purge_logs( +def all_tasks_passed(ti) -> bool: +task_runs = ti.get_dagrun().get_task_instances() +return all([_task.state != State.FAILED for _task in task_runs]) + + +@task(trigger_rule=TriggerRule.ALL_DONE) +def prune_logs( +logs: list[tuple[str, str | None]], +force_delete: bool = False, +retry: bool = False, +retry_times: int = 3, +ti=None, +): +""" +If all tasks in this dagrun have succeeded, then delete the associated logs. +Otherwise, append the logs with a retention policy. This allows the logs +to be used for troubleshooting but assures they won't build up indefinitely. + +:param logs: A list of log_group/stream_prefix tuples to delete. +:param force_delete: Whether to check log streams within the log group before +removal. If True, removes the log group and all its log streams inside it. +:param retry: Whether to retry if the log group/stream was not found. In some +cases, the log group/stream is created seconds after the main resource has +been created. By default, it retries for 3 times with a 5s waiting period. +:param retry_times: Number of retries. +:param ti: Used to check the status of the tasks. This gets pulled from the +DAG's context and does not need to be passed manually. +""" +if all_tasks_passed(ti): +_purge_logs(logs, force_delete, retry, retry_times) +else: +client: BaseClient = boto3.client("logs") +for group, _ in logs: +client.put_retention_policy(logGroupName=group, retentionInDays=30) Review Comment: This is a pretty small blast-radius and a very easy thing to change later. Let's merge it with 30 and if Ash (or anyone else, for that matter) has any strong feelings on it, it's a super easy change to make. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
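[Editor's note] For readers unfamiliar with these system-test utilities, a hedged usage sketch of the `prune_logs` task quoted above; the log group name is hypothetical, and the import path follows the file touched in the diff. Because `prune_logs` is a `@task`, calling it inside a DAG definition creates a teardown-style task:

```python
# Hypothetical teardown snippet for an AWS system test DAG.
from tests.system.providers.amazon.aws.utils import prune_logs

log_cleanup = prune_logs(
    logs=[("/aws-glue/jobs/output", None)],  # (log_group, stream_prefix) pairs; names made up
    force_delete=False,  # keep log streams unless the whole group should go
)
```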
[GitHub] [airflow] o-nikolas commented on pull request #28638: Fix for LocalKubernetesExecutor scheduler is not serving logs
o-nikolas commented on PR #28638: URL: https://github.com/apache/airflow/pull/28638#issuecomment-1384527196 Anyone have some time to give this a second review/approval? Would be nice to get this merged for @snjypl Maybe @potiuk, @dstandish or @pierrejeambrun? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] o-nikolas commented on pull request #28161: AIP-51 - Executor Coupling in Logging
o-nikolas commented on PR #28161: URL: https://github.com/apache/airflow/pull/28161#issuecomment-1384526101 Anyone have some time to give this a second review/approval? Would be nice to get this merged for @snjypl Maybe @potiuk, @eladkal or @pierrejeambrun? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] o-nikolas commented on a diff in pull request #28925: Add log for AWS Glue Job Console URL
o-nikolas commented on code in PR #28925: URL: https://github.com/apache/airflow/pull/28925#discussion_r1071561782 ## airflow/providers/amazon/aws/operators/glue.py: ## @@ -144,6 +149,20 @@ def execute(self, context: Context): self.wait_for_completion, ) glue_job_run = glue_job.initialize_job(self.script_args, self.run_job_kwargs) +glue_job_run_url = ( +f"{BASE_AWS_CONSOLE_LINK}/gluestudio/home?" ++ f"region={glue_job.conn_region_name}#/job/{urllib.parse.quote(self.job_name, safe='')}/run/" ++ glue_job_run["JobRunId"] +) Review Comment: Thanks for the PR! Could you reuse the format string from the Link class, so as not to duplicate it here? The two could easily get out of sync. Something like (NOTE: untested):

```suggestion
glue_job_run_url = GlueJobRunDetailsLink.format_str.format(
    region_name=glue_job.conn_region_name,
    job_name=urllib.parse.quote(self.job_name, safe=""),
    job_run_id=glue_job_run["JobRunId"],
)
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site
potiuk commented on PR #623: URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1384518413 There are a number of problems with the dependencies of the current approach (and it caused us some problems in the recent past), so I think this is something we have to do anyway soon. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site
potiuk commented on PR #623: URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1384517156 Hmm, I honestly think we should not invest too much in it. We are discussing switching to Pelican, which is going to change the process completely anyway. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] eladkal commented on issue #28938: Status of testing Providers that were prepared on January 14, 2023
eladkal commented on issue #28938: URL: https://github.com/apache/airflow/issues/28938#issuecomment-1384512167 Docker provider will be excluded from this release and I will cut RC2 for it after this wave -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] vincbeck commented on a diff in pull request #28900: [WIP] Convert DagFileProcessor.execute_callbacks to Internal API
vincbeck commented on code in PR #28900: URL: https://github.com/apache/airflow/pull/28900#discussion_r1071556627 ## airflow/api_internal/actions/dag.py: ## @@ -0,0 +1,42 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from __future__ import annotations + +from sqlalchemy import exc +from sqlalchemy.orm import Session + +from airflow.api_internal.internal_api_call import internal_api_call +from airflow.exceptions import TaskNotFound +from airflow.models import Operator +from airflow.utils.session import NEW_SESSION, provide_session + + +class InternalApiDagActions: +@staticmethod +@internal_api_call +@provide_session +def get_serialized_dag(dag_id: str, task_id: str, session: Session = NEW_SESSION) -> Operator | None: Review Comment: We can, I was just scared of introducing some cyclic dependencies but we can deal with them if they show up I guess -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
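[Editor's note] As a rough sketch of a call site (argument values hypothetical; this is a WIP PR, so the API may change), the point of `@internal_api_call` is that callers stay oblivious to whether the query runs locally or over the internal API:

```python
from airflow.api_internal.actions.dag import InternalApiDagActions  # module path per the diff above

# Hypothetical caller: with @internal_api_call, this either executes the DB
# query in-process or transparently proxies it to the internal API server,
# depending on the deployment mode.
operator = InternalApiDagActions.get_serialized_dag(dag_id="example_dag", task_id="extract")
```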
[GitHub] [airflow] o-nikolas commented on a diff in pull request #28934: Remove hard-coded executor-database coupling
o-nikolas commented on code in PR #28934: URL: https://github.com/apache/airflow/pull/28934#discussion_r1071551207 ## airflow/cli/commands/standalone_command.py: ## @@ -159,14 +159,28 @@ def calculate_env(self): # Make sure we're using a local executor flavour executor_class, _ = ExecutorLoader.import_default_executor_cls() if not executor_class.is_local: -if "sqlite" in conf.get("database", "sql_alchemy_conn"): -self.print_output("standalone", "Forcing executor to SequentialExecutor") -env["AIRFLOW__CORE__EXECUTOR"] = executor_constants.SEQUENTIAL_EXECUTOR -else: -self.print_output("standalone", "Forcing executor to LocalExecutor") -env["AIRFLOW__CORE__EXECUTOR"] = executor_constants.LOCAL_EXECUTOR +env["AIRFLOW__CORE__EXECUTOR"] = self.get_local_executor( +database=conf.get("database", "sql_alchemy_conn") +) +self.print_output("standalone", f"Forcing executor to {env['AIRFLOW__CORE__EXECUTOR']}") Review Comment: The more I reflect on this, the more I think the original code is actually fine. This logic is only executed if the executor selected by the user (or provided by the user via plugin) is not local, in which case we select a working default for them. However, they are completely able to implement/select a compatible local executor themselves if they wish to be opinionated about which executor to use. I think this logic and the helper method below actually might complicate things a bit more than it helps. WDYT? Also CC: @pierrejeambrun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk closed issue #28975: Additional info about Segmentation Violation in LocalTaskJob (#27381): @Taragolis
potiuk closed issue #28975: Additional info about Segmentation Violation in LocalTaskJob (#27381): @Taragolis URL: https://github.com/apache/airflow/issues/28975 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #28975: Additional info about Segmentation Violation in LocalTaskJob (#27381): @Taragolis
potiuk commented on issue #28975: URL: https://github.com/apache/airflow/issues/28975#issuecomment-1384495920 Wrong click I guess. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #28941: Tasks are in the queued status after restarting the redis container
potiuk commented on issue #28941: URL: https://github.com/apache/airflow/issues/28941#issuecomment-1384493990 Then it needs to be looked at by someone who knows Celery and Redis more than I do. I do not know Redis that much - I guess there are delays in processing and saving the stored data. If you kill Redis abruptly, by kill -9 for example, then (obviously, like any other software) it might lose some data that it keeps in memory, and absolutely nothing can be done about it. There will be hanging tasks in this case which you will have to clear. That's the usual recovery mechanism from catastrophic failures. No system in the world can be made resilient to it, really, unless you add a lot of operational overhead and redundancy (and if you would like to do that, then it is more of a deployment issue). I think you should make sure that you are stopping Redis in the "gentle" way that gives it a chance to flush everything to disk, and make sure that it is actually restoring it from there. Please then open a new issue with the Helm chart details and ideally showing all the logs (including debug logs showing Redis storing and restoring the data - to make sure that it actually happens). If you can reproduce the problem knowing that Redis is storing/restoring the queue, then I think that's something that someone who is a Celery expert should take a look at, so it's worth opening an issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] o-nikolas commented on a diff in pull request #28934: Remove hard-coded executor-database coupling
o-nikolas commented on code in PR #28934: URL: https://github.com/apache/airflow/pull/28934#discussion_r1071544358 ## airflow/sensors/base.py: ## @@ -257,11 +258,14 @@ def _get_next_poke_interval( def prepare_for_execution(self) -> BaseOperator: task = super().prepare_for_execution() + # Sensors in `poke` mode can block execution of DAGs when running # with single process executor, thus we change the mode to`reschedule` # to allow parallel task being scheduled and executed -if conf.get("core", "executor") == "DebugExecutor": -self.log.warning("DebugExecutor changes sensor mode to 'reschedule'.") +executor_name = conf.get("core", "executor") +executor = ExecutorLoader.load_executor(executor_name) Review Comment: You can just use `import_default_executor_cls` here I reckon ## airflow/cli/commands/standalone_command.py: ## @@ -159,14 +159,28 @@ def calculate_env(self): # Make sure we're using a local executor flavour executor_class, _ = ExecutorLoader.import_default_executor_cls() if not executor_class.is_local: -if "sqlite" in conf.get("database", "sql_alchemy_conn"): -self.print_output("standalone", "Forcing executor to SequentialExecutor") -env["AIRFLOW__CORE__EXECUTOR"] = executor_constants.SEQUENTIAL_EXECUTOR -else: -self.print_output("standalone", "Forcing executor to LocalExecutor") -env["AIRFLOW__CORE__EXECUTOR"] = executor_constants.LOCAL_EXECUTOR +env["AIRFLOW__CORE__EXECUTOR"] = self.get_local_executor( +database=conf.get("database", "sql_alchemy_conn") +) +self.print_output("standalone", f"Forcing executor to {env['AIRFLOW__CORE__EXECUTOR']}") return env +@classmethod +def get_local_executor(cls, database: str) -> str: +""" +Get local executor for standalone command +for sqlite we need SEQUENTIAL_EXECUTOR otherwise LOCAL_EXECUTOR. +""" +try: +return [ +executor_name +for executor_name in ExecutorLoader.executors.keys() +if executor_name != "DaskExecutor" Review Comment: This looks like executor coupling, why are we excluding the Dask Executor here specifically? Could there ever be other executors that would need excluding in the same way? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
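[Editor's note] To sketch the first review suggestion concretely (untested; `import_default_executor_cls` and the `is_local` attribute both appear in the diffs quoted in this thread):

```python
from airflow.executors.executor_loader import ExecutorLoader

# Resolve the configured executor class once, without instantiating it;
# class-level attributes such as is_local can then be checked directly,
# which is cheaper than loading a full executor instance.
executor_class, _ = ExecutorLoader.import_default_executor_cls()
if not executor_class.is_local:
    ...  # react to a non-local executor, as the standalone command does
```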
[airflow] branch main updated (dc3a3c7c52 -> cd637c223f)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from dc3a3c7c52 Add DataprocCancelOperationOperator (#28456) add cd637c223f Fix label name for `reauth` field in Docker Connection (#28974) No new revisions were added by this update. Summary of changes: airflow/providers/docker/hooks/docker.py | 5 - 1 file changed, 4 insertions(+), 1 deletion(-)
[GitHub] [airflow] potiuk merged pull request #28974: Fix label name for `reauth` field in Docker Connection
potiuk merged PR #28974: URL: https://github.com/apache/airflow/pull/28974 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated (d24527bf75 -> dc3a3c7c52)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from d24527bf75 Add documentation about cli `add connection` and AWS connection URI (#28852) add dc3a3c7c52 Add DataprocCancelOperationOperator (#28456) No new revisions were added by this update. Summary of changes: airflow/providers/google/cloud/hooks/dataproc.py | 4 ++ .../providers/google/cloud/operators/dataproc.py | 73 +++-- airflow/providers/google/cloud/sensors/dataproc.py | 76 +- .../operators/cloud/dataproc.rst | 22 +++ .../google/cloud/sensors/test_dataproc.py | 73 - .../cloud/dataproc/example_dataproc_batch.py | 60 - 6 files changed, 298 insertions(+), 10 deletions(-)
[GitHub] [airflow] potiuk merged pull request #28456: Add DataprocCancelOperationOperator
potiuk merged PR #28456: URL: https://github.com/apache/airflow/pull/28456 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #28947: Status of testing of Apache Airflow 2.5.1rc1
potiuk commented on issue #28947: URL: https://github.com/apache/airflow/issues/28947#issuecomment-1384439268 All good - I left one change for @webster-chainalys to confirm :)

* https://github.com/apache/airflow/pull/27720 is difficult to trigger, but this is just exception handling; I checked it is present in the sources, and it's really an edge case
* https://github.com/apache/airflow/pull/28011 works fine - tested it with a REALLY BAD zip file :)
* https://github.com/apache/airflow/pull/28090 - doc change only
* https://github.com/apache/airflow/pull/28094 - doc/docker-compose change only (I checked it is updated)
* https://github.com/apache/airflow/pull/28138 - works, no warning generated
* https://github.com/apache/airflow/pull/28170 - just a structure change for tests
* https://github.com/apache/airflow/pull/28205 - just a structure change for tests
* https://github.com/apache/airflow/pull/28207 - just a structure change for tests
* https://github.com/apache/airflow/pull/28247 - test fix only
* https://github.com/apache/airflow/pull/28283 - I will defer to @webster-chainalysis to test if it works for them
* https://github.com/apache/airflow/pull/28443 - ARM image has plyvel
* https://github.com/apache/airflow/pull/28498 - test-only change
* https://github.com/apache/airflow/pull/28514 https://github.com/apache/airflow/pull/28518 https://github.com/apache/airflow/pull/28519 - CI-only changes
* https://github.com/apache/airflow/pull/28537 - doc change; checked that empty_plugin is there
* https://github.com/apache/airflow/pull/28627 - internal refactor only; seems to detect architecture properly
* https://github.com/apache/airflow/pull/28672 - proper help printed
* https://github.com/apache/airflow/pull/28725 - proper metadata in wheel: Requires-Dist: sqlalchemy (<2.0,>=1.4)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on issue #28975: Additional info about Segmentation Violation in LocalTaskJob (#27381): @Taragolis
Taragolis commented on issue #28975: URL: https://github.com/apache/airflow/issues/28975#issuecomment-1384396913 Oh... it is probably a misclick on "Convert to issue" in the list of tasks, isn't it? @potiuk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] vandonr-amz commented on issue #28938: Status of testing Providers that were prepared on January 14, 2023
vandonr-amz commented on issue #28938: URL: https://github.com/apache/airflow/issues/28938#issuecomment-1384396111 my Sagemaker changes were covered by a system test that's still green, so good to go :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on issue #28975: Additional info about Segmentation Violation in LocalTaskJob (#27381): @Taragolis
Taragolis commented on issue #28975: URL: https://github.com/apache/airflow/issues/28975#issuecomment-1384394131 👀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] johannaojeling commented on pull request #28764: Add support for running a Beam Go pipeline with an executable binary
johannaojeling commented on PR #28764: URL: https://github.com/apache/airflow/pull/28764#issuecomment-1384389915 @uranusjr can this be merged? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
Taragolis commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071479962 ## airflow/providers/amazon/aws/hooks/s3.py: ## @@ -42,6 +42,7 @@ from airflow.exceptions import AirflowException from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook +from airflow.providers.amazon.aws.utils.aws_api import format_tags Review Comment: Maybe a stupid idea, but what if we place it in `airflow/providers/amazon/aws/utils/__init__.py` - or is that for generic stuff only? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] vandonr-amz commented on a diff in pull request #28816: introduce a method to convert dictionaries to boto-style key-value lists
vandonr-amz commented on code in PR #28816: URL: https://github.com/apache/airflow/pull/28816#discussion_r1071478212 ## airflow/providers/amazon/aws/hooks/s3.py: ## @@ -42,6 +42,7 @@ from airflow.exceptions import AirflowException from airflow.providers.amazon.aws.exceptions import S3HookUriParseFailure from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook +from airflow.providers.amazon.aws.utils.aws_api import format_tags Review Comment: I named it boto_ something in my first commit, but then as Taragolis mentioned: > this is not boto3 behaviour this how what AWS API expected this parameters I changed it to aws -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API
potiuk commented on issue #27328: URL: https://github.com/apache/airflow/issues/27328#issuecomment-1384371084 Would you like to fix it, @jtommi? And @uranusjr - we both approved the change. I am starting to think that we should attempt some automated prevention of similar issues - it is all too easy (and seemingly obvious) to perform such validations and conversions in the constructor rather than in the execute methods. And apparently it is easy to get past the approval of both of us, so possibly that's a sign we should summon our CI to prevent such things. Though I am not sure yet how to do it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API
potiuk commented on issue #27328: URL: https://github.com/apache/airflow/issues/27328#issuecomment-1384367659 Yep. It looks like this change https://github.com/apache/airflow/pull/2 broke it (the number indicates it's a bit hell-ish change). The checks are in the constructor but should be in the execute() method. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
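[Editor's note] To illustrate the pattern being discussed, here is a simplified, hypothetical operator (not the actual SFTPOperator code) showing why such checks belong in `execute()` rather than `__init__()`:

```python
from airflow.models.baseoperator import BaseOperator


class ExampleTransferOperator(BaseOperator):
    """Simplified, hypothetical operator - not the real SFTPOperator."""

    def __init__(self, *, local_filepath, remote_filepath, **kwargs):
        super().__init__(**kwargs)
        # Do NOT validate here: at DAG parse time these arguments may still be
        # unresolved XComArg placeholders, so calling len() on them raises
        # "TypeError: object of type 'PlainXComArg' has no len()".
        self.local_filepath = local_filepath
        self.remote_filepath = remote_filepath

    def execute(self, context):
        # Validate here instead: XComArgs have been resolved to concrete
        # values by the time execute() runs.
        if len(self.local_filepath) != len(self.remote_filepath):
            raise ValueError("local and remote filepath lists must be the same length")
```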
[GitHub] [airflow] Romsik788 commented on issue #28941: Tasks are in the queued status after restarting the redis container
Romsik788 commented on issue #28941: URL: https://github.com/apache/airflow/issues/28941#issuecomment-1384367636

> But it's up to you to make it robust and production-ready and resilient to any kind of failures, I am afraid.

Ok, I have deployed Apache Airflow on kubernetes 1.22 using the official [helm chart](https://artifacthub.io/packages/helm/apache-airflow/airflow/1.7.0) with the following settings:

```
elasticsearch:
  enabled: true
workers:
  persistence:
    storageClassName: oci-bv
  podAnnotations:
    log_format: json
redis:
  persistence:
    storageClassName: oci-bv
images:
  useDefaultImageForMigration: true
createUserJob:
  useHelmHooks: false
migrateDatabaseJob:
  useHelmHooks: false
```

But still I get the same result. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] jtommi opened a new issue, #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API
jtommi opened a new issue, #27328: URL: https://github.com/apache/airflow/issues/27328

### Apache Airflow Provider(s)

sftp

### Versions of Apache Airflow Providers

apache-airflow-providers-sftp==4.1.0

### Apache Airflow version

2.4.2 Python 3.10

### Operating System

Debian 11 (Official docker image)

### Deployment

Docker-Compose

### Deployment details

Base image is apache/airflow:2.4.2-python3.10

### What happened

When combining Taskflow API and SFTPOperator, it throws an exception that didn't happen with apache-airflow-providers-sftp 4.0.0

### What you think should happen instead

The DAG should work as expected

### How to reproduce

```python
import pendulum

from airflow import DAG
from airflow.decorators import task
from airflow.providers.sftp.operators.sftp import SFTPOperator

with DAG(
    "example_sftp",
    schedule="@once",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
    tags=["example"],
) as dag:

    @task
    def get_file_path():
        return "test.csv"

    local_filepath = get_file_path()

    upload = SFTPOperator(
        task_id=f"upload_file_to_sftp",
        ssh_conn_id="sftp_connection",
        local_filepath=local_filepath,
        remote_filepath="test.csv",
    )
```

### Anything else

```
[2022-10-27T15:21:38.106+] {logging_mixin.py:120} INFO - [2022-10-27T15:21:38.102+] {dagbag.py:342} ERROR - Failed to import: /opt/airflow/dags/test.py
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/dagbag.py", line 338, in parse
    loader.exec_module(new_module)
  File "", line 883, in exec_module
  File "", line 241, in _call_with_frames_removed
  File "/opt/airflow/dags/test.py", line 21, in
    upload = SFTPOperator(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 408, in apply_defaults
    result = func(self, **kwargs, default_args=default_args)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/sftp/operators/sftp.py", line 116, in __init__
    if len(self.local_filepath) != len(self.remote_filepath):
TypeError: object of type 'PlainXComArg' has no len()
```

It looks like the offending code was introduced in commit 5f073e38dd46217b64dbc16d7b1055d89e8c3459

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated: Add documentation about cli `add connection` and AWS connection URI (#28852)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new d24527bf75 Add documentation about cli `add connection` and AWS connection URI (#28852) d24527bf75 is described below commit d24527bf759c80dd22684a0fb51c283bbafb9298 Author: Arkadiusz Rudny <93520526+aru-tracku...@users.noreply.github.com> AuthorDate: Mon Jan 16 18:26:15 2023 +0100 Add documentation about cli `add connection` and AWS connection URI (#28852) --- .../connections/aws.rst | 12 .../logging/s3-task-handler.rst | 20 ++-- 2 files changed, 26 insertions(+), 6 deletions(-) diff --git a/docs/apache-airflow-providers-amazon/connections/aws.rst b/docs/apache-airflow-providers-amazon/connections/aws.rst index 14956d1de1..e885887980 100644 --- a/docs/apache-airflow-providers-amazon/connections/aws.rst +++ b/docs/apache-airflow-providers-amazon/connections/aws.rst @@ -146,6 +146,18 @@ Snippet to create Connection and convert to URI os.environ[env_key] = conn_uri print(conn.test_connection()) + + .. warning:: When using the Airflow CLI, a ``@`` may need to be added when: + +- login +- password +- host +- port + +are not given, see example below. This is a known airflow limitation. + +``airflow connections add aws_conn --conn-uri aws://@/?region_name=eu-west-1`` + Using instance profile ^^ diff --git a/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst b/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst index 8352432c22..016c2d5165 100644 --- a/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst +++ b/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst @@ -113,17 +113,25 @@ We are using the existing ``serviceAccount`` hence ``create: false`` with exist delete_worker_pods: 'False' encrypt_s3_logs: 'True' -Step3: Create Amazon Web Services connection in Airflow Web UI +Step3: Create Amazon Web Services connection ~~ With the above configurations, Webserver and Worker Pods can access Amazon S3 bucket and write logs without using any Access Key and Secret Key or Instance profile credentials. -The final step to create connections under Airflow UI before executing the DAGs. +- Using Airflow Web UI -* Login to Airflow Web UI with ``admin`` credentials and Navigate to ``Admin -> Connections`` -* Create connection for ``Amazon Web Services`` and select the options(Connection ID and Connection Type) as shown in the image. -* Select the correct region where S3 bucket is created in ``Extra`` text box. + The final step to create connections under Airflow UI before executing the DAGs. -.. image:: /img/aws-base-conn-airflow.png + * Login to Airflow Web UI with ``admin`` credentials and Navigate to ``Admin -> Connections`` + * Create connection for ``Amazon Web Services`` and select the options (Connection ID and Connection Type) as shown in the image. + * Select the correct region where S3 bucket is created in ``Extra`` text box. + + .. image:: /img/aws-base-conn-airflow.png + +- Using Airflow CLI + + ``airflow connections add aws_conn --conn-uri aws://@/?egion_name=eu-west-1`` + + Note that ``@`` used in ``-conn-uri`` parameter usually separates password and host but in this case it complies with uri validator used. Step4: Verify the logs ~~
[GitHub] [airflow] potiuk merged pull request #28852: Add documentation about cli `add connection` and AWS connection URI
potiuk merged PR #28852: URL: https://github.com/apache/airflow/pull/28852 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #28852: Add documentation about cli `add connection` and AWS connection URI
boring-cyborg[bot] commented on PR #28852: URL: https://github.com/apache/airflow/pull/28852#issuecomment-1384362745 Awesome work, congrats on your first merged pull request! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] rafalh commented on a diff in pull request #28970: Fix support for macros with dots in DataProcJobBuilder
rafalh commented on code in PR #28970: URL: https://github.com/apache/airflow/pull/28970#discussion_r1071454703 ## tests/providers/google/cloud/hooks/test_dataproc.py: ## @@ -1006,17 +1006,9 @@ def test_set_python_main(self): self.builder.set_python_main(main) assert main == self.builder.job["job"][self.job_type]["main_python_file_uri"] -@pytest.mark.parametrize( -"job_name", -[ -pytest.param("name", id="simple"), -pytest.param("name_with_dash", id="name with underscores"), -pytest.param("group.name", id="name with dot"), -pytest.param("group.name_with_dash", id="name with dot and underscores"), -], -) Review Comment: They were added with sanitization (dot substitution) in `DataProcJobBuilder`. Now that I removed sanitization from that class all test cases with dots would fail. I just reverted this test to the state before the changes in `DataProcJobBuilder` and added a test that checks if proper job_id is passed from the operator instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated: Mark license block in doc as text (#28965)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 4fff0555b5 Mark license block in doc as text (#28965) 4fff0555b5 is described below commit 4fff0555b505f7a51efb2bbd108378780eb850d9 Author: Tzu-ping Chung AuthorDate: Tue Jan 17 01:07:30 2023 +0800 Mark license block in doc as text (#28965) Sphinx was helpfully (but incorrectly) adding Python syntax highlighting to the block. This marks the block explicitly as pure text to avoid that. --- docs/apache-airflow/license.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/apache-airflow/license.rst b/docs/apache-airflow/license.rst index ceec3e78b6..e5bb13e36b 100644 --- a/docs/apache-airflow/license.rst +++ b/docs/apache-airflow/license.rst @@ -23,7 +23,7 @@ License .. image:: img/apache.jpg :width: 150 -:: +.. code-block:: text Apache License Version 2.0, January 2004
[GitHub] [airflow] potiuk merged pull request #28965: Mark license block in doc as text
potiuk merged PR #28965: URL: https://github.com/apache/airflow/pull/28965 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #28965: Mark license block in doc as text
potiuk commented on PR #28965: URL: https://github.com/apache/airflow/pull/28965#issuecomment-1384337521 I wish there was `::code-block:: licence` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #28780: Fix hot loop with a burst of log messages (#28778)
potiuk merged PR #28780: URL: https://github.com/apache/airflow/pull/28780 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk closed issue #28778: Script "clean-logs.sh" has an unexpected burst behavior
potiuk closed issue #28778: Script "clean-logs.sh" has an unexpected burst behavior URL: https://github.com/apache/airflow/issues/28778 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #28780: Fix hot loop with a burst of log messages (#28778)
boring-cyborg[bot] commented on PR #28780: URL: https://github.com/apache/airflow/pull/28780#issuecomment-1384335195 Awesome work, congrats on your first merged pull request! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated: Fix hot loop with a burst of log messages (#28778) (#28780)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 4b1a36f833 Fix hot loop with a burst of log messages (#28778) (#28780) 4b1a36f833 is described below commit 4b1a36f833b77d3f0bec78958d1fb9f360b7b11b Author: Ivan Reche AuthorDate: Mon Jan 16 14:05:34 2023 -0300 Fix hot loop with a burst of log messages (#28778) (#28780) * Fix hot loop with a burst of log messages (#28778) * Add missing inline in Dockerfile --- Dockerfile | 3 ++- scripts/docker/clean-logs.sh | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/Dockerfile b/Dockerfile index 42b9db931f..74634887a9 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1025,7 +1025,8 @@ while true; do xargs -0 rm -f seconds=$(( $(date -u +%s) % EVERY)) - (( seconds < 1 )) || sleep $((EVERY - seconds)) + (( seconds < 1 )) || sleep $((EVERY - seconds - 1)) + sleep 1 done EOF diff --git a/scripts/docker/clean-logs.sh b/scripts/docker/clean-logs.sh index 57b6e8b605..0c775b14e6 100644 --- a/scripts/docker/clean-logs.sh +++ b/scripts/docker/clean-logs.sh @@ -36,5 +36,6 @@ while true; do xargs -0 rm -f seconds=$(( $(date -u +%s) % EVERY)) - (( seconds < 1 )) || sleep $((EVERY - seconds)) + (( seconds < 1 )) || sleep $((EVERY - seconds - 1)) + sleep 1 done
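[Editor's note] For readers skimming the diff above, here is a commented sketch of the fixed loop; the `EVERY` value is assumed for illustration, and the surrounding cleanup logic is elided:

```bash
#!/usr/bin/env bash
EVERY=900  # assumed interval in seconds; the real script derives its own value

while true; do
  # ... find and delete old log files ...
  seconds=$(( $(date -u +%s) % EVERY ))
  # Before the fix, `sleep $((EVERY - seconds))` was skipped entirely when
  # seconds < 1, so the loop could spin within the same second and emit a
  # burst of messages (the hot loop reported in #28778).
  (( seconds < 1 )) || sleep $((EVERY - seconds - 1))
  sleep 1  # always pause at least one second so the loop can never run hot
done
```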
[GitHub] [airflow] csm10495 commented on pull request #28808: Allow setting the name for the base container within K8s Pod Operator
csm10495 commented on PR #28808: URL: https://github.com/apache/airflow/pull/28808#issuecomment-1384319670 @eladkal can you check again? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org