[GitHub] [airflow] IDeepakI opened a new issue, #27654: Airflow webserver not starting when provided external DB
IDeepakI opened a new issue, #27654: URL: https://github.com/apache/airflow/issues/27654

### Official Helm Chart version

1.7.0 (latest released)

### Apache Airflow version

2.3.1

### Kubernetes Version

1.23.10

### Helm Chart configuration

```
webserverSecretKey: 'ad2126ad-95c5-4050-b785-218f84b4bafa'
defaultAirflowRepository: 'airflow'
defaultAirflowTag: 'test'
images:
  airflow:
    pullPolicy: Never
webserver:
  service:
    type: NodePort
  resources:
    limits:
      cpu: 500m
      memory: 1028Mi
config:
  core:
    enable_xcom_pickling: True
pgbouncer:
  # The maximum number of connections to PgBouncer
  maxClientConn: 100
  # The maximum number of server connections to the metadata database from PgBouncer
  metadataPoolSize: 10
  # The maximum number of server connections to the result backend database from PgBouncer
  resultBackendPoolSize: 5
  enabled: true
redis:
  enabled: false
data:
  brokerUrl: redis://redis.test.org:6379/0
postgresql:
  enabled: False
enableBuiltInSecretEnvVars:
  AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: False
  AIRFLOW__CELERY__RESULT_BACKEND: False
secret:
  - envName: "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
    secretName: "airflow-metadata"
    secretKey: "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
  - envName: "AIRFLOW__CELERY__RESULT_BACKEND"
    secretName: "airflow-metadata"
    secretKey: "AIRFLOW__CELERY__RESULT_BACKEND"
extraEnvFrom: |-
  - configMapRef:
      name: 'airflow-variables
```

### Docker Image customisations

_No response_

### What happened

I was running a simple Airflow deployment on Kubernetes using Helm with the configuration specified above, but the webserver does not get initialized.
Here are the Kubernetes pods:

```
NAME                                   READY   STATUS      RESTARTS        AGE
airflow-pgbouncer-db449d6b-98ldf       2/2     Running     0               49m
airflow-run-airflow-migrations-6bdwg   0/1     Completed   0               49m
airflow-scheduler-9f78574d6-cq8dc      2/2     Running     1 (9m42s ago)   49m
airflow-statsd-7c7584d6f8-6ldmr        1/1     Running     0               49m
airflow-triggerer-6b4c678799-m6pg5     1/1     Running     4 (4m32s ago)   49m
airflow-webserver-5575454d74-tc9zm     0/1     Running     5 (94s ago)     49m
airflow-worker-0                       2/2     Running     0               49m
```

Here are the webserver logs:

```
[2022-11-14 07:30:49,068] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,376] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,377] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,377] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:31:01,429] {manager.py:508} INFO - Created Permission View: menu access on Permissions
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL: Key (permission_id, view_menu_id)=(5, 15) already exists.
[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL: Key (permission_id, view_menu_id)=(5, 15) already exists.
[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL: Key (permission_id, view_menu_id)=(5, 15) already exists.
[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at:
[GitHub] [airflow] Bowrna commented on issue #27614: Double execution of failure callback for task
Bowrna commented on issue #27614: URL: https://github.com/apache/airflow/issues/27614#issuecomment-1313193827 @potiuk It would be useful if someone could shed light on this. The failure callback is executed once as part of the `taskinstance` callback and a second time as part of `DagFileProcessorProcess`. Is the callback expected to be executed from both of these places? I will go back and debug this further and share more details in this thread. I will also check whether the success callback is executed via `DagFileProcessorProcess` too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Bowrna commented on issue #27614: Double execution of failure callback for task
Bowrna commented on issue #27614: URL: https://github.com/apache/airflow/issues/27614#issuecomment-1313184795 > https://gist.github.com/Bowrna/1994894beea39fa8e1c269591b7f0346#file-airflow_local_settings-py-L120 @tirkarthi Yes you are right. Thanks for pointing out this mistake
[GitHub] [airflow] vksunilk commented on pull request #26986: Persist DataprocLink for workflow operators regardless of job status
vksunilk commented on PR #26986: URL: https://github.com/apache/airflow/pull/26986#issuecomment-1313163186

> Nope :(. Static checks are failing. I recommend you install pre-commit. It will correct the errors for you automatically.

Done now. Pre-commit helped!
[GitHub] [airflow] dstandish commented on a diff in pull request #24652: Add @task.snowpark decorator
dstandish commented on code in PR #24652: URL: https://github.com/apache/airflow/pull/24652#discussion_r1021115584 ## tests/system/providers/snowflake/example_snowflake_snowpark.py: ## @@ -0,0 +1,101 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +""" +Example use of Snowflake Snowpark related decorator. +""" +from __future__ import annotations + +import sys + +import pytest + +if not sys.version_info[0:2] == (3, 8): +pytest.skip("unsupported python version", allow_module_level=True) +from datetime import datetime +from random import uniform + +import snowflake.snowpark +from snowflake.snowpark.functions import col, count, lit, random as spRandom, sproc, uniform as spUniform +from snowflake.snowpark.types import FloatType + +from airflow import DAG, AirflowException +from airflow.decorators import task + +SNOWFLAKE_CONN_ID = "my_snowflake_conn" +DAG_ID = "example_snowflake_snowpark" + +with DAG( +DAG_ID, +start_date=datetime(2021, 1, 1), +schedule_interval="@once", Review Comment: ```suggestion schedule="@once", ``` ## airflow/providers/snowflake/decorators/snowpark.py: ## @@ -0,0 +1,154 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. 
See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +from typing import TYPE_CHECKING, Callable, Sequence + +from airflow.decorators.base import DecoratedOperator, task_decorator_factory +from airflow.exceptions import AirflowException +from airflow.operators.python import PythonOperator +from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook + +if TYPE_CHECKING: +from airflow.decorators.base import TaskDecorator + +try: +import snowflake.snowpark # noqa +except ImportError: +raise AirflowException( +"The snowflake-snowpark-python package is not installed. Make sure you are using Python 3.8." +) + + +class _SnowparkDecoratedOperator(DecoratedOperator, PythonOperator): +""" +Wraps a Python callable and captures args/kwargs when called for execution. + +:param snowflake_conn_id: Reference to +:ref:`Snowflake connection id` +:param parameters: (optional) the parameters to render the SQL query with. 
+:param warehouse: name of warehouse (will overwrite any warehouse defined in the connection's extra JSON) +:param database: name of database (will overwrite database defined in connection) +:param schema: name of schema (will overwrite schema defined in connection) +:param role: name of role (will overwrite any role defined in connection's extra JSON) +:param authenticator: authenticator for Snowflake. +'snowflake' (default) to use the internal Snowflake authenticator +'externalbrowser' to authenticate using your web browser and +Okta, ADFS or any other SAML 2.0-compliant identify provider +(IdP) that has been defined for your account +'https://.okta.com' to authenticate +through native Okta. +:param session_parameters: You can set session-level parameters at the time you connect to Snowflake +:param python_callable: A reference to an object that is callable +:param op_kwargs: a dictionary of keyword arguments that will get unpacked in your function (templated) +:param op_args: a list of positional arguments that will get unpacked when +calling your callable (templated) +:param multiple_outputs: if set, function return value will be +unrolled to multiple XCom values.
[GitHub] [airflow] hterik commented on pull request #26710: Add config to control Kubernetes Client retry behaviour
hterik commented on PR #26710: URL: https://github.com/apache/airflow/pull/26710#issuecomment-1313158050 stale ping
[GitHub] [airflow] JulesTriomphe commented on pull request #27420: ExtraVolumeMounts in sidecars and initContainers
JulesTriomphe commented on PR #27420: URL: https://github.com/apache/airflow/pull/27420#issuecomment-1313134461 Thank you for following the PR and merging it @potiuk !
[GitHub] [airflow] mik-laj commented on pull request #24652: Add @task.snowpark decorator
mik-laj commented on PR #24652: URL: https://github.com/apache/airflow/pull/24652#issuecomment-1313111826

@uranusjr @potiuk It is ready for review. I solved all the problems with installing the snowflake-snowpark-python package in Docker.

@uranusjr The snowflake-snowpark-python package is now required for apache-airflow-providers-snowflake, but it is only installed for Python 3.8, to simplify installation.

@potiuk I updated the `EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS` env var.
[GitHub] [airflow] mik-laj commented on a diff in pull request #24652: Add @task.snowpark decorator
mik-laj commented on code in PR #24652: URL: https://github.com/apache/airflow/pull/24652#discussion_r1021081382

## Dockerfile: ##
@@ -1234,8 +1234,10 @@ ARG ADDITIONAL_PYTHON_DEPS=""
 # https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
 # * authlib, gcloud_aio_auth, adal are needed to generate constraints for PyPI packages and can be removed after we release
 # new google, azure providers
+# * cloudpickle==2.0.0 is required by snowfalke-snowpark-python, which conflicts with new versions of apache beam, so
+# we pin pinning apache-beam also to the latest version which uses cloudpickle==2.0.0.

Review Comment: Fixed.
[GitHub] [airflow] mik-laj commented on a diff in pull request #24652: Add @task.snowpark decorator
mik-laj commented on code in PR #24652: URL: https://github.com/apache/airflow/pull/24652#discussion_r1021081277

## airflow/providers/snowflake/decorators/snowpark.py: ##
@@ -0,0 +1,143 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import TYPE_CHECKING, Callable, Dict, Optional, Sequence
+
+from airflow.decorators.base import DecoratedOperator, task_decorator_factory
+from airflow.operators.python import PythonOperator
+from airflow.providers.snowflake.operators.snowflake import get_db_hook
+
+if TYPE_CHECKING:
+    import snowflake
+
+    from airflow.decorators.base import TaskDecorator
+
+try:
+    import snowflake.snowpark # noqa
+except ImportError as e:
+    # Snowpark is an optional feature and if imports are missing, it should be silently ignored
+    # As of Airflow 2.3 and above the operator can throw OptionalProviderFeatureException

Review Comment: I made this package a required dependency that is installed only for Python 3.8, because having an extra package installed only in the Python 3.8 Docker image would be difficult to integrate with our Docker image build.
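The pattern discussed in this review thread, an import that is only expected to work on one Python version, can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: the helper name `optional_module_available` and its parameters are hypothetical, and only `sys.version_info` and `importlib.import_module` are real APIs.

```python
import importlib
import sys


def optional_module_available(module_name, required_python):
    """Return True only if we run on `required_python` and the module imports.

    `required_python` is a (major, minor) tuple such as (3, 8).
    Hypothetical helper for illustration; not part of Airflow.
    """
    if sys.version_info[:2] != tuple(required_python):
        # Wrong interpreter version: do not even attempt the import.
        return False
    try:
        importlib.import_module(module_name)
    except ImportError:
        # The package is not installed (or one of its own imports failed).
        return False
    return True


if __name__ == "__main__":
    # The module name mirrors the import in the diff above.
    print(optional_module_available("snowflake.snowpark", (3, 8)))
```

A caller could use the boolean to skip registering the decorator, or to raise a clearer error than a bare `ImportError`.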
[GitHub] [airflow] notatallshaw-gts commented on pull request #27111: Update SLA wording to reflect it is relative to Dag Run start
notatallshaw-gts commented on PR #27111: URL: https://github.com/apache/airflow/pull/27111#issuecomment-1313106307

> With my Airflow (1.14.15), the SLA miss was not detected in your scenario (#26566) until the second DAG run.
>
> I might have found the reason. Airflow 1.14.15 does check for the SLA one interval ahead of each task, but it does so for SUCCESS/SKIPPED tasks only, not the currently running tasks. When a new run is triggered, the SUCCESS/SKIPPED tasks from the _previous_ run will be one interval behind, so looking one interval ahead works for these tasks. That’s when the SLA miss is triggered. Definitely later than I expected.

Do you mean 1.10.15?

Regardless, this fits with the experience of SLAs I had in the Airflow 1.x era: they came much later than I expected, and I didn't find them useful.

Fortunately, I currently only have to work with Airflow 2.x instances. I have been setting up a lot of SLAs recently, and my practical experience is that they work as I've described in this PR.
[GitHub] [airflow] uranusjr commented on a diff in pull request #24652: Add @task.snowpark decorator
uranusjr commented on code in PR #24652: URL: https://github.com/apache/airflow/pull/24652#discussion_r1021049915

## Dockerfile: ##
@@ -1234,8 +1234,10 @@ ARG ADDITIONAL_PYTHON_DEPS=""
 # https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
 # * authlib, gcloud_aio_auth, adal are needed to generate constraints for PyPI packages and can be removed after we release
 # new google, azure providers
+# * cloudpickle==2.0.0 is required by snowfalke-snowpark-python, which conflicts with new versions of apache beam, so
+# we pin pinning apache-beam also to the latest version which uses cloudpickle==2.0.0.

Review Comment:
```suggestion
# we pin apache-beam also to the latest version which uses cloudpickle==2.0.0.
```
typo
[GitHub] [airflow] uranusjr commented on a diff in pull request #27634: Correct job name matching in SagemakerProcessingOperator
uranusjr commented on code in PR #27634: URL: https://github.com/apache/airflow/pull/27634#discussion_r1021048895

## airflow/providers/amazon/aws/hooks/sagemaker.py: ##
@@ -954,33 +955,49 @@ def find_processing_job_by_name(self, processing_job_name: str) -> bool:
         )
         return bool(self.count_processing_jobs_by_name(processing_job_name))

+    @staticmethod
+    def _name_matches_pattern(
+        processing_job_name: str,
+        found_name: str,
+        job_name_suffix: str | None = None,
+    ) -> bool:
+        pattern = re.compile(f"^{processing_job_name}({job_name_suffix})?$")
+        return bool(pattern.search(found_name))

Review Comment:
```suggestion
        return pattern.fullmatch(found_name) is not None
```
Faster since this is a full string match.
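The suggestion above can be illustrated with a small standalone sketch of `re.fullmatch`. It is not the code from the PR: the function name `name_matches` is hypothetical, and escaping the base name with `re.escape` and skipping the suffix group when no suffix is given are assumptions added here for the example.

```python
import re
from typing import Optional


def name_matches(base_name: str, found_name: str, suffix_pattern: Optional[str] = None) -> bool:
    """Return True when found_name is base_name, optionally followed by the suffix.

    Hypothetical helper for illustration; `suffix_pattern` is a regex fragment.
    """
    # Assumption for this sketch: treat the job name literally, not as a regex.
    pattern = re.escape(base_name)
    if suffix_pattern is not None:
        # Optional non-capturing group for the suffix.
        pattern += f"(?:{suffix_pattern})?"
    # fullmatch only succeeds if the whole string matches,
    # so no explicit ^ or $ anchors are needed.
    return re.fullmatch(pattern, found_name) is not None


if __name__ == "__main__":
    print(name_matches("job", "job-2", r"-\d+"))
    print(name_matches("job", "my-job"))
```

Comparing the result with `is not None` returns a plain `bool` without going through `bool()` on the match object, which is the shape the review comment proposes.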
[GitHub] [airflow] uranusjr commented on pull request #27651: Fix static check coming from merging pre-normalization change
uranusjr commented on PR #27651: URL: https://github.com/apache/airflow/pull/27651#issuecomment-1313049879 Too trivial, not going to wait for CI
[airflow] branch main updated (150dd927c3 -> f564d650da)
This is an automated email from the ASF dual-hosted git repository. uranusjr pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from 150dd927c3 Filter out invalid schemas in Hive hook (#27647) add f564d650da Fix static check coming from merging pre-normalization change (#27651) No new revisions were added by this update. Summary of changes: airflow/dag_processing/manager.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [airflow] uranusjr merged pull request #27651: Fix static check coming from merging pre-normalization change
uranusjr merged PR #27651: URL: https://github.com/apache/airflow/pull/27651
[GitHub] [airflow] uranusjr commented on a diff in pull request #27640: Added exclude_microseconds to cli
uranusjr commented on code in PR #27640: URL: https://github.com/apache/airflow/pull/27640#discussion_r1021046613

## airflow/cli/cli_parser.py: ##
@@ -433,6 +445,9 @@ def string_lower_type(val):
 ARG_RUN_ID = Arg(("-r", "--run-id"), help="Helps to identify this run")
 ARG_CONF = Arg(("-c", "--conf"), help="JSON string that gets pickled into the DagRun's conf attribute")
 ARG_EXEC_DATE = Arg(("-e", "--exec-date"), help="The execution date of the DAG", type=parsedate)
+ARG_REPLACE_MICRO = Arg(
+    ("--replace-microseconds",), help="whether microseconds should be zeroed", default=True, type=bool_type
+)

Review Comment: The `type=bool_type` part is not idiomatic; you likely want to do `--no-replace-microseconds` and `action="store_true"`.
[GitHub] [airflow] uranusjr commented on a diff in pull request #27640: Added exclude_microseconds to cli
uranusjr commented on code in PR #27640: URL: https://github.com/apache/airflow/pull/27640#discussion_r1021046613

## airflow/cli/cli_parser.py: ##
@@ -433,6 +445,9 @@ def string_lower_type(val):
 ARG_RUN_ID = Arg(("-r", "--run-id"), help="Helps to identify this run")
 ARG_CONF = Arg(("-c", "--conf"), help="JSON string that gets pickled into the DagRun's conf attribute")
 ARG_EXEC_DATE = Arg(("-e", "--exec-date"), help="The execution date of the DAG", type=parsedate)
+ARG_REPLACE_MICRO = Arg(
+    ("--replace-microseconds",), help="whether microseconds should be zeroed", default=True, type=bool_type
+)

Review Comment: The `type=bool_type` part is wrong; you likely want `action="store_true"`.
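The flag style recommended in the review comment can be sketched with plain `argparse`. This is a minimal illustration under stated assumptions: it does not use Airflow's `Arg` wrapper, and the program name `trigger-sketch` and the helper functions are hypothetical. Only `argparse.ArgumentParser`, `add_argument`, and `action="store_true"` are real APIs.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a negative on/off switch (hypothetical example)."""
    parser = argparse.ArgumentParser(prog="trigger-sketch")
    parser.add_argument(
        "--no-replace-microseconds",
        action="store_true",  # the flag is False unless it is present
        help="keep microseconds instead of zeroing them",
    )
    return parser


def replace_microseconds(argv) -> bool:
    """Return whether microseconds should be zeroed for the given arguments."""
    args = build_parser().parse_args(argv)
    # Dashes in the option name become underscores in the attribute name.
    return not args.no_replace_microseconds


if __name__ == "__main__":
    print(replace_microseconds([]))
    print(replace_microseconds(["--no-replace-microseconds"]))
```

With this shape the default behaviour (zeroing microseconds) needs no argument at all, and the user never has to type a literal `true` or `false` on the command line.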
[GitHub] [airflow] potiuk opened a new pull request, #27651: Fix static check coming from merging pre-normalization change
potiuk opened a new pull request, #27651: URL: https://github.com/apache/airflow/pull/27651 PR #27060 was built before normalisation was applied, and merging it caused a static-check failure.
[GitHub] [airflow] potiuk commented on issue #24742: Import errors thrown by *new* DAGs are hidden from RBAC users' web ui
potiuk commented on issue #24742: URL: https://github.com/apache/airflow/issues/24742#issuecomment-1313033873

The answer is pretty much always the same: it will be implemented when someone implements it. This is how open-source software works. If someone badly needs a feature in an OSS project and no one is actually working on it, they can do various things:

* implement it themselves
* find someone who will implement it
* they or their company can sponsor someone who might take an interest in it
* wait until someone implements it
* implement it in their own fork and possibly contribute it back
* you (or someone from your company, or even someone paid by your company) can start a discussion on the devlist and explain why you need it. Because this problem goes really deep into the core of Airflow, it will likely require someone to come up with ideas for how to do what you want, discuss them, and bring the discussion to a conclusion (and eventually, likely, a vote on an AIP, Airflow Improvement Proposal).

I think those are some of the ways you can make these things progress.
[airflow] branch main updated (449a9b8e53 -> 150dd927c3)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from 449a9b8e53 bump alembic minimum version (#27629) add 150dd927c3 Filter out invalid schemas in Hive hook (#27647) No new revisions were added by this update. Summary of changes: airflow/providers/apache/hive/hooks/hive.py| 2 ++ tests/providers/apache/hive/__init__.py| 15 +++ tests/providers/apache/hive/hooks/test_hive.py | 6 ++ 3 files changed, 23 insertions(+)
[GitHub] [airflow] potiuk merged pull request #27647: Filter out invalid schemas in Hive hook
potiuk merged PR #27647: URL: https://github.com/apache/airflow/pull/27647
[GitHub] [airflow] potiuk merged pull request #27629: Bump ```alembic``` minimum version
potiuk merged PR #27629: URL: https://github.com/apache/airflow/pull/27629
[airflow] branch main updated: bump alembic minimum version (#27629)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 449a9b8e53 bump alembic minimum version (#27629) 449a9b8e53 is described below commit 449a9b8e53e3647a2423ccf033b06381ee39f41b Author: Pankaj Singh AuthorDate: Mon Nov 14 08:19:36 2022 +0530 bump alembic minimum version (#27629) --- setup.cfg | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/setup.cfg b/setup.cfg index 3c085f72e6..59e6686ab5 100644 --- a/setup.cfg +++ b/setup.cfg @@ -65,7 +65,7 @@ setup_requires = install_requires = # Alembic is important to handle our migrations in predictable and performant way. It is developed # together with SQLAlchemy. Our experience with Alembic is that it very stable in minor version -alembic>=1.5.1, <2.0 +alembic>=1.6.3, <2.0 argcomplete>=1.10 attrs>=22.1.0 blinker
[GitHub] [airflow] potiuk commented on pull request #27629: Bump ```alembic``` minimum version
potiuk commented on PR #27629: URL: https://github.com/apache/airflow/pull/27629#issuecomment-1312965089 Thanks @aa3pankaj for figuring that one out :).
[GitHub] [airflow] potiuk commented on pull request #27629: Bump ```alembic``` minimum version
potiuk commented on PR #27629: URL: https://github.com/apache/airflow/pull/27629#issuecomment-1312964879

> It’s likely not worthwhile to debug too deep into this, so I think this is good enough. We can always review and revise the version range if issues come up.

Correct. We have a number of "lower bounds" and, unlike "upper bounds", we have no need to explain the reasons for a lower bound in detail or even dig deeply into them, as long as the "lower-bound" version is "old enough". It might be interesting to know, but it has mostly academic value: with every day the reason for that lower bound is farther in the past and becomes less and less relevant (as opposed to an upper bound, which becomes more and more relevant and annoying every day we keep it).
[GitHub] [airflow] uranusjr commented on a diff in pull request #26658: Clear TaskGroup
uranusjr commented on code in PR #26658:
URL: https://github.com/apache/airflow/pull/26658#discussion_r1021029012

## airflow/www/views.py:
##
@@ -2025,12 +2029,14 @@ def _clear_dag_tis(
         ]
     )
     @action_logging
-    def clear(self):
-        """Clears the Dag."""
+    @provide_session
+    def clear(self, session: Session = NEW_SESSION):

Review Comment:
```suggestion
    def clear(self, *, session: Session = NEW_SESSION):
```
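The suggestion above makes `session` keyword-only by placing a bare `*` before it. A minimal sketch of the effect, assuming a toy `provide_session` decorator and a hypothetical `FakeSession` class; neither is Airflow's real implementation, they only mimic the shape of the code under review:

```python
import functools

NEW_SESSION = None  # placeholder default standing in for Airflow's sentinel


class FakeSession:
    """Hypothetical stand-in for a database session."""


def provide_session(func):
    """Toy decorator: create a session when the caller did not pass one."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if kwargs.get("session") is None:
            kwargs["session"] = FakeSession()
        return func(*args, **kwargs)
    return wrapper


@provide_session
def clear(dag_id, *, session=NEW_SESSION):
    # `session` can only be supplied by keyword, so a stray positional
    # argument can never be mistaken for it.
    return dag_id, type(session).__name__


print(clear("example_dag"))  # session injected by the decorator

try:
    clear("example_dag", FakeSession())  # positional session is rejected
except TypeError as exc:
    print("TypeError:", exc)
```

With the bare `*`, a caller has to write `session=...` explicitly, which keeps call sites unambiguous once the function gains more parameters.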
[GitHub] [airflow] uranusjr commented on a diff in pull request #26658: Clear TaskGroup
uranusjr commented on code in PR #26658:
URL: https://github.com/apache/airflow/pull/26658#discussion_r1021028895

## airflow/www/views.py:
##
@@ -2047,14 +2053,42 @@ def clear(self):
         recursive = request.form.get("recursive") == "true"
         only_failed = request.form.get("only_failed") == "true"
 
-        task_ids: list[str | tuple[str, int]]
-        if map_indexes is None:
-            task_ids = [task_id]
-        else:
-            task_ids = [(task_id, map_index) for map_index in map_indexes]
+        task_ids: list[str | tuple[str, int]] = []
+
+        end_date = execution_date if not future else None
+        start_date = execution_date if not past else None
+
+        if group_id is not None:
+            task_group_dict = dag.task_group.get_task_group_dict()
+            task_group = task_group_dict.get(group_id)
+            if task_group is None:
+                return redirect_or_json(
+                    origin, msg=f"TaskGroup {group_id} could not be found", status="error", status_code=404
+                )
+            task_ids = task_ids_or_regex = [t.task_id for t in task_group.iter_tasks()]
+
+            # Lock the related dag runs to prevent from possible dead lock.
+            # https://github.com/apache/airflow/pull/26658
+            dag_runs_query = session.query(DagRun.id).filter(DagRun.dag_id == dag_id).with_for_update()
+            if start_date is None and end_date is None:
+                dag_runs_query = dag_runs_query.filter(DagRun.execution_date == start_date)
+            else:
+                if start_date is not None:
+                    dag_runs_query = dag_runs_query.filter(DagRun.execution_date >= start_date)
+
+                if end_date is not None:
+                    dag_runs_query = dag_runs_query.filter(DagRun.execution_date <= end_date)
+
+            _ = dag_runs_query.all()

Review Comment:
```suggestion
            locked_dag_run_ids = dag_runs_query.all()
```
And `del` the variable explicitly when we can free up those rows.
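The diff above derives an execution-date window from the `past` and `future` flags: a bound is dropped (set to `None`) when the corresponding flag is set. A minimal, database-free sketch of that selection logic, assuming plain `datetime` values; `clear_window` and `select_run_dates` are hypothetical helpers, not Airflow code. The sketch treats two missing bounds as "no restriction", which is an assumption of this example rather than what the diff does:

```python
from __future__ import annotations

from datetime import datetime


def clear_window(execution_date, *, past, future):
    """Return (start_date, end_date); None means that side is unbounded."""
    start_date = execution_date if not past else None
    end_date = execution_date if not future else None
    return start_date, end_date


def select_run_dates(run_dates, start_date, end_date):
    """Keep the dates inside the window (hypothetical helper, not Airflow code)."""
    return [
        d for d in run_dates
        if (start_date is None or d >= start_date)
        and (end_date is None or d <= end_date)
    ]


runs = [datetime(2022, 11, day) for day in (12, 13, 14, 15)]
target = datetime(2022, 11, 14)

# past=True drops the lower bound, so every run up to the target is selected.
start, end = clear_window(target, past=True, future=False)
print(select_run_dates(runs, start, end))
```

In the reviewed code the same bounds are applied as filters on a `with_for_update()` query, so the selected rows stay locked until the transaction ends.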
[GitHub] [airflow] uranusjr commented on a diff in pull request #26658: Clear TaskGroup
uranusjr commented on code in PR #26658:
URL: https://github.com/apache/airflow/pull/26658#discussion_r1021027532

## airflow/www/views.py:
##
@@ -1966,6 +1966,7 @@ def trigger(self, session=None):
         flash(f"Triggered {dag_id}, it should start any moment now.")
         return redirect(origin)
 
+    @provide_session

Review Comment:
Since this is an internal function, we should be able to control all cases where `session` is needed, and explicitly pass it. This decorator is thus redundant and should be removed. Also, can we use this change to add type hints to all arguments?
[GitHub] [airflow] potiuk closed issue #27577: Allow for templating of pod-template.yaml at service start time or pod creation
potiuk closed issue #27577: Allow for templating of pod-template.yaml at service start time or pod creation URL: https://github.com/apache/airflow/issues/27577
[GitHub] [airflow] potiuk commented on issue #27577: Allow for templating of pod-template.yaml at service start time or pod creation
potiuk commented on issue #27577: URL: https://github.com/apache/airflow/issues/27577#issuecomment-131293 You have not explained "WHEN" the pod_template would change. From what I understand, you want to do it at service deployment time. And it seems this is what you want to do - either at service deployment time or based on some manually triggered event (which would be some kind of event reconfiguring the whole cluster). What you really need is a solution for your deployment, not an "airflow feature". What I think you should do is simply use whatever k8s already provides - you are really talking about a config map that will be mounted as your pod_template_file: https://kubernetes.io/docs/concepts/storage/volumes/#configmap You can mount this config map as a volume/folder available to all pods that need it, and then you can use the absolutely standard k8s mechanism to apply any changes to it (kubectl apply), including pre-processing it via jinja as you wish (you do not need sed, you can simply use the jinja2 CLI for that: https://pypi.org/project/jinja2-cli/). Converting it into a discussion, but I suggest you try this first and see if you can make it work.
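The pre-processing step suggested in the comment above can be sketched in a few lines. This sketch assumes a hypothetical template with `$namespace` and `$image` placeholders, and it uses Python's standard-library `string.Template` as a stand-in for Jinja (the comment recommends jinja2-cli; the stand-in only keeps the example dependency-free). The rendered text would then go into the ConfigMap that is mounted as the pod template file:

```python
from string import Template

# Hypothetical pod template with placeholders filled in at deployment time.
POD_TEMPLATE = Template("""\
apiVersion: v1
kind: Pod
metadata:
  name: placeholder-name
  namespace: $namespace
spec:
  containers:
    - name: base
      image: $image
""")


def render_pod_template(values):
    """Substitute deployment-specific values; raises KeyError if one is missing."""
    return POD_TEMPLATE.substitute(values)


rendered = render_pod_template({"namespace": "airflow", "image": "airflow:test"})
print(rendered)
```

Because rendering happens before `kubectl apply`, Airflow itself only ever sees a finished, static pod template.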
[GitHub] [airflow] uranusjr commented on pull request #27629: Bump ```alembic``` minimum version
uranusjr commented on PR #27629: URL: https://github.com/apache/airflow/pull/27629#issuecomment-1312944091 Thank you, this additional context may be important if we need to debug similar issues in the future. 1.6.4 only fixes one bug: https://alembic.sqlalchemy.org/en/latest/changelog.html#change-1.6.4 It doesn’t seem very related, to be honest (I would expect migrations to crash instead of hang for this), but it’s likely not worthwhile to debug too deep into this. We can always review and revise the version range if issues come up.
[GitHub] [airflow] potiuk closed pull request #27548: Fix bug in various hooks that causes SQL run without a handler to always return None.
potiuk closed pull request #27548: Fix bug in various hooks that causes SQL run without a handler to always return None. URL: https://github.com/apache/airflow/pull/27548
[GitHub] [airflow] potiuk commented on pull request #27548: Fix bug in various hooks that causes SQL run without a handler to always return None.
potiuk commented on PR #27548: URL: https://github.com/apache/airflow/pull/27548#issuecomment-1312943328 Closing. To summarise: having to add a handler to get return values is by design (and for good reasons).
[GitHub] [airflow] potiuk commented on a diff in pull request #27548: Fix bug in various hooks that causes SQL run without a handler to always return None.
potiuk commented on code in PR #27548:
URL: https://github.com/apache/airflow/pull/27548#discussion_r1021018783

## airflow/providers/common/sql/hooks/sql.py:
##
@@ -273,7 +273,7 @@ def run(
             if not self.get_autocommit(conn):
                 conn.commit()
 
-        if handler is None:
+        if handler is not None:

Review Comment:
This is based on a misunderstanding of how it works. If you look at your code, it's not only backwards incompatible (which would be enough to reject this change "as is") but also wrong. When you look closer, the PR as implemented always returns either None or an empty array (because results are never appended to when handler is None). And this is as intended, because we do NOT want to return whatever cursor iteration returns.

Queries that are run via DBAPIHook are mostly DML and DDL queries, not DQL queries. It usually makes little sense to use the "plain" hook to receive the results of DQL queries, because you would have to return an array of cursor results - which in most cases will be quite a big array of row dictionaries. And when you iterate over all those rows returned by DBApi, the last thing you want to do is to store the full dictionary in an array.

The idea with a handler was that you can use the hook to run DQL, and handlers were introduced precisely to handle that. The cursor rows are NOT "processed" by the handlers. They are converted to a form that can be returned as an array (which allows you to slim them down to a digestible size and format to put them in an array of results - for example, just return an ID of the row - and the handler can store the content of such a row along the way of iteration in a file or in whatever way that does not require storing all output in memory).

You can write an identity handler of course (if you write the query in such a way that you know it will return one or few rows), but if you do not have any handler the None value is absolutely expected.
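The contract described in the review comment above can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module and a hypothetical `run` function that mimics only the behaviour discussed in the comment (no handler: `None`; with a handler: whatever the handler returns). It is not the provider's actual code, and `ids_only` is an illustrative handler name:

```python
import sqlite3


def run(conn, sql, handler=None):
    """Execute one statement; return handler(cursor), or None without a handler."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        result = handler(cursor) if handler is not None else None
        conn.commit()
        return result
    finally:
        cursor.close()


def ids_only(cursor):
    """Handler that slims each row down to its first column."""
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
run(conn, "CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")  # DDL
run(conn, "INSERT INTO t (payload) VALUES ('a'), ('b')")            # DML

print(run(conn, "SELECT id, payload FROM t"))                        # no handler
print(run(conn, "SELECT id, payload FROM t ORDER BY id", ids_only))  # with handler
```

The handler decides how much of each row is kept, so a large result set never has to be held in memory in full unless the caller asks for that.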
[airflow] branch main updated (fb9e5e612e -> 65b78b7dbd)
potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from fb9e5e612e Add extraVolumeMounts to sidecars too (#27420)
     add 65b78b7dbd Add new files to parsing queue on every loop of dag processsing (#27060)

No new revisions were added by this update.

Summary of changes:
 airflow/dag_processing/manager.py    | 31 +++
 tests/dag_processing/test_manager.py | 77
 2 files changed, 84 insertions(+), 24 deletions(-)
[GitHub] [airflow] potiuk closed issue #27010: DagProcessor doesnt pick new files until queued file parsing completes
potiuk closed issue #27010: DagProcessor doesnt pick new files until queued file parsing completes URL: https://github.com/apache/airflow/issues/27010
[GitHub] [airflow] potiuk merged pull request #27060: 27010 : Add new files to parsing queue on every loop of dag processsing
potiuk merged PR #27060: URL: https://github.com/apache/airflow/pull/27060
[GitHub] [airflow] potiuk commented on pull request #20346: Fix inconsitencies in checking edit permissions for a DAG
potiuk commented on PR #20346: URL: https://github.com/apache/airflow/pull/20346#issuecomment-1312930544 Needs rebase I am afraid.
[GitHub] [airflow] potiuk commented on pull request #27468: [Poc] Introduce Internal API as JSON RPC with JSON serialization
potiuk commented on PR #27468: URL: https://github.com/apache/airflow/pull/27468#issuecomment-1312930149 > I stand corrected, the AIP was there and I missed the voting. So any argument I make below is moot. The JSON-RPC is actually still up for discussion. Until the founding PR (or even after) - I think we will implement it in a way that lets us replace the implementation in the future if we find it is not good. In the implementation proposal of @mhenc this is literally an implementation detail - and the vast majority of changes will be to make sure we implement tests and make sure all the calls are mapped. This is because we are using a "standard" HTTP stack rather than a gRPC-like solution, because gRPC brings a bit more of a "tie-in" to the processing model for threading etc. After a lot of reading, talking to some of my friends, and analysing issues, I tend to agree with @ash and @andrewgodwin that (among other issues) the threading model of gRPC is a big, unknown risk (Python threads and GIL issues are still unsolved, and using a proven set of backend approaches - fork/process-driven, gevent etc. - is much more reliable, debuggable and scalable). And JSON-RPC is basically piggy-backing on that while providing simple and standard semantics for mapping input/output parameters of methods. I'd say - let's give it a try and see from the final tests whether JSON-RPC is as good a solution as we think it is, and we can revisit it later. Eventually the whole test harness, and making sure that there are no DB methods left and that some of our components are truly db-less, will let us move forward with it. Once we get it implemented, replacing the internals of communication between the components will be far less of a problem.
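The point about JSON-RPC providing standard semantics for mapping the input/output parameters of methods can be made concrete with a small sketch. It follows the JSON-RPC 2.0 message shape (`jsonrpc`, `method`, `params`, `id`) and dispatches to a hypothetical `get_task_state` method; the method name, the registry, and the returned data are illustrative assumptions, not Airflow's Internal API:

```python
import json

# Hypothetical registry of callable methods exposed over RPC.
METHODS = {}


def rpc_method(func):
    """Register a function under its own name."""
    METHODS[func.__name__] = func
    return func


@rpc_method
def get_task_state(dag_id, task_id):
    # Stand-in for a call that would normally touch the metadata database.
    return {"dag_id": dag_id, "task_id": task_id, "state": "success"}


def handle(raw_request):
    """Decode a JSON-RPC 2.0 request, call the method, encode the response."""
    request = json.loads(raw_request)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    func = METHODS.get(request.get("method"))
    if func is None:
        # -32601 is the JSON-RPC 2.0 error code for "Method not found".
        response["error"] = {"code": -32601, "message": "Method not found"}
    else:
        response["result"] = func(**request.get("params", {}))
    return json.dumps(response)


request = json.dumps({
    "jsonrpc": "2.0",
    "method": "get_task_state",
    "params": {"dag_id": "example", "task_id": "t1"},
    "id": 1,
})
print(handle(request))
```

Since the transport is ordinary HTTP carrying JSON, the dispatch layer stays an implementation detail that could later be swapped without touching the registered methods.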
[GitHub] [airflow] potiuk commented on pull request #22491: Add ignore_first_depends_on_past for scheduled jobs
potiuk commented on PR #22491: URL: https://github.com/apache/airflow/pull/22491#issuecomment-1312924806 > @pingzh this is great stuff, exactly what I need! But we are using managed version of Airflow on AWS which is 2.2.2. Is there any way to apply this functionality without updating airflow to 2.3? Nope. You need to insist that they upgrade.
[GitHub] [airflow] potiuk commented on a diff in pull request #27541: Remove default owner
potiuk commented on code in PR #27541:
URL: https://github.com/apache/airflow/pull/27541#discussion_r1021006499

## airflow/config_templates/default_airflow.cfg:
##
@@ -524,9 +524,6 @@ username =
 password =
 
 [operators]
-# The default owner assigned to each new operator, unless
-# provided explicitly or passed via ``default_args``
-default_owner = airflow

Review Comment:
Following the discussion in https://github.com/apache/airflow/pull/27067#discussion_r1020614511 - this seems like a good candidate to remove IF we agree that we can treat "breaking" in a less strict and more "how likely it's going to break other's workflow" way. Very good example of a case where we could take a risk and classify it as "non-breaking" (even if it is technically a removal).
[GitHub] [airflow] potiuk commented on issue #11708: Git sync for plugins in the Helm Chart
potiuk commented on issue #11708: URL: https://github.com/apache/airflow/issues/11708#issuecomment-1312920790 Would be great.
[GitHub] [airflow] aa3pankaj commented on pull request #27629: Bump ```alembic``` minimum version
aa3pankaj commented on PR #27629: URL: https://github.com/apache/airflow/pull/27629#issuecomment-1312904299 cc: @potiuk
[airflow] branch main updated (cc571e8e0e -> fb9e5e612e)
potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from cc571e8e0e Add nodePort declaration to chart/values.schema.json (#26945)
     add fb9e5e612e Add extraVolumeMounts to sidecars too (#27420)

No new revisions were added by this update.

Summary of changes:
 chart/templates/scheduler/scheduler-deployment.yaml | 6 ++
 chart/templates/triggerer/triggerer-deployment.yaml | 3 +++
 chart/templates/webserver/webserver-deployment.yaml | 3 +++
 chart/templates/workers/worker-deployment.yaml      | 9 +
 4 files changed, 21 insertions(+)
[GitHub] [airflow] potiuk merged pull request #27420: ExtraVolumeMounts in sidecars and initContainers
potiuk merged PR #27420: URL: https://github.com/apache/airflow/pull/27420
[airflow] branch main updated (be8a62e596 -> cc571e8e0e)
potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from be8a62e596 Pig cli connection properties cannot be passed by connection extra (#27644)
     add cc571e8e0e Add nodePort declaration to chart/values.schema.json (#26945)

No new revisions were added by this update.

Summary of changes:
 chart/values.schema.json       | 6 ++
 tests/charts/test_webserver.py | 29 +
 2 files changed, 35 insertions(+)
[GitHub] [airflow] potiuk merged pull request #26945: Add nodePort declaration to chart/values.schema.json
potiuk merged PR #26945: URL: https://github.com/apache/airflow/pull/26945
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #26945: Add nodePort declaration to chart/values.schema.json
boring-cyborg[bot] commented on PR #26945: URL: https://github.com/apache/airflow/pull/26945#issuecomment-1312902978 Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] potiuk closed issue #26812: Add NodePort Option to the values schema
potiuk closed issue #26812: Add NodePort Option to the values schema URL: https://github.com/apache/airflow/issues/26812
[GitHub] [airflow] potiuk commented on issue #10964: No response from gunicorn master within 120 seconds After Changing Worker Class
potiuk commented on issue #10964: URL: https://github.com/apache/airflow/issues/10964#issuecomment-1312901919 > I also had the same issue on the latest main, increasing the container resources fixed it. Yep. That could be a reason - not enough resources might cause exactly this kind of problem.
[GitHub] [airflow] potiuk commented on pull request #27542: Add support for extraContainers in pgbouncer pods
potiuk commented on PR #27542: URL: https://github.com/apache/airflow/pull/27542#issuecomment-1312897756 I triggered the tests. I guess you will need to add tests for it.
[GitHub] [airflow] potiuk commented on issue #27618: Status of testing of Apache Airflow 2.4.3rc1
potiuk commented on issue #27618: URL: https://github.com/apache/airflow/issues/27618#issuecomment-1312891663 Checked #27223
[GitHub] [airflow] github-actions[bot] closed pull request #25219: Update statsd-exporter setup
github-actions[bot] closed pull request #25219: Update statsd-exporter setup URL: https://github.com/apache/airflow/pull/25219
[GitHub] [airflow] github-actions[bot] closed pull request #26376: Add dataset event timestamp to dataset dag run queue
github-actions[bot] closed pull request #26376: Add dataset event timestamp to dataset dag run queue URL: https://github.com/apache/airflow/pull/26376
[GitHub] [airflow] github-actions[bot] closed pull request #25982: bugfix: tweak support for project_id argument in BigQueryGetDataOperator
github-actions[bot] closed pull request #25982: bugfix: tweak support for project_id argument in BigQueryGetDataOperator URL: https://github.com/apache/airflow/pull/25982
[GitHub] [airflow] potiuk closed issue #27575: To make airflow UI interactive
potiuk closed issue #27575: To make airflow UI interactive URL: https://github.com/apache/airflow/issues/27575
[GitHub] [airflow] potiuk commented on issue #27575: To make airflow UI interactive
potiuk commented on issue #27575: URL: https://github.com/apache/airflow/issues/27575#issuecomment-1312861681 > I guess modifying workflow from UI needs to update DAG code, serialized dags, etc. and goes against the https://airflow.apache.org/docs/apache-airflow/stable/#workflows-as-code . This looks similar to #27164 Yeah. I re-read it, @tirkarthi. I thought the change was about changing the order of tasks DISPLAYED in the UI in the grid view, as that can easily be done as a UI-only change. Indeed, changing the "execution" sequence via the UI is something that currently falls outside of the scope of Airflow, and it is a "won't do" case.
[GitHub] [airflow] potiuk commented on issue #27614: Double execution of failure callback for task
potiuk commented on issue #27614: URL: https://github.com/apache/airflow/issues/27614#issuecomment-1312858958 @Bowrna - hard to say - if you are able to reproduce it, then likely it is a bug. Finding the problem and fixing it might actually prove it.
[GitHub] [airflow] potiuk commented on pull request #27557: Implement extra controls for SLAs
potiuk commented on PR #27557: URL: https://github.com/apache/airflow/pull/27557#issuecomment-1312858194 That looks good to me as an interim solution (and UI-only improvement). @bbovenzi @ryanahamilton @ashb?
[GitHub] [airflow] potiuk commented on issue #8212: Can't read S3 remote logs when using gevent/eventlent webserver workers.
potiuk commented on issue #8212: URL: https://github.com/apache/airflow/issues/8212#issuecomment-1312857700 Too bad. I will try other things soon.
[airflow] branch main updated (9358928815 -> be8a62e596)
potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from 9358928815 Remove custom spark home and custom binarires for spark (#27646)
     add be8a62e596 Pig cli connection properties cannot be passed by connection extra (#27644)

No new revisions were added by this update.

Summary of changes:
 airflow/providers/apache/pig/CHANGELOG.rst    | 14 +
 airflow/providers/apache/pig/hooks/pig.py     | 29 +--
 airflow/providers/apache/pig/operators/pig.py | 9 ++---
 airflow/providers/apache/pig/provider.yaml    | 1 +
 tests/providers/apache/pig/hooks/test_pig.py  | 13 +---
 5 files changed, 49 insertions(+), 17 deletions(-)
[GitHub] [airflow] potiuk merged pull request #27644: Pig cli properties cannot be passed by connection extra
potiuk merged PR #27644: URL: https://github.com/apache/airflow/pull/27644
[airflow] branch main updated (fc7b95d44d -> 9358928815)
potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from fc7b95d44d Add Pierre to committers list (#27643)
     add 9358928815 Remove custom spark home and custom binarires for spark (#27646)

No new revisions were added by this update.

Summary of changes:
 airflow/providers/apache/spark/CHANGELOG.rst       | 10 ++
 .../providers/apache/spark/hooks/spark_submit.py   | 54 +
 .../apache/spark/operators/spark_submit.py         | 3 +-
 .../connections/spark.rst                          | 3 +-
 .../apache/spark/hooks/test_spark_submit.py        | 135 +
 5 files changed, 69 insertions(+), 136 deletions(-)
[GitHub] [airflow] potiuk merged pull request #27646: Remove custom spark home and custom binarires for spark
potiuk merged PR #27646: URL: https://github.com/apache/airflow/pull/27646
[GitHub] [airflow] potiuk opened a new pull request, #27647: Filter out invalid schemas in Hive hook
potiuk opened a new pull request, #27647: URL: https://github.com/apache/airflow/pull/27647 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[GitHub] [airflow] pierrejeambrun commented on pull request #27642: Update API & Python Client versions
pierrejeambrun commented on PR #27642: URL: https://github.com/apache/airflow/pull/27642#issuecomment-1312829552 Should I amend this PR to only change the API version, and then open an issue for the Python/Go client versions?
[GitHub] [airflow] potiuk commented on pull request #27642: Update API & Python Client versions
potiuk commented on PR #27642: URL: https://github.com/apache/airflow/pull/27642#issuecomment-1312826174 Why don't we make the release of the Python Client part of our regular release process, @jedcunningham @ephraimbuddy? I think it is either a no-op (when there is no change), or it should be released at the same time as Airflow. We should be able to release them at the same time when needed. WDYT?
[GitHub] [airflow] potiuk opened a new pull request, #27646: Remove custom spark home and custom binarires for spark
potiuk opened a new pull request, #27646: URL: https://github.com/apache/airflow/pull/27646
[GitHub] [airflow] eladkal commented on pull request #27642: Update API & Python Client versions
eladkal commented on PR #27642: URL: https://github.com/apache/airflow/pull/27642#issuecomment-1312818373 > I thought the client was auto-generated. Not sure what the release process for it is? The Python client and the Go client each have a separate release process.
[GitHub] [airflow] bdsoha commented on pull request #27639: Enable copying DagRun JSON to clipboard
bdsoha commented on PR #27639: URL: https://github.com/apache/airflow/pull/27639#issuecomment-1312811565 @bbovenzi Could [`@textea/json-viewer`](https://github.com/TexteaInc/json-viewer) be a possible replacement?
[GitHub] [airflow] shubham22 commented on pull request #27559: Generic RDS Operators
shubham22 commented on PR #27559: URL: https://github.com/apache/airflow/pull/27559#issuecomment-1312809096 Agree with @o-nikolas here. Providing a generic Boto3 operator gives users only breadcrumbs and leaves all the work of figuring out Boto3 API details to the users themselves. TaskFlow and custom operators already do something similar, and adding a generic operator would just make it confusing. In addition to what Niko mentioned as the main benefits of operators, I think **observability** is one of the main advantages. With operators, we can provide logs of different services/operations within the Airflow UI, so users don't have to go on a hunt for logs in the AWS console/CloudWatch. A generic operator would not be able to provide the same convenience.
[GitHub] [airflow] RachitSharma2001 commented on a diff in pull request #26974: Add FTP Operator
RachitSharma2001 commented on code in PR #26974: URL: https://github.com/apache/airflow/pull/26974#discussion_r1020802540 ## airflow/providers/ftp/example_dags/example_ftp.py: ## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +from datetime import datetime + +from airflow import models +from airflow.providers.ftp.operators.ftp import FTPOperation, FTPOperator + +with models.DAG( +"example_ftp_put_get", +schedule_interval=None, +start_date=datetime(2021, 1, 1), +catchup=False, +) as dag: +# [START howto_ftp_put] +ftp_put = FTPOperator( +task_id="test_ftp_put", +ftp_conn_id="ftp_default", +local_filepath="/tmp/filepath", Review Comment: I have updated the operator to reflect this name change to `FTPFileTransferOperator`. Let me know what you think.
[GitHub] [airflow] fatmumuhomer opened a new issue, #27645: Calendar view does not load when using CronTriggerTimeTable
fatmumuhomer opened a new issue, #27645: URL: https://github.com/apache/airflow/issues/27645 ### Apache Airflow version 2.4.2 ### What happened Create a DAG and set the schedule parameter using a CronTriggerTimeTable instance. Enable the DAG so that there is DAG run data. Try to access the Calendar View for the DAG. An ERR_EMPTY_RESPONSE error is displayed instead of the page. The Calendar View is accessible for other DAGs that have schedule_interval set to a cron string instead. ### What you think should happen instead The Calendar View should have been displayed. ### How to reproduce Create a DAG and set the schedule parameter to a CronTriggerTimeTable instance. Enable the DAG and allow some DAG runs to occur. Try to access the Calendar View for the DAG. ### Operating System Red Hat Enterprise Linux 8.6 ### Versions of Apache Airflow Providers _No response_ ### Deployment Virtualenv installation ### Deployment details Airflow 2.4.2 installed via pip with Python3.9 to venv using constraints. ### Anything else _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] eladkal commented on a diff in pull request #26974: Add FTP Operator
eladkal commented on code in PR #26974: URL: https://github.com/apache/airflow/pull/26974#discussion_r1020949454 ## airflow/providers/ftp/example_dags/example_ftp.py: ## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations Review Comment: I think we want the example DAG to be in AIP-47 format and under the system test folder? @potiuk, or are we converting them separately?
[GitHub] [airflow] pierrejeambrun commented on pull request #27642: Update API & Python Client versions
pierrejeambrun commented on PR #27642: URL: https://github.com/apache/airflow/pull/27642#issuecomment-1312785999 I believe the client is auto-generated. Not sure if there is something else to do.
[airflow] branch main updated: Add Pierre to committers list (#27643)
This is an automated email from the ASF dual-hosted git repository. pierrejeambrun pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new fc7b95d44d Add Pierre to committers list (#27643) fc7b95d44d is described below commit fc7b95d44d54bbd209c752edc7663af128f94227 Author: Pierre Jeambrun AuthorDate: Sun Nov 13 18:53:09 2022 +0100 Add Pierre to committers list (#27643) --- docs/apache-airflow/project.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/apache-airflow/project.rst b/docs/apache-airflow/project.rst index 54c5e53c05..4a3df1b763 100644 --- a/docs/apache-airflow/project.rst +++ b/docs/apache-airflow/project.rst @@ -71,6 +71,7 @@ Committers - Malthe Borch (@malthe) - Maxime "Max" Beauchemin (@mistercrunch) - Patrick Leo Tardif (@pltardif) +- Pierre Jeambrun (@pierrejeambrun) - Ping Zhang (@pingzh) - Qian Yu (@yuqian90) - Qingping Hou (@houqp)
[GitHub] [airflow] pierrejeambrun merged pull request #27643: Add Pierre to committers list
pierrejeambrun merged PR #27643: URL: https://github.com/apache/airflow/pull/27643
[GitHub] [airflow] potiuk opened a new pull request, #27644: Pig cli connection properties cannot be passed by connection extra
potiuk opened a new pull request, #27644: URL: https://github.com/apache/airflow/pull/27644
[GitHub] [airflow] potiuk commented on pull request #27522: Add example of dockerfile with creating new virtualenv
potiuk commented on PR #27522: URL: https://github.com/apache/airflow/pull/27522#issuecomment-1312781119 Can be anything neutral. I think we have a few places with similar things: For example this is what we have in example_python_operator: ``` @task.virtualenv( task_id="virtualenv_python", requirements=["colorama==0.4.0"], system_site_packages=False ) ```
[GitHub] [airflow] potiuk commented on pull request #27643: Add Pierre to committers list
potiuk commented on PR #27643: URL: https://github.com/apache/airflow/pull/27643#issuecomment-1312780698 It should have been the first commit to do by a committer :).
[airflow] branch main updated (d6cb70331f -> 1d4fd5c6ea)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from d6cb70331f Add workeer log-groomer-sidecar enable option in helm chart (#27178) add 1d4fd5c6ea The pinot-admin.sh command is now hard-coded. (#27641) No new revisions were added by this update. Summary of changes: airflow/providers/apache/pinot/CHANGELOG.rst | 22 ++ airflow/providers/apache/pinot/hooks/pinot.py| 16 +--- airflow/providers/apache/pinot/provider.yaml | 1 + tests/providers/apache/pinot/hooks/test_pinot.py | 17 + 4 files changed, 49 insertions(+), 7 deletions(-)
[GitHub] [airflow] potiuk merged pull request #27641: The pinot-admin.sh command is now hard-coded.
potiuk merged PR #27641: URL: https://github.com/apache/airflow/pull/27641
[GitHub] [airflow] ayushthe1 commented on pull request #27522: Add example of dockerfile with creating new virtualenv
ayushthe1 commented on PR #27522: URL: https://github.com/apache/airflow/pull/27522#issuecomment-1312779811 Hey @potiuk, what should I add in the `requirements.txt` file?
[GitHub] [airflow] bbovenzi commented on pull request #27639: Enable copying DagRun JSON to clipboard
bbovenzi commented on PR #27639: URL: https://github.com/apache/airflow/pull/27639#issuecomment-1312775691 > @pierrejeambrun Is there a reason why this specific package is being used? It hasn't been updated in over 2 years. Probably because we added it almost 2 years ago back when the package was still being updated.
[GitHub] [airflow] bdsoha commented on pull request #27639: Enable copying DagRun JSON to clipboard
bdsoha commented on PR #27639: URL: https://github.com/apache/airflow/pull/27639#issuecomment-1312758482 @pierrejeambrun Is there a reason why this specific package is being used? It hasn't been updated in over 2 years.
[GitHub] [airflow] pierrejeambrun commented on pull request #27639: Enable copying DagRun JSON to clipboard
pierrejeambrun commented on PR #27639: URL: https://github.com/apache/airflow/pull/27639#issuecomment-1312745157 Hello @bdsoha, good idea. When copying a line, a green tick appears for 5.5 seconds. This green tick is not inlined (it appears below the copied line) and makes the formatting jump when you hover over it, which I find weird. ![image](https://user-images.githubusercontent.com/14861206/201526921-1f87c525-ef8f-4bbf-ac4d-894adb316883.png) This seems similar to https://github.com/mac-s-g/react-json-view/issues/241, but only when you have already copied something to the clipboard. Maybe that's just me, but I wish the green tick was on the same line.
[GitHub] [airflow] pierrejeambrun commented on issue #27523: Jumping tasks in grid
pierrejeambrun commented on issue #27523: URL: https://github.com/apache/airflow/issues/27523#issuecomment-1312739101 @fokmess I see that you are willing to make a PR to fix this, so I am assigning it to you. :smile: Don't hesitate to ask for some pointers if needed.
[GitHub] [airflow] pierrejeambrun opened a new pull request, #27642: Update API & Python Client versions
pierrejeambrun opened a new pull request, #27642: URL: https://github.com/apache/airflow/pull/27642 We might have forgotten to upgrade the API and Python client versions when releasing 2.4. The stable release of the API is currently 2.3, yet we also have mentions of 'new in 2.4.0' in the current version. ![image](https://user-images.githubusercontent.com/14861206/201525008-1eea553e-5aa0-442f-8ced-ec9488350ef3.png) ![image](https://user-images.githubusercontent.com/14861206/201524988-dabd95e1-2919-43bb-8e76-613271e86d25.png)
[GitHub] [airflow] pierrejeambrun commented on a diff in pull request #26658: Clear TaskGroup
pierrejeambrun commented on code in PR #26658: URL: https://github.com/apache/airflow/pull/26658#discussion_r1020905058 ## airflow/www/views.py: ## @@ -2047,14 +2053,42 @@ def clear(self): recursive = request.form.get('recursive') == "true" only_failed = request.form.get('only_failed') == "true" -task_ids: list[str | tuple[str, int]] -if map_indexes is None: -task_ids = [task_id] -else: -task_ids = [(task_id, map_index) for map_index in map_indexes] +task_ids: list[str | tuple[str, int]] = [] + +end_date = execution_date if not future else None +start_date = execution_date if not past else None + +if group_id is not None: +task_group_dict = dag.task_group.get_task_group_dict() +task_group = task_group_dict.get(group_id) +if task_group is None: +return redirect_or_json( +origin, msg=f"TaskGroup {group_id} could not be found", status="error", status_code=404 +) +task_ids = task_ids_or_regex = [t.task_id for t in task_group.iter_tasks()] + +# Lock the related dag runs to prevent from possible dead lock. +# https://github.com/apache/airflow/pull/26658 +dag_runs_query = session.query(DagRun).filter(DagRun.dag_id == dag_id).with_for_update() +if start_date is None and end_date is None: +dag_runs_query = dag_runs_query.filter(DagRun.execution_date == start_date) +else: +if start_date is not None: +dag_runs_query = dag_runs_query.filter(DagRun.execution_date >= start_date) + +if end_date is not None: +dag_runs_query = dag_runs_query.filter(DagRun.execution_date <= end_date) + +_ = dag_runs_query.all() Review Comment: @uranusjr Fixed it, we are now requesting only the id to not load the entire object, much better thanks. @potiuk I added a warning in the confirmation dialog to warn the user that this action could take a while (only for clearing task group across multiple dags -> future or past activated). Then we can merge this and start collecting feedback on this feature, maybe some input from astronomer so I can adjust for the **b)** solution you suggested. 
![image](https://user-images.githubusercontent.com/14861206/201524754-9e993631-8a0c-4141-a70e-74b34b8f66e2.png)
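The date-window logic in the diff above can be illustrated in isolation. What follows is a minimal sketch in plain Python, not the Airflow implementation: `clear_window` and `runs_in_window` are hypothetical helper names, and DAG runs are modeled as bare `datetime` values rather than `DagRun` rows. Unlike the quoted diff, which has a separate branch for the case where both bounds are `None`, this sketch assumes that case simply means an unbounded window.

```python
from __future__ import annotations

from datetime import datetime


def clear_window(execution_date, past, future):
    """Derive the bounds as in the diff; None means 'no limit on that side'."""
    end_date = execution_date if not future else None
    start_date = execution_date if not past else None
    return start_date, end_date


def runs_in_window(run_dates, start_date, end_date):
    """Keep the run dates inside the (possibly open-ended) window.

    Assumption of this sketch: two open bounds select every run.
    """
    selected = []
    for run_date in run_dates:
        if start_date is not None and run_date < start_date:
            continue
        if end_date is not None and run_date > end_date:
            continue
        selected.append(run_date)
    return selected


runs = [datetime(2022, 11, day) for day in (11, 12, 13, 14)]
current = datetime(2022, 11, 12)

# Neither flag set: only the selected run is covered.
only_current = runs_in_window(runs, *clear_window(current, past=False, future=False))
# "future" set: the selected run and everything after it.
with_future = runs_in_window(runs, *clear_window(current, past=False, future=True))
```

Keeping the bound derivation separate from the filtering makes it easy to see why enabling "past" or "future" widens the set of runs that must be locked, which is the cost the confirmation-dialog warning refers to.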
[GitHub] [airflow] potiuk opened a new pull request, #27641: The pinot-admin.sh command is now hard-coded.
potiuk opened a new pull request, #27641: URL: https://github.com/apache/airflow/pull/27641
[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #22253: Add SparkKubernetesOperator crd implementation
ephraimbuddy commented on code in PR #22253: URL: https://github.com/apache/airflow/pull/22253#discussion_r1020891590 ## airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py: ## @@ -0,0 +1,333 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+"""Launches Custom object""" +import sys +import time +from copy import deepcopy +from datetime import datetime as dt +from typing import Optional + +import tenacity +from kubernetes import client +from kubernetes.client import models as k8s +from kubernetes.client.rest import ApiException + +from airflow.exceptions import AirflowException +from airflow.providers.cncf.kubernetes.utils.pod_manager import PodManager +from airflow.utils.log.logging_mixin import LoggingMixin + +from airflow.providers.cncf.kubernetes.resource_convert.secret import convert_secret, convert_image_pull_secrets +from airflow.providers.cncf.kubernetes.resource_convert.configmap import convert_configmap, convert_configmap_to_volume +from airflow.providers.cncf.kubernetes.resource_convert.env_variable import convert_env_vars + +if sys.version_info >= (3, 8): +from functools import cached_property +else: +from cached_property import cached_property Review Comment: ```suggestion from airflow.compat.functools import cached_property ``` ## airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py: ## @@ -0,0 +1,333 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+"""Launches Custom object""" +import sys +import time +from copy import deepcopy +from datetime import datetime as dt +from typing import Optional + +import tenacity +from kubernetes import client +from kubernetes.client import models as k8s +from kubernetes.client.rest import ApiException + +from airflow.exceptions import AirflowException +from airflow.providers.cncf.kubernetes.utils.pod_manager import PodManager +from airflow.utils.log.logging_mixin import LoggingMixin + +from airflow.providers.cncf.kubernetes.resource_convert.secret import convert_secret, convert_image_pull_secrets +from airflow.providers.cncf.kubernetes.resource_convert.configmap import convert_configmap, convert_configmap_to_volume +from airflow.providers.cncf.kubernetes.resource_convert.env_variable import convert_env_vars + +if sys.version_info >= (3, 8): +from functools import cached_property +else: +from cached_property import cached_property + + +def should_retry_start_spark_job(exception: BaseException) -> bool: +"""Check if an Exception indicates a transient error and warrants retrying""" +if isinstance(exception, ApiException): +return exception.status == 409 +return False + + +class SparkJobSpec(): +def __init__(self, **entries): +self.__dict__.update(entries) +self.validate() +self.update_resources() + +def validate(self): +if self.spec['dynamicAllocation']['enabled']: +if not all( +[self.spec['dynamicAllocation']['initialExecutors'], self.spec['dynamicAllocation']['minExecutors'], self.spec['dynamicAllocation']['maxExecutors']] +): +raise AirflowException("Make sure initial/min/max value for dynamic allocation is passed") + +def update_resources(self): +spark_resources = SparkResources(self.spec['driver'].pop('container_resources'), self.spec['executor'].pop('container_resources')) +
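To make the intent of `should_retry_start_spark_job` in the diff above concrete, here is a minimal standalone sketch of a retry loop driven by such a predicate. It uses only the standard library: `FakeApiException`, `should_retry`, `call_with_retry`, and `flaky_create` are hypothetical stand-ins, not the `kubernetes` or `tenacity` APIs that the PR imports, and how the PR actually wires the predicate into its retry logic is not shown in the quoted diff.

```python
class FakeApiException(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status."""

    def __init__(self, status):
        super().__init__(f"API error {status}")
        self.status = status


def should_retry(exception):
    """Same shape as the predicate in the diff: retry only on HTTP 409."""
    if isinstance(exception, FakeApiException):
        return exception.status == 409
    return False


def call_with_retry(func, max_attempts=3):
    """Call func, retrying while the predicate approves and attempts remain."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Re-raise on the last attempt or when the error is not transient.
            if attempt == max_attempts or not should_retry(exc):
                raise


attempts = []


def flaky_create():
    """Fail with 409 twice, then succeed (simulates a transient conflict)."""
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeApiException(409)
    return "created"


result = call_with_retry(flaky_create)
```

Restricting retries to a single status code keeps genuine failures (for example a 500 or a non-API exception) from being silently repeated.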
[airflow] branch main updated: Add workeer log-groomer-sidecar enable option in helm chart (#27178)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new d6cb70331f Add workeer log-groomer-sidecar enable option in helm chart (#27178)
d6cb70331f is described below

commit d6cb70331f6272789ce5c23b36bd5a5386f46c7e
Author: Bob Du
AuthorDate: Sun Nov 13 19:36:06 2022 +0800

    Add workeer log-groomer-sidecar enable option in helm chart (#27178)
---
 chart/templates/workers/worker-deployment.yaml |  2 +-
 chart/values.schema.json                       |  5 +++++
 chart/values.yaml                              |  2 ++
 tests/charts/test_worker.py                    | 17 +++++++++++++++++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/chart/templates/workers/worker-deployment.yaml b/chart/templates/workers/worker-deployment.yaml
index ccd7abd50a..e3fa252090 100644
--- a/chart/templates/workers/worker-deployment.yaml
+++ b/chart/templates/workers/worker-deployment.yaml
@@ -243,7 +243,7 @@ spec:
 {{- if and (.Values.dags.gitSync.enabled) (not .Values.dags.persistence.enabled) }}
 {{- include "git_sync_container" . | indent 8 }}
 {{- end }}
-{{- if $persistence }}
+{{- if and $persistence .Values.workers.logGroomerSidecar.enabled }}
 - name: worker-log-groomer
   image: {{ template "airflow_image" . }}
   imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
diff --git a/chart/values.schema.json b/chart/values.schema.json
index 234ad58900..7e1148a704 100644
--- a/chart/values.schema.json
+++ b/chart/values.schema.json
@@ -1561,6 +1561,11 @@
 "type": "object",
 "additionalProperties": false,
 "properties": {
+    "enabled": {
+        "description": "Whether to deploy the Airflow worker log groomer sidecar.",
+        "type": "boolean",
+        "default": true
+    },
     "command": {
         "description": "Command to use when running the Airflow workers log groomer sidecar (templated).",
         "type": [
diff --git a/chart/values.yaml b/chart/values.yaml
index bda009b09b..81bf964663 100644
--- a/chart/values.yaml
+++ b/chart/values.yaml
@@ -600,6 +600,8 @@ workers:
   labels: {}
   logGroomerSidecar:
+    # Whether to deploy the Airflow worker log groomer sidecar.
+    enabled: true
     # Command to use when running the Airflow worker log groomer sidecar (templated).
     command: ~
     # Args to use when running the Airflow worker log groomer sidecar (templated).
diff --git a/tests/charts/test_worker.py b/tests/charts/test_worker.py
index 1c495e4f31..c7c9acc3fc 100644
--- a/tests/charts/test_worker.py
+++ b/tests/charts/test_worker.py
@@ -523,6 +523,23 @@ class TestWorker:
         assert ["release-name"] == jmespath.search("spec.template.spec.containers[0].command", docs[0])
         assert ["Helm"] == jmespath.search("spec.template.spec.containers[0].args", docs[0])
+    def test_log_groomer_collector_default_enabled(self):
+        docs = render_chart(show_only=["templates/workers/worker-deployment.yaml"])
+        assert 2 == len(jmespath.search("spec.template.spec.containers", docs[0]))
+        assert "worker-log-groomer" in [
+            c["name"] for c in jmespath.search("spec.template.spec.containers", docs[0])
+        ]
+
+    def test_log_groomer_collector_can_be_disabled(self):
+        docs = render_chart(
+            values={"workers": {"logGroomerSidecar": {"enabled": False}}},
+            show_only=["templates/workers/worker-deployment.yaml"],
+        )
+        assert 1 == len(jmespath.search("spec.template.spec.containers", docs[0]))
+        assert "worker-log-groomer" not in [
+            c["name"] for c in jmespath.search("spec.template.spec.containers", docs[0])
+        ]
+
     def test_log_groomer_default_command_and_args(self):
         docs = render_chart(show_only=["templates/workers/worker-deployment.yaml"])
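The template change in this commit renders the `worker-log-groomer` container only when both `$persistence` and `workers.logGroomerSidecar.enabled` are true. The sketch below restates that condition in plain Python to make the resulting truth table explicit. The function name `worker_container_names` and the main container name `"worker"` are assumptions for illustration; this is not chart code and does not call `render_chart`.

```python
from typing import List


def worker_container_names(persistence: bool, log_groomer_enabled: bool = True) -> List[str]:
    """Return the container names a worker pod would get under the new condition.

    Mirrors `{{- if and $persistence .Values.workers.logGroomerSidecar.enabled }}`.
    The name "worker" for the main container is an assumption made for this sketch.
    """
    names = ["worker"]
    # The sidecar is added only when both flags are true.
    if persistence and log_groomer_enabled:
        names.append("worker-log-groomer")
    return names
```

This matches the two tests added in the commit for the case where the sidecar was already being rendered: the default (`enabled: true`) yields two containers, and `enabled: False` yields one. Under this condition, turning persistence off also removes the sidecar regardless of the new flag.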
[GitHub] [airflow] potiuk merged pull request #27178: Add workeer log-groomer-sidecar enable option in helm chart
potiuk merged PR #27178: URL: https://github.com/apache/airflow/pull/27178 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #27178: Add workeer log-groomer-sidecar enable option in helm chart
potiuk commented on PR #27178: URL: https://github.com/apache/airflow/pull/27178#issuecomment-1312708585 Merging. I think there is little controversy with that.
[GitHub] [airflow] ashb commented on pull request #26467: Role Management with Unique Dag Owner Implementation
ashb commented on PR #26467: URL: https://github.com/apache/airflow/pull/26467#issuecomment-1312708029 I'll look at this again tomorrow
[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23720: Fix backfill queued task getting reset to scheduled state.
ephraimbuddy commented on code in PR #23720: URL: https://github.com/apache/airflow/pull/23720#discussion_r1020883376

## airflow/executors/kubernetes_executor.py: ##
@@ -530,13 +532,15 @@ def start(self) -> None:
 self.kube_config.worker_pods_pending_timeout_check_interval,
 self._check_worker_pods_pending_timeout,
 )
-self.event_scheduler.call_regular_interval(
-self.kube_config.worker_pods_queued_check_interval,
-self.clear_not_launched_queued_tasks,
-)
-# We also call this at startup as that's the most likely time to see
-# stuck queued tasks
-self.clear_not_launched_queued_tasks()
+
+if self.job_id != 'manual':

Review Comment: @snjypl personally, I'd prefer to have this part in a separate PR, because it seems to be addressing something else.
[GitHub] [airflow] potiuk commented on issue #21225: Tasks stuck in queued state
potiuk commented on issue #21225: URL: https://github.com/apache/airflow/issues/21225#issuecomment-1312698655

> hi, we encountered this problem in production very fresh. I hope you find a solution to the problem. It's giving us a lot of trouble. Thank you.

You have not said which version you are running. Did you upgrade to the latest 2.4 version, @Walkerazos? Upgrading to a newer version is in any case the only way a fix can be applied. If you upgrade to 2.4 and stop seeing the problem, please report it here. We have already had a few reports that the error has been solved since then (several hundred bugs have been fixed since, some of them related), so the easiest and best way to verify that the problem has been fixed is for several users like you, who have this problem, to confirm that it is gone after the upgrade. Can you help us with that, @Walkerazos?