[GitHub] [airflow] stamixthereal commented on pull request #28634: Add doc-strings and small improvement to email util
stamixthereal commented on PR #28634: URL: https://github.com/apache/airflow/pull/28634#issuecomment-1367133320 Thanks for your review, @uranusjr; I fixed it! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] stamixthereal commented on a diff in pull request #28619: Fix code readability, add docstrings to json_client
stamixthereal commented on code in PR #28619: URL: https://github.com/apache/airflow/pull/28619#discussion_r1058786146 ## airflow/api/client/json_client.py: ## @@ -24,9 +24,25 @@ class Client(api_client.Client): -"""Json API client implementation.""" +"""Json API client implementation. -def _request(self, url, method="GET", json=None): +This client is used to interact with a Json API server and perform various actions +such as triggering DAG runs,deleting DAGs, interacting with pools, and getting lineage information. + +:param _api_base_url: The base URL for the Json API server. +:param _session: A session object to use for making HTTP requests. +""" + +def _request(self, url: str, json=None, method: str = "GET"): +"""Make a request to the Json API server. + +:param url: The URL to send the request to. +:param method: The HTTP method to use (e.g. "GET", "POST", "DELETE"). +:param json: A dictionary containing JSON data to send in the request body. +:return: A dictionary containing the JSON response from the server. +:rtype: dict Review Comment: I've double-checked different scenarios and left the `_request` return type as dict because that seems to be the only case that works; the `.json()` method also usually returns the parsed data as a dict. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
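For context on the return-type discussion above: `requests`' `Response.json()` returns whatever the body parses to, so annotating the result as `dict` encodes an assumption that this particular API always answers with a JSON object. A minimal sketch of that pattern as a standalone function (hypothetical names, not the actual Airflow client code):

```python
from __future__ import annotations

import requests


def request_json(session: requests.Session, url: str, json: dict | None = None, method: str = "GET") -> dict:
    """Send a request and return the parsed JSON body as a dict.

    Response.json() can return any JSON type (dict, list, str, ...); typing the
    result as dict encodes the assumption that this API always responds with a
    JSON object, which is the reasoning given in the review comment above.
    """
    response = session.request(method=method, url=url, json=json)
    response.raise_for_status()  # raise on HTTP error statuses instead of parsing an error body
    return response.json()
```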
[GitHub] [airflow] uranusjr commented on pull request #28020: Feature/#28015 default args
uranusjr commented on PR #28020: URL: https://github.com/apache/airflow/pull/28020#issuecomment-1367091823 I don’t think we want to implement this. DAG arguments and default args are different things and this introduces complexity that is unnecessary at best. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a diff in pull request #28627: Change Architecture class into Enum
uranusjr commented on code in PR #28627: URL: https://github.com/apache/airflow/pull/28627#discussion_r1058744792 ## airflow/cli/commands/info_command.py: ## @@ -155,12 +156,16 @@ class Architecture: ARM = "arm" @staticmethod -def get_current(): +def get_current() -> Architecture: """Get architecture.""" -return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower()) +current_architecture = _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower()) +if current_architecture: +return current_architecture +else: +raise RuntimeError(f"Unknown architecture for machine: {platform.machine()}") Review Comment: Some `get_current` usages have built-in handling of the `None` case here; do we want to continue to allow that? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
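The question above is about the behavioural change from returning `None` to raising. A hedged sketch of the Optional-returning variant that would keep existing `None`-handling call sites working (the mapping table here is an illustrative subset, not the real one):

```python
from __future__ import annotations

import platform
from enum import Enum


class Architecture(Enum):
    X86_64 = "x86_64"
    ARM = "arm"

    @staticmethod
    def get_current() -> Architecture | None:
        """Return the current architecture, or None when the machine string is unknown.

        Keeping Optional preserves call sites that already handle the None case;
        the PR variant raises RuntimeError instead.
        """
        return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())


# Illustrative subset of the machine-string mapping; the real table is larger.
_MACHINE_TO_ARCHITECTURE = {
    "x86_64": Architecture.X86_64,
    "amd64": Architecture.X86_64,
    "arm64": Architecture.ARM,
    "aarch64": Architecture.ARM,
}
```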
[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util
uranusjr commented on code in PR #28634: URL: https://github.com/apache/airflow/pull/28634#discussion_r1058743022 ## airflow/utils/email.py: ## @@ -267,8 +306,13 @@ def _get_smtp_connection(host: str, port: int, timeout: int, with_ssl: bool) -> def _get_email_list_from_str(addresses: str) -> list[str]: -delimiters = [",", ";"] -for delimiter in delimiters: -if delimiter in addresses: -return [address.strip() for address in addresses.split(delimiter)] -return [addresses] +""" +Extract a list of email addresses from a string. The string +can contain multiple email addresses separated by +any of the following delimiters: ',' or ';'. + +:param addresses: A string containing one or more email addresses. +:return: A list of email addresses. +""" +pattern = r"[,;]\s*" +return [address.strip() for address in re.split(pattern, addresses)] Review Comment: Since we do `strip` afterwards, the `\s*` part in the pattern does not seem to be needed? (Wildcards incur a minor performance penalty.) An alternative is to do `\s*` at both sides and remove the extra `strip`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
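The two patterns under discussion differ only in where whitespace is consumed. A small sketch contrasting them (sample addresses are made up):

```python
import re


def split_addresses_strip(addresses: str) -> list[str]:
    # Pattern from the PR: split on the delimiter (plus trailing whitespace), then strip each piece.
    return [address.strip() for address in re.split(r"[,;]\s*", addresses)]


def split_addresses_no_strip(addresses: str) -> list[str]:
    # Alternative suggested in the review: absorb whitespace on both sides of the
    # delimiter so the per-item strip() is no longer needed for inner items.
    # A single strip() of the whole string still handles leading/trailing whitespace.
    return re.split(r"\s*[,;]\s*", addresses.strip())


assert split_addresses_strip("a@x.com, b@y.com ;c@z.com") == ["a@x.com", "b@y.com", "c@z.com"]
assert split_addresses_no_strip("a@x.com, b@y.com ;c@z.com") == ["a@x.com", "b@y.com", "c@z.com"]
```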
[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util
uranusjr commented on code in PR #28634: URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742434 ## airflow/utils/email.py: ## @@ -47,8 +48,26 @@ def send_email( conn_id: str | None = None, custom_headers: dict[str, Any] | None = None, **kwargs, -): -"""Send email using backend specified in EMAIL_BACKEND.""" +) -> None: +""" +Send an email using the backend specified in the `EMAIL_BACKEND` configuration option. + +:param to: A list or iterable of email addresses to send the email to. +:param subject: The subject of the email. +:param html_content: The content of the email in HTML format. +:param files: A list of paths to files to attach to the email. +:param dryrun: If `True`, the email will not actually be sent. Default: `False`. +:param cc: A string or iterable of strings containing email addresses to send a copy of the email to. +:param bcc: A string or iterable of strings containing email addresses to send a +blind carbon copy of the email to. +:param mime_subtype: The subtype of the MIME message. Default: "mixed". +:param mime_charset: The charset of the email. Default: "utf-8". +:param conn_id: The connection ID to use for the backend. If not provided, the default connection +specified in the `EMAIL_CONN_ID` configuration option will be used. Review Comment: ```suggestion specified in the ``EMAIL_CONN_ID`` configuration option will be used. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util
uranusjr commented on code in PR #28634: URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742360 ## airflow/utils/email.py: ## @@ -47,8 +48,26 @@ def send_email( conn_id: str | None = None, custom_headers: dict[str, Any] | None = None, **kwargs, -): -"""Send email using backend specified in EMAIL_BACKEND.""" +) -> None: +""" +Send an email using the backend specified in the `EMAIL_BACKEND` configuration option. + +:param to: A list or iterable of email addresses to send the email to. +:param subject: The subject of the email. +:param html_content: The content of the email in HTML format. +:param files: A list of paths to files to attach to the email. +:param dryrun: If `True`, the email will not actually be sent. Default: `False`. Review Comment: ```suggestion :param dryrun: If `True`, the email will not actually be sent. Default: `False`. ``` Either using double backticks or star (italic, matching the style used by the Python stdlib documentation) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util
uranusjr commented on code in PR #28634: URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742159 ## airflow/utils/email.py: ## @@ -47,8 +48,26 @@ def send_email( conn_id: str | None = None, custom_headers: dict[str, Any] | None = None, **kwargs, -): -"""Send email using backend specified in EMAIL_BACKEND.""" +) -> None: +""" +Send an email using the backend specified in the `EMAIL_BACKEND` configuration option. Review Comment: ```suggestion Send an email using the backend specified in the ``EMAIL_BACKEND`` configuration option. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a diff in pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator
uranusjr commented on code in PR #28635: URL: https://github.com/apache/airflow/pull/28635#discussion_r1058741250 ## airflow/providers/common/sql/operators/sql.py: ## @@ -198,7 +198,7 @@ class SQLExecuteQueryOperator(BaseSQLOperator): :param autocommit: (optional) if True, each command is automatically committed (default: False). :param parameters: (optional) the parameters to render the SQL query with. :param handler: (optional) the function that will be applied to the cursor (default: fetch_all_handler). -:param split_statements: (optional) if split single SQL string into statements (default: False). +:param split_statements: (optional) if split single SQL string into statements (default: see hook.run). Review Comment: What does *see hook.run* mean? Perhaps we should explain a bit more here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
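Presumably *see hook.run* means the operator forwards `split_statements` only when it was set explicitly, letting the hook's own default apply otherwise. A hedged sketch of that pattern with simplified names (not the actual provider code):

```python
from __future__ import annotations

from typing import Any


def run_sql(hook: Any, sql: str, split_statements: bool | None = None, **hook_kwargs: Any) -> Any:
    """Run SQL through the hook, deferring to the hook's own default for split_statements.

    When the operator leaves split_statements unset (None), the flag is simply not
    forwarded, so whatever default hook.run() declares is what applies.
    """
    if split_statements is not None:
        hook_kwargs["split_statements"] = split_statements
    return hook.run(sql, **hook_kwargs)
```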
[GitHub] [airflow] uranusjr commented on a diff in pull request #28619: Fix code readability, add docstrings to json_client
uranusjr commented on code in PR #28619: URL: https://github.com/apache/airflow/pull/28619#discussion_r1058737216 ## airflow/api/client/json_client.py: ## @@ -24,9 +24,24 @@ class Client(api_client.Client): -"""Json API client implementation.""" +"""Json API client implementation. -def _request(self, url, method="GET", json=None): +This client is used to interact with a Json API server and perform various actions +such as triggering DAG runs,deleting DAGs, interacting with pools, and getting lineage information. + +:param _api_base_url: The base URL for the Json API server. +:param _session: A session object to use for making HTTP requests. Review Comment: ```suggestion :param api_base_url: The base URL for the Json API server. :param session: A session object to use for making HTTP requests. ``` As far as I can tell, the parameters don’t have an underscore prefix? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a diff in pull request #28631: Clear DB between each separate providers tests
uranusjr commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058731243 ## tests/providers/conftest.py: ## @@ -0,0 +1,89 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +from functools import lru_cache +from pathlib import Path + +import pytest + +from tests.test_utils import db + +_CLEAR_DB_PROVIDERS = set() + + +@lru_cache(maxsize=None) +def providers_packages(): +"""Get providers packages full qualname.""" + +current_dir = Path(__file__).absolute().parent +providers = set() +for root in current_dir.iterdir(): +if not root.is_dir(): +continue + +providers_dirs = set() +for sentinel in {"hooks", "operators", "sensors"}: +providers_dirs = providers_dirs.union({p.parent for p in root.rglob(sentinel) if p.is_dir()}) + +if providers_dirs: +for d in providers_dirs: +providers.add(".".join(d.relative_to(current_dir).parts)) +else: +providers.add(root.name) + +return providers + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for provider in providers_packages(): +if name.startswith(provider): +return provider +return None + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in _CLEAR_DB_PROVIDERS: +_CLEAR_DB_PROVIDERS.add(provider_name) +db.clear_db_runs() +db.clear_db_datasets() +db.clear_db_dags() +db.clear_db_serialized_dags() +db.clear_db_sla_miss() +db.clear_db_pools() +db.clear_db_connections() +db.clear_db_variables() +db.clear_db_dag_code() +db.clear_db_callbacks() +db.clear_rendered_ti_fields() +db.clear_db_import_errors() +db.clear_db_dag_warnings() +db.clear_db_xcom() +db.clear_db_logs() +db.clear_db_jobs() +db.clear_db_task_fail() +db.clear_db_task_reschedule() +db.clear_dag_specific_permissions() +db.create_default_connections() +db.set_default_pool_slots(128) +yield Review Comment: Would it be a good idea to also clean after the provider tests? I feel like it’s generally a good principle for one thing to clean up one’s own stuffs. We could do 1. Clean when entering the `providers` directory. 2. Clean after each provider is finished. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
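The clean-up-after-yourself suggestion maps directly onto the fixture's `yield`: code after it runs as module teardown. A minimal sketch of the proposed shape, with a hypothetical `_clear_everything()` standing in for the long list of `db.clear_*` calls quoted above:

```python
import pytest


def _clear_everything() -> None:
    """Stand-in for the db.clear_*() / create_default_connections() calls in the fixture above."""


@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_providers_tests(request):
    _clear_everything()  # start this provider's tests from a clean database
    yield
    _clear_everything()  # and tidy up afterwards so the next suite inherits a clean state
```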
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058724161 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: > For providers it's far easier because there is no "internal" sharing between multiple modules so i mean even e.g. for KPO we have a `pod_manager` module. This should not be considered (ideally) user-public. It exists only for KPO, used only in KPO. It is internal helper code for KPO, just because, otherw
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058722499 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: > you wrote yourself that you prefer to minimize backcompat surface and "everything that's not an operator / sensor / hook should be private". just want to clarify that, specifically with that example about "only hook / sensor / operator are public" i was not quite saying that's what i think it should be, but
[GitHub] [airflow] uranusjr commented on pull request #27710: add a new conf to wait past_deps before skipping a task
uranusjr commented on PR #27710: URL: https://github.com/apache/airflow/pull/27710#issuecomment-1367053758 Do we still need to keep the old argument for backward compatibility? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
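A common way to answer that question with "yes" is to accept both arguments and emit a deprecation warning for the old one. A generic sketch with hypothetical `old_arg`/`new_arg` names (the actual parameter names in the PR may differ):

```python
from __future__ import annotations

import warnings


def configure(new_arg: bool | None = None, old_arg: bool | None = None) -> bool:
    """Prefer new_arg, but keep old_arg working with a deprecation warning."""
    if old_arg is not None:
        warnings.warn(
            "old_arg is deprecated and will be removed in a future release; use new_arg instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if new_arg is None:
            new_arg = old_arg
    return bool(new_arg)
```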
[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2
bharanidharan14 commented on code in PR #28262: URL: https://github.com/apache/airflow/pull/28262#discussion_r1058690039 ## airflow/providers/microsoft/azure/hooks/adls_v2.py: ## @@ -0,0 +1,307 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +from typing import Any + +from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError +from azure.identity import ClientSecretCredential +from azure.storage.filedatalake import ( +DataLakeDirectoryClient, +DataLakeFileClient, +DataLakeServiceClient, +DirectoryProperties, +FileSystemClient, +FileSystemProperties, +) +from hooks.base import BaseHook + + +class AdlsClientHook(BaseHook): Review Comment: Addressed it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2
bharanidharan14 commented on code in PR #28262: URL: https://github.com/apache/airflow/pull/28262#discussion_r1058689964 ## tests/providers/microsoft/azure/hooks/test_adls_v2.py: ## @@ -0,0 +1,108 @@ +# Licensed to the Apache Software Foundation (ASF) under one Review Comment: Moved the test to `test_azure_data_lake` file -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2
bharanidharan14 commented on code in PR #28262: URL: https://github.com/apache/airflow/pull/28262#discussion_r1058689769 ## airflow/providers/microsoft/azure/hooks/adls_v2.py: ## @@ -0,0 +1,307 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +from typing import Any + +from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError +from azure.identity import ClientSecretCredential +from azure.storage.filedatalake import ( +DataLakeDirectoryClient, +DataLakeFileClient, +DataLakeServiceClient, +DirectoryProperties, +FileSystemClient, +FileSystemProperties, +) +from hooks.base import BaseHook + + +class AdlsClientHook(BaseHook): +""" +This Hook interacts with ADLS gen2 storage account it mainly helps to create and manage +directories and files in storage accounts that have a hierarchical namespace. Using Alds_v2 connection +details create DataLakeServiceClient object + +Due to Wasb is marked as legacy and and retirement of the (ADLS1) it would be nice to +implement ALDS gen2 hook for interacting with the storage account. + +.. seealso:: + https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python + +:param adls_v2_conn_id: Reference to the :ref:`adls_v2 connection `. Review Comment: @tatiana Addressed your suggestion -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] avicol commented on pull request #28353: Fix the sql syntax in merge_data
avicol commented on PR #28353: URL: https://github.com/apache/airflow/pull/28353#issuecomment-1367014562 > @avicol doc spellcheck 😢 The doc error is from a file that I did not contribute to: ../../airflow/providers/cncf/kubernetes/CHANGELOG.rst:377: (didn) and you didn’t use the Trying a rebase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] hussein-awala commented on pull request #27710: add a new conf to wait past_deps before skipping a task
hussein-awala commented on PR #27710: URL: https://github.com/apache/airflow/pull/27710#issuecomment-1367003154 @potiuk it's ready, can you check it please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] github-actions[bot] closed pull request #27481: Add "user_updated_state" column in taskinstance
github-actions[bot] closed pull request #27481: Add "user_updated_state" column in taskinstance URL: https://github.com/apache/airflow/pull/27481 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058656390 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: BTW. Eventually we might be able to produce airflow typeshed-like package: https://github.com/python/typeshed with only stubs for only "externally public" APIs. This is what should be an "ultimate" goal I think. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058655785 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: > It's also important to clarify that some methods are intentionally "private". E.g. in the operator, it might have helper methods. But these methods are more implementation detail right? If users subclass, though, they may use and depend on these. So what do we do? Ideally, from maintainer perspective, we want the ov
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058656390 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: BTW. Eventually we might be able to produce airflow typeshed https://github.com/python/typeshed with only stubs for only "externally public" APIs. This is what should be an "ultimate" goal I think. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058655785 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: > It's also important to clarify that some methods are intentionally "private". E.g. in the operator, it might have helper methods. But these methods are more implementation detail right? If users subclass, though, they may use and depend on these. So what do we do? Ideally, from maintainer perspective, we want the ov
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058655785 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: > It's also important to clarify that some methods are intentionally "private". E.g. in the operator, it might have helper methods. But these methods are more implementation detail right? If users subclass, though, they may use and depend on these. So what do we do? Ideally, from maintainer perspective, we want the ov
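To make the extension points listed in the quoted document more concrete, here is a minimal sketch of a custom operator built on the public `BaseOperator` interface, including the kind of underscore-prefixed helper discussed in the review comment above. The operator and method names are hypothetical illustrations, not code from the PR.

```python
from __future__ import annotations

from airflow.models.baseoperator import BaseOperator


class GreetOperator(BaseOperator):
    """Toy operator used only to illustrate the public extension point."""

    def __init__(self, name: str, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def _format_greeting(self) -> str:
        # Underscore-prefixed helper: an implementation detail. A subclass *can*
        # call it, but nothing in the public interface promises it will stay.
        return f"Hello, {self.name}!"

    def execute(self, context):
        # execute() is the documented entry point for custom operators.
        message = self._format_greeting()
        self.log.info(message)
        return message
```

In a DAG it would be instantiated like any built-in operator, e.g. `GreetOperator(task_id="greet", name="Airflow")`; the open question in this thread is whether a subclass that also calls `_format_greeting()` is relying on public or private behaviour.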
[airflow] branch constraints-main updated: Updating constraints. Build id:
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-main by this push: new 230be22d16 Updating constraints. Build id: 230be22d16 is described below commit 230be22d165b567d7954c88bb871c6fd7bd40ab6 Author: Automated GitHub Actions commit AuthorDate: Wed Dec 28 23:33:27 2022 + Updating constraints. Build id: This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of '' in '' with commit sha . All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.10.txt | 8 constraints-3.7.txt | 10 +- constraints-3.8.txt | 10 +- constraints-3.9.txt | 8 constraints-no-providers-3.10.txt | 2 +- constraints-no-providers-3.7.txt | 4 ++-- constraints-no-providers-3.8.txt | 4 ++-- constraints-no-providers-3.9.txt | 2 +- constraints-source-providers-3.10.txt | 10 +- constraints-source-providers-3.7.txt | 12 ++-- constraints-source-providers-3.8.txt | 12 ++-- constraints-source-providers-3.9.txt | 10 +- 12 files changed, 46 insertions(+), 46 deletions(-) diff --git a/constraints-3.10.txt b/constraints-3.10.txt index c091006864..aa2056dc23 100644 --- a/constraints-3.10.txt +++ b/constraints-3.10.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T20:00:53Z +# This constraints file was automatically generated on 2022-12-28T23:33:09Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -173,9 +173,9 @@ billiard==3.6.4.0 black==23.1a1 bleach==5.0.1 blinker==1.5 -boto3==1.26.38 +boto3==1.26.39 boto==2.49.0 -botocore==1.29.38 +botocore==1.29.39 bowler==0.9.0 cachelib==0.9.0 cachetools==4.2.2 @@ -382,7 +382,7 @@ msrestazure==0.6.4 multi-key-dict==2.0.3 multidict==6.0.4 mypy-boto3-appflow==1.26.32 -mypy-boto3-rds==1.26.36 +mypy-boto3-rds==1.26.39 mypy-boto3-redshift-data==1.26.30 mypy-extensions==0.4.3 mypy==0.971 diff --git a/constraints-3.7.txt b/constraints-3.7.txt index 98b9ea54da..a86f768297 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T20:01:31Z +# This constraints file was automatically generated on 2022-12-28T23:33:24Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -173,9 +173,9 @@ billiard==3.6.4.0 black==23.1a1 bleach==5.0.1 blinker==1.5 -boto3==1.26.38 +boto3==1.26.39 boto==2.49.0 -botocore==1.29.38 +botocore==1.29.39 bowler==0.9.0 cached-property==1.5.2 cachelib==0.9.0 @@ -322,7 +322,7 @@ idna==3.4 ijson==3.1.4 imagesize==1.4.1 importlib-metadata==4.13.0 -importlib-resources==5.10.1 +importlib-resources==5.10.2 incremental==22.10.0 inflection==0.5.1 influxdb-client==1.35.0 @@ -383,7 +383,7 @@ msrestazure==0.6.4 multi-key-dict==2.0.3 multidict==6.0.4 mypy-boto3-appflow==1.26.32 -mypy-boto3-rds==1.26.36 +mypy-boto3-rds==1.26.39 mypy-boto3-redshift-data==1.26.30 mypy-extensions==0.4.3 mypy==0.971 diff --git a/constraints-3.8.txt b/constraints-3.8.txt index 717b16d20a..f390106970 100644 --- a/constraints-3.8.txt +++ b/constraints-3.8.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T20:01:21Z +# This constraints file was automatically generated on 2022-12-28T23:33:20Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -174,9 +174,9 @@ billiard==3.6.4.0 black==23.1a1 bleach==5.0.1 blinker==1.5 -boto3==1.26.38 +boto3==1.26.39 boto==2.49.0 -botocore==1.29.38 +botocore==1.29.39 bowler==0.9.0 cachelib==0.9.0 cachetools==4.2.2 @@ -323,7 +323,7 @@ idna==3.4 ijson==3.1.4 imagesize==1.4.1 importlib-metadata==5.2.0 -importlib-resources==5.10.1 +importlib-resources==5.10.2 incremental==22.10.0 inflection==0.5.1 influxdb-client==1.35.0 @@ -384,7 +384,7 @@ msrestazure==0.6.4 multi-key-dict==2.0.3 multidict==6.0.4 mypy-boto3-appflow==1.26.32 -mypy-boto3-rd
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058648489 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: Maybe this is misunderstanding - what I wrote above was not about what provider packages "provides" - but what they "consume". This document indeed does not specify anything about what providers provide - it's only what Airflow "produces" and providers "consume" as users of that. The Community Providers have t
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058643196 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: One problem with the public / private distinction is... from tool perspective ... even using _my_func across modules is sorta forbidden. But there is a difference between the internal "private" and the user-facing "private". We need to be able to write helper code that can be used across modules (internal public
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058643196 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: One problem with public / private is... from tool perspective ... even using _my_func across modules is forbidden. But there is a difference between the internal "private" and the user-facing "private". We need to be able to write helper code that can be used across modules (internal public), but not have it be
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058642484 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: It's also important to clarify that some methods are intentionally "private". E.g. in the operator, it might have helper methods. But these methods are more implementation detail right? If users subclass, though, they may use and depend on these. So what do we do? Ideally, from maintainer perspective, we want th
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058640908 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: And if it's both then how do we distinguish? I think the current convention is underscores OR meta private tag. But interested in your thoughts. We should include that in this proposal i think. Personally I think it is an important goal to minimize backcompat surface area so i would suggest that ever
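As a concrete illustration of the two conventions mentioned above (a leading underscore, or a `:meta private:` docstring tag), a hypothetical helper module might look like the sketch below; the function names are made up and are not part of Airflow.

```python
from __future__ import annotations


def _build_query(table: str) -> str:
    # Leading underscore: conventionally private. Other internal modules may still
    # import it, but API docs and users should treat it as an implementation detail.
    return f"SELECT * FROM {table}"


def render_template(template: str, context: dict) -> str:
    """Render a template string for internal use.

    :meta private:
    """
    # Public-looking name, but the ``:meta private:`` tag tells Sphinx autodoc
    # (and readers) to treat it as non-public despite the missing underscore.
    return template.format(**context)
```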
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058633670 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > In the long term we also need to consider getting rid of most of the stuff in `tests.test_utils` and replacing it with equivalents such as fixtures/context managers/markers, but that is better done after we migrate the remaining tests to `pytest` and document them. > > Because most new contributors just copy-paste things from one test to another without any idea how they actually work. Yep. Full agreement here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r105866 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: In the long term we also need to consider getting rid of most of the stuff in `tests.test_utils` and replacing it with equivalents such as fixtures/context managers/markers, but that is better done after we migrate the remaining tests to `pytest` and document them. Because most new contributors just copy-paste things from one test to another without any idea how they actually work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
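A rough sketch of the fixture-based equivalent Taragolis describes could look like the following. This is not the PR's actual `conftest.py`; it assumes cleanup helpers such as `clear_db_runs` and `clear_db_serialized_dags` exist in `tests.test_utils.db`.

```python
from __future__ import annotations

import pytest

from tests.test_utils import db


@pytest.fixture(autouse=True, scope="module")
def _clear_db_between_provider_test_modules():
    # Reset database state before each provider test module, so DAG runs or
    # serialized DAGs left behind by one provider's tests cannot leak into another's.
    db.clear_db_runs()
    db.clear_db_serialized_dags()
```

Because the fixture has no `yield`, it performs setup only, which matches the later suggestion in this thread that no teardown step is needed.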
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058631912 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: Sorry... so ... specifically in the provider package... is everything private, public, or some private some public? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-u
[airflow] branch main updated: Variables set in variables.env are automatically exported (#28633)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 94b3b897e2 Variables set in variables.env are automatically exported (#28633) 94b3b897e2 is described below commit 94b3b897e2f94902777c4b24fb10c915279d8967 Author: Jarek Potiuk AuthorDate: Wed Dec 28 23:28:50 2022 +0100 Variables set in variables.env are automatically exported (#28633) The variables set in variables.env were not automatically exported. --- scripts/in_container/configure_environment.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/scripts/in_container/configure_environment.sh b/scripts/in_container/configure_environment.sh index d9638b3219..6d0cbae41e 100644 --- a/scripts/in_container/configure_environment.sh +++ b/scripts/in_container/configure_environment.sh @@ -36,8 +36,10 @@ if [[ -d "${AIRFLOW_BREEZE_CONFIG_DIR}" && \ echo echo "${COLOR_BLUE}Sourcing environment variables from ${VARIABLES_ENV_FILE} in ${AIRFLOW_BREEZE_CONFIG_DIR}${COLOR_RESET}" echo +set -o allexport # shellcheck disable=1090 source "${VARIABLES_ENV_FILE}" +set +o allexport popd >/dev/null 2>&1 || exit 1 fi
[GitHub] [airflow] potiuk merged pull request #28633: Variables set in variables.env are automatically exported
potiuk merged PR #28633: URL: https://github.com/apache/airflow/pull/28633 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058629909 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > I could also check the request fixture tomorrow; I thought it should contain session information, such as how many tests were collected, and if that is equal to (or somehow less than) 1, then skip the cleanup entirely. I think this is not really needed now - the overhead of using the test_utils helpers is really low indeed. I believe we should enable it for all tests and compare execution time on CI, and if it is < 5% overhead in total or so, we can leave it on for all modules. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
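The idea floated in the quoted comment - skipping cleanup when only a single test was collected - could be sketched roughly as below. This is hypothetical code, and the thread concludes it is not needed because the cleanup overhead is already low.

```python
from __future__ import annotations

import pytest

from tests.test_utils import db


@pytest.fixture(autouse=True)
def _clear_db_unless_single_test(request):
    # request.session is the pytest Session; testscollected is the number of
    # collected test items, so a lone locally-run test would skip the cleanup.
    if request.session.testscollected <= 1:
        return
    db.clear_db_runs()
```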
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058628201 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: And also remove `yield` in this fixture we do not need any teardown, at least yet -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058628066 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: And yes, I eventually want to add tests to make sure that all our providers are only using what we consider "public" and nothing else. This will be partially done as part of AIP-44 (for DB methods), but it can be extended to all Python calls made by the providers. -- This is an automated message from the Apache Git S
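The check described in that comment - verifying that providers only import from the public surface - could be prototyped along the lines below. The allow-list and function name are illustrative assumptions, not an existing Airflow test.

```python
from __future__ import annotations

import ast
from pathlib import Path

# Example allow-list only; the real public surface is exactly what this PR documents.
PUBLIC_PREFIXES = ("airflow.models", "airflow.hooks.base", "airflow.exceptions")


def non_public_airflow_imports(provider_root: Path) -> list[tuple[str, str]]:
    """Return (file, module) pairs where a provider imports a non-public airflow module."""
    offenders: list[tuple[str, str]] = []
    for path in provider_root.rglob("*.py"):
        tree = ast.parse(path.read_text(), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                modules = [node.module]
            else:
                continue
            for module in modules:
                if module.startswith("airflow.") and not module.startswith(PUBLIC_PREFIXES):
                    offenders.append((str(path), module))
    return offenders
```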
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058627698 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: And the list does not **have** to be complete now. If we forget something before merging it - we can always add it later - there is completely no harm in that. It will be "eventually complete" if we follow it for quite some time and decide that some things should be "publicised". -- This is an automated message fro
[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
potiuk commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058627308 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: IMHO - it should also address providers (and especially them). Rather than adding private tags, I think we should allowlist everything that should be public here. If we forgot something we should add. All the rest should be private by definition IMHO. This is the whole purpose of this list. And we should expl
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058626222 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: I could also check the `request` fixture tomorrow; I thought it should contain session information, such as how many tests were collected, and if that is equal to `1` (or less, though I don't know how that would be possible) then do not clean anything up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
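For illustration, a minimal sketch of the idea floated above, assuming pytest's `request.session.items` already holds the collected test items by the time a module-scoped fixture runs; the fixture body and the single-test threshold are placeholders rather than the code proposed in the PR:

```python
import pytest

from tests.test_utils import db


@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_providers_tests(request):
    # When the session collected only one test (e.g. a developer re-running a
    # single test from the IDE), skip the cleanup to keep the feedback loop fast.
    if len(getattr(request.session, "items", [])) <= 1:
        yield
        return
    # Otherwise clear the tables that most commonly leak state between modules.
    db.clear_db_runs()
    db.clear_db_dags()
    db.clear_db_serialized_dags()
    yield
```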
[airflow] branch main updated (69df1c5d9e -> e8657ce559)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

    from 69df1c5d9e Speed up provider validation pre-commit (#28541)
     add e8657ce559 Improve "other" test category selection (#28630)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/ci.yml                            |  6 +++--
 Dockerfile.ci                                       |  4 +++-
 .../src/airflow_breeze/utils/selective_checks.py    | 10 ++--
 dev/breeze/tests/test_selective_checks.py           | 25 +++-
 scripts/docker/entrypoint_ci.sh                     |  4 +++-
 scripts/in_container/run_extract_tests.sh           | 25
 ...est_collection.py => test_pytest_collection.py}  | 27 --
 7 files changed, 57 insertions(+), 44 deletions(-)
 delete mode 100755 scripts/in_container/run_extract_tests.sh
 rename scripts/in_container/{test_arm_pytest_collection.py => test_pytest_collection.py} (73%)
[GitHub] [airflow] potiuk merged pull request #28630: Improve "other" test category selection
potiuk merged PR #28630: URL: https://github.com/apache/airflow/pull/28630 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation
dstandish commented on code in PR #28300: URL: https://github.com/apache/airflow/pull/28300#discussion_r1058624408 ## docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst: ## @@ -0,0 +1,87 @@ + .. Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + +Public Interface of Airflow +=== + +The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact +with and access certain features of the Apache Airflow system. This includes operations such as +creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies, +and extending Airflow capabilities by writing new executors, plugins, operators and providers. The +Public Interface can be useful for building custom tools and integrations with other systems, +and for automating certain aspects of the Airflow workflow. + +You can extend Airflow in three ways: + +* By writing new custom Python code (via Operators, Plugins, Provider) +* By using the `Stable REST API `_ (based on the OpenAPI specification) +* By using the `Airflow Command Line Interface (CLI) `_ + +How can you extend Apache Airflow with custom Python Code? +== + +The Public Interface of Airflow consists of a number of different classes and packages that provide access +to the core features and functionality of the system. + +The classes and packages that may be considered as the Public Interface include: + +* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow. +* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators. +* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks. +* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration. +* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables. +* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data. +* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers. +* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins. +* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``). +* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators. +* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events. + +.. versionadded:: 2.5 + + Listener public interface has been added in version 2.5. 
+ +* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow + that are responsible for executing tasks. + +.. versionadded:: 2.6 + + There are a number of different executor implementations built-in Airflow, each with its own unique + characteristics and capabilities. Executor interface was available in earlier version of Airflow but + only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors. + You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number + of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built + executors, and custom executors could not provide full functionality that built-in executors had. + + +What is not part of the Public Interface of Apache Airflow? +=== + +Everything not mentioned in this document should be considered as non-Public Interface. Review Comment: This doesn't address providers, or how we can include "private" e.g. helper code in otherwise "public" modules / classes. Is it sufficient to include `:meta private:`? Or must we prefix with underscore? -- This is an automated message from the Apache Git Service. To respond to the message, please log on t
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058623426 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: I think it costs us almost nothing and is a **long term** solution if we do it this way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058622999 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: You convinced me. I just tested your solution (my videos were with --with-db-init) and yes, I agree it's not noticeable in PyCharm/IntelliJ, so you are likely right that the Flask import and others add a lot. And yeah, having the DB reset in this case for every test is quite OK. It is actually likely that those Flask imports cause the constraint problems anyway. Try to notice which one is which (it's visible, but barely). https://user-images.githubusercontent.com/595491/209874550-732ee92c-43dc-4fc1-b3f7-f0b365b0d281.mov https://user-images.githubusercontent.com/595491/209874540-7a7e5611-d888-45a8-b8fd-02f8df147385.mov And it has the nice benefit that it will help users in many situations where they run single tests as well. This overhead is much smaller than what I was used to from `--with-db-init`. However, I think a few changes are needed: 1) have a "clear_all()" in test_utils.db and use it instead of having the list of methods in conftest.py - this will help with future maintainability 2) What's even more ... I **really** think we should apply it to all tests and completely apply it to the main conftest.py and do it for all modules - I think with such low overhead we can kill a few birds with the same stone and the overall overhead will be minimal. I just did this in main "conftest.py".
I don't think we need to detect any providers at all and run it for **everything**

```
@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_modules(request):
    from tests.test_utils import db

    """Clear DB between each separate provider package test runs."""
    print("Clearing DB")
    db.clear_db_runs()
    db.clear_db_datasets()
    db.clear_db_dags()
    db.clear_db_serialized_dags()
    db.clear_db_sla_miss()
    db.clear_db_pools()
    db.clear_db_connections()
    db.clear_db_variables()
    db.clear_db_dag_code()
    db.clear_db_callbacks()
    db.clear_rendered_ti_fields()
    db.clear_db_import_errors()
    db.clear_db_dag_warnings()
    db.clear_db_xcom()
    db.clear_db_logs()
    db.clear_db_jobs()
    db.clear_db_task_fail()
    db.clear_db_task_reschedule()
    db.clear_dag_specific_permissions()
    db.create_default_connections()
    db.set_default_pool_slots(128)
    yield
```

And run it with -s

```
pytest -s tests/cli/ tests/core/
```

And it looks like it works as expected:

```
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_path Clearing DB PASSED
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres:airflow@postgres/airflow-postgresql+psycopg2://p...s:PASSWORD@postgres/airflow] PASSED
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres@postgres/airflow-postgresql+psycopg2://p...s@postgres/airflow] PASSED
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://:airflow@postgres/airflow-postgresql+psycopg2://:PASSWORD@postgres/airflow] PASSED
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres/airflow-postgresql+psycopg2://postgres/airflow] PASSED
tests/cli/commands/test_info_command.py::TestAirflowInfo::test_airflow_info PASSED
tests/cli/commands/test_info_command.py::TestAirflowInfo::test_system_info PASSED
tests/cli/commands/test_info_command.py::TestAirflowInfo::test_paths_info PASSED
tests/cli/commands/test_info_command.py::TestAirflowInfo::test_tools_info PASSED
tests/cli/commands
```
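To make point 1 above concrete, a `clear_all()` helper added to `tests/test_utils/db.py` could simply wrap the existing `clear_db_*` functions so the conftest fixture stays a one-liner; this is a hypothetical sketch under that assumption, not the code that was actually merged:

```python
# tests/test_utils/db.py (hypothetical addition; the clear_db_* and
# create_default_connections helpers already live in this module)

def clear_all():
    """Clear every test-relevant table and restore default connections/pools."""
    clear_db_runs()
    clear_db_datasets()
    clear_db_dags()
    clear_db_serialized_dags()
    clear_db_xcom()
    clear_db_logs()
    # ...remaining clear_db_* helpers called in the same fashion...
    create_default_connections()
    set_default_pool_slots(128)
```

The module-scoped fixture in conftest.py would then reduce to calling `db.clear_all()` before yielding.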
[GitHub] [airflow] dstandish commented on a diff in pull request #28336: Fixed hanged KubernetesPodOperator
dstandish commented on code in PR #28336: URL: https://github.com/apache/airflow/pull/28336#discussion_r1058615241 ## airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py: ## @@ -168,6 +168,7 @@ class KubernetesPodOperator(BaseOperator): :param labels: labels to apply to the Pod. (templated) :param startup_timeout_seconds: timeout in seconds to startup the pod. :param get_logs: get the stdout of the container as logs of the tasks. +:param logs_timeout: timeout in seconds to read logs after container termination. Review Comment: I am not sure this needs to be configurable. It reads like a setting "if no logs for logs_timeout settings then quit". but... really this timeout is applied only after pod completes. and, in this case, 2 minutes seems fine, it's a potentially confusing param, and i don't think we need to make that configurable. ## airflow/providers/cncf/kubernetes/utils/pod_manager.py: ## @@ -91,6 +99,61 @@ def get_container_termination_message(pod: V1Pod, container_name: str): return container_status.state.terminated.message if container_status else None +class PodLogsConsumer: +""" +PodLogsConsumer is responsible for pulling pod logs from a stream with checking a container status before +reading data. +This class is a workaround for the issue https://github.com/apache/airflow/issues/23497 +""" + +def __init__( +self, +response: HTTPResponse, +pod: V1Pod, +pod_manager: PodManager, +container_name: str, +timeout: int = 120, +): +self.response = response +self.pod = pod +self.pod_manager = pod_manager +self.container_name = container_name +self.timeout = timeout + +def __iter__(self) -> Generator[bytes, None, None]: +messages: list[bytes] = [] +if self.logs_available(): +for chunk in self.response.stream(amt=None, decode_content=True): Review Comment: does this work when follow=False? ## airflow/providers/cncf/kubernetes/utils/pod_manager.py: ## @@ -91,6 +99,61 @@ def get_container_termination_message(pod: V1Pod, container_name: str): return container_status.state.terminated.message if container_status else None +class PodLogsConsumer: +""" +PodLogsConsumer is responsible for pulling pod logs from a stream with checking a container status before +reading data. +This class is a workaround for the issue https://github.com/apache/airflow/issues/23497 +""" + +def __init__( +self, +response: HTTPResponse, +pod: V1Pod, +pod_manager: PodManager, +container_name: str, +timeout: int = 120, +): +self.response = response +self.pod = pod +self.pod_manager = pod_manager +self.container_name = container_name +self.timeout = timeout + +def __iter__(self) -> Generator[bytes, None, None]: +messages: list[bytes] = [] +if self.logs_available(): +for chunk in self.response.stream(amt=None, decode_content=True): +if b"\n" in chunk: +chunks = chunk.split(b"\n") +yield b"".join(messages) + chunks[0] + b"\n" +for x in chunks[1:-1]: +yield x + b"\n" +if chunks[-1]: +messages = [chunks[-1]] +else: +messages = [] +else: +messages.append(chunk) +if not self.logs_available(): Review Comment: this would seem to result in a _lot_ of calls to the kube API. it may be wise to limit such cheking to every couple minutes? wdyt? @jedcunningham ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
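On the last point about API call volume, one way to bound the number of kube API requests is to rate-limit the container-status check rather than issuing it on every received chunk. The snippet below only illustrates that idea with made-up names; it is not the `PodLogsConsumer` implementation from the PR:

```python
import time


class ThrottledCheck:
    """Call an expensive boolean check (e.g. a kube API request) at most once
    per ``interval`` seconds and reuse the last known result in between."""

    def __init__(self, check_fn, interval: float = 120.0):
        self._check_fn = check_fn
        self._interval = interval
        self._last_checked = float("-inf")
        self._last_result = True

    def __call__(self) -> bool:
        now = time.monotonic()
        if now - self._last_checked >= self._interval:
            self._last_result = self._check_fn()
            self._last_checked = now
        return self._last_result


# Usage sketch: logs_available = ThrottledCheck(lambda: container_is_running(pod, name))
# and the consumer loop calls logs_available() per chunk without hammering the API.
```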
[GitHub] [airflow] maggesssss commented on issue #23666: Add transfers operator S3 to SQL / SQL to SQL
magges commented on issue #23666: URL: https://github.com/apache/airflow/issues/23666#issuecomment-1366930217 @thinhnd2104 are you working on this topic? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pgagnon commented on pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator
pgagnon commented on PR #28635: URL: https://github.com/apache/airflow/pull/28635#issuecomment-1366926674 > I don't think this breaks any backcompat because for most providers this is no change and, for snowflake , if you try to submit multistatement r.n. with defaults it will just fail We should be very explicit about this in the release notes though if we merge this change (which I am in favor of), because it potentially alters DAG behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a diff in pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator
dstandish commented on code in PR #28635: URL: https://github.com/apache/airflow/pull/28635#discussion_r1058602568 ## airflow/providers/common/sql/operators/sql.py: ## @@ -26,6 +26,7 @@ from airflow.hooks.base import BaseHook from airflow.models import BaseOperator, SkipMixin from airflow.providers.common.sql.hooks.sql import DbApiHook, fetch_all_handler, return_single_query_results +from airflow.utils.types import NOTSET Review Comment: ```suggestion ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058602023 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: We could do it any time. This PR just a temporary solution but unfortunetly some of the temporary solution could work for ages and become permanent -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Adityamalik123 opened a new pull request, #28636: Adding base changes for adding updated_at filter for task_instance_en…
Adityamalik123 opened a new pull request, #28636: URL: https://github.com/apache/airflow/pull/28636 …dpoint --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058601195 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment:

> I hate having to wait even 0.5 s to run single test with Ctrl+R. Plus some people have auto-rerrunable tests (there are plugins that run tests automatically after saving). We cannot make that experience so much worse.

This could be a nice debate for many reasons. First of all, `pytest` itself takes about a second to start while it loads all its plugins, and producing the output can also take a lot of resources.

```console
❯ time pytest --help
...
pytest --help  1.21s user 0.21s system 76% cpu 1.851 total
❯ time pytest --help
...
pytest --help  0.58s user 0.11s system 98% cpu 0.697 total
❯ time pytest --help
...
pytest --help  0.59s user 0.12s system 87% cpu 0.816 total
```

PyCharm can also add latency. Hopefully we do not enable any coverage in `addopts`, because PyCharm/IDEA could add an additional 5-10 seconds per run in that case. Personally, most of my time is spent in the debugger, trying to check whether something works as expected or to figure out why it does not, and I usually rerun all of the module's tests just to be sure the last change did not break anything new. Also, if we consider **X** to be the time to create something new, then the time to cover it with tests plus fixing becomes **X * 5**, with most of that time spent sitting with a hand on the face saying "Why on Earth don't you work as expected" or "This concept is garbage, I need to rewrite my code because the tests show me that not everything works well". But this is personal, and all these timings relate only to my setup; someone with an old laptop could wait an additional 0.2 seconds or more. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator
dstandish commented on PR #28635: URL: https://github.com/apache/airflow/pull/28635#issuecomment-1366922185 I don't think this breaks any backcompat because for most providers this is no change, and for Snowflake, if you try to submit multi-statement SQL right now with the defaults it will just fail -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish opened a new pull request, #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator
dstandish opened a new pull request, #28635: URL: https://github.com/apache/airflow/pull/28635 Some databases, such as snowflake, require you to split statements in order to submit multi-statement sql. For such databases, splitting is the natural default, and we should defer to the hook to control that. cc @potiuk @eladkal @kazanzhy @pgagnon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
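As a rough sketch of the "defer to the hook" pattern this PR describes (not the actual `common.sql` implementation; the helper name and call shape are assumptions), the operator can hold a `NOTSET` sentinel and only forward `split_statements` to the hook when the DAG author set it explicitly:

```python
from __future__ import annotations

from airflow.utils.types import NOTSET, ArgNotSet


def run_with_optional_split(hook, sql, split_statements: bool | ArgNotSet = NOTSET):
    """Forward split_statements to hook.run() only when it was set explicitly.

    When the caller leaves the NOTSET sentinel in place, the hook's own default
    applies (e.g. splitting enabled for databases that require it).
    """
    extra_kwargs = {}
    if not isinstance(split_statements, ArgNotSet):
        extra_kwargs["split_statements"] = split_statements
    return hook.run(sql, **extra_kwargs)
```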
[GitHub] [airflow] stamixthereal opened a new pull request, #28634: Add doc-strings and small improvement to email util
stamixthereal opened a new pull request, #28634: URL: https://github.com/apache/airflow/pull/28634 Docstrings were added to the email-sending util, some type hints were changed, and a regular expression was introduced in the `_get_email_list_from_str` function -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
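For context, a regex-based `_get_email_list_from_str` along the lines the PR describes might look roughly like this; the pattern and behaviour shown are an assumption for illustration, not necessarily what the PR implements:

```python
from __future__ import annotations

import re


def _get_email_list_from_str(addresses: str) -> list[str]:
    """Split a recipient string on commas or semicolons, dropping empty entries."""
    pattern = r"\s*[,;]\s*"
    return [address.strip() for address in re.split(pattern, addresses) if address]


# e.g. _get_email_list_from_str("a@example.com; b@example.com, c@example.com")
# -> ['a@example.com', 'b@example.com', 'c@example.com']
```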
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058578673 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment:

> See the videos above (that's the native env).. It is noticeable, a lot.

I think this happens because initializing the DB requires loading Flask and its dependencies. All the other tests currently do not use reset db / init db to clean up their state; instead they just run direct queries through this module: https://github.com/apache/airflow/blob/main/tests/test_utils/db.py which runs pretty fast. I agree that the current solution is a bit dumb, but the main goal right now is to reduce the chance of side effects between different providers' tests, and the cost is about 0.2 sec on a single test, or +0.2 sec (plus a little extra for the inexpensive logic) per provider package. There is a good chance this solution will no longer be required as soon as we change the structure of the providers tests. For all tests we should consider a different approach -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
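For readers unfamiliar with that module, the helpers in `tests/test_utils/db.py` are essentially thin wrappers that delete rows directly through a session, which is why they are much cheaper than a full DB reset; the following is a simplified, representative sketch rather than a verbatim copy of the file:

```python
from airflow.models import DagModel, DagRun, TaskInstance
from airflow.utils.session import create_session


def clear_db_runs():
    # Plain DELETEs through a session are much cheaper than db.resetdb(),
    # which drops and recreates the whole schema.
    with create_session() as session:
        session.query(TaskInstance).delete()
        session.query(DagRun).delete()


def clear_db_dags():
    with create_session() as session:
        session.query(DagModel).delete()
```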
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576658 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: If you want I can take over (but not just yet - I need catch-up with other stuff :D ) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576453 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: (BTW. Now you know why I have not tried it **just yet**, I kinda subconsciously expected it won't be super easy). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576111 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: > Also tried. For some reason it works when we run --with-db-init. I think this is a **real** bug that we need to trace if it does not work here. I recall similar errors raised by our users. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058575510 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: > We also have reset_db fixture which not use anywhere, at least I can't find any usage of them Yep. Noticed it - we can remove it I think while doing it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574945 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: We also have `reset_db` fixture which not use anywhere, at least I can't find any usage of them https://github.com/apache/airflow/blob/69df1c5d9e27764daa3584d3da6b0647cda9093d/tests/conftest.py#L91-L100 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574499 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: Also tried. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574439 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: I think som of those imports are needed to make it works (and no, it was not me who implemented it, and yes I also hate it ). ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: I think some of those imports are needed to make it works (and no, it was not me who implemented it, and yes I also hate it ). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574076 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: This one works when we run `--with-db-init` so maybe you should simply just use this command @Taragolis (from our conftest.py). ``` def initial_db_init(): from flask import Flask from airflow.configuration import conf from airflow.utils import db from airflow.www.app import sync_appbuilder_roles from airflow.www.extensions.init_appbuilder import init_appbuilder db.resetdb() db.bootstrap_dagbag() # minimal app to add roles flask_app = Flask(__name__) flask_app.config["SQLALCHEMY_DATABASE_URI"] = conf.get("database", "SQL_ALCHEMY_CONN") init_appbuilder(flask_app) sync_appbuilder_roles(flask_app) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
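A rough sketch of how that suggestion could slot into the proposed fixture. The `initial_db_init` body is the one quoted above; the wiring around it (copying it into `tests/providers/conftest.py` and calling it once per provider package) is only an assumption, not what the PR currently does:
```python
import pytest

PROVIDERS_PACKAGES = set()


def initial_db_init():
    # Body quoted from the conftest.py snippet in the comment above
    from flask import Flask

    from airflow.configuration import conf
    from airflow.utils import db
    from airflow.www.app import sync_appbuilder_roles
    from airflow.www.extensions.init_appbuilder import init_appbuilder

    db.resetdb()
    db.bootstrap_dagbag()
    # minimal app to add roles
    flask_app = Flask(__name__)
    flask_app.config["SQLALCHEMY_DATABASE_URI"] = conf.get("database", "SQL_ALCHEMY_CONN")
    init_appbuilder(flask_app)
    sync_appbuilder_roles(flask_app)


@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_providers_tests(request):
    """Hypothetical variant of the PR's fixture: run the full DB init once per
    provider package instead of clearing individual tables."""
    provider_name = get_test_provider_name(request.module)  # helper defined in the PR's conftest
    if provider_name and provider_name not in PROVIDERS_PACKAGES:
        PROVIDERS_PACKAGES.add(provider_name)
        initial_db_init()
```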
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058569528 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: I think this is because of some import sequence @ephraimbuddy - does it ring a bell ?. It works for sure in 'airflow db init` command, so we should be able to implement it here as well. I think we **must** sort it out, because otherwise it's not future-proof - in case we add new stuff. I can help sorting it out when we solve the other comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058572636 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > But the same kind of problem happened on multiple occasions for all kinds of tests, not only between providers. The mechanism is the same - data created in one test (in unrelated package) might impact tests from another package. So I think if we attempt to do such isolation, it should be done across the board. Then I suggest doing it in a different way, because such a side effect outside of providers could only happen when two PRs are merged and together create this situation: 1. Migrate all tests to pytest (there are fewer than 100 modules remaining) 2. Create a separate pytest plugin with fixtures/context managers 3. Migrate all current tests to use these fixtures; most of them already exist in [conftest.py](https://github.com/apache/airflow/blob/main/tests/conftest.py) 4. Restrict manually creating DAGs, TIs, etc. and allow creating them only through fixtures/context managers (the trickier part) 5. All core functionality should have setup and teardown that clean up (most of it already does) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
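A minimal illustration of what the fixtures/context managers from steps 2–4 of that plan might look like. The names are hypothetical; the only existing helper used is `tests.test_utils.db.clear_db_runs`, which this PR already relies on:
```python
from __future__ import annotations

import contextlib

import pytest

from tests.test_utils import db  # same cleanup helpers the PR already uses


@contextlib.contextmanager
def managed_dag_runs():
    """Hypothetical context manager from such a plugin: anything the test
    creates inside the block is wiped on exit."""
    try:
        yield
    finally:
        db.clear_db_runs()


@pytest.fixture
def dag_run_cleanup():
    """Fixture flavour of the same idea, for tests that prefer injection."""
    with managed_dag_runs():
        yield
```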
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058571297 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: See the videos above (that's the native env).. It is noticeable, a lot. I hate having to wait even 0.5 s to run single test with Ctrl+R. Plus some people have auto-rerrunable tests (there are plugins that run tests automatically after saving). We cannot make that experience so much worse. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058571297 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: See the videos above (that's the native env).. It is noticeable, a lot. I hate having to wait even 0.5 to run single test with Ctrl+R. Plus some people have auto-rerrunable tests (there are plugins that run tests automatically after saving). We cannot make that experience so much worse. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058569528 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: I think this is because of some import sequence @ephraimbuddy - does it ring a bell ?. It works for I think we **must** sort it out, because otherwise it's not future-proof - in case we add new stuff. I can help sorting it out when we solve the other comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058568341 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: The 1-second difference with this approach shows up only in Breeze on macOS; in a native environment the difference is even smaller: ```console 1 passed in 0.21s 1 passed in 0.23s 1 passed in 0.24s ``` ```console 1 passed in 0.03s 1 passed in 0.03s 1 passed in 0.03s ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a diff in pull request #28439: CustomTTYColoredFormatter should inherit TimezoneAware formatter
dstandish commented on code in PR #28439: URL: https://github.com/apache/airflow/pull/28439#discussion_r1058564087 ## airflow/utils/log/colored_log.py: ## @@ -38,7 +40,7 @@ BOLD_OFF = esc("22") -class CustomTTYColoredFormatter(TTYColoredFormatter): +class CustomTTYColoredFormatter(TTYColoredFormatter, TimezoneAware): Review Comment: oh forgot to add... sec -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a diff in pull request #28439: CustomTTYColoredFormatter should inherit TimezoneAware formatter
dstandish commented on code in PR #28439: URL: https://github.com/apache/airflow/pull/28439#discussion_r1058563809 ## airflow/utils/log/colored_log.py: ## @@ -38,7 +40,7 @@ BOLD_OFF = esc("22") -class CustomTTYColoredFormatter(TTYColoredFormatter): +class CustomTTYColoredFormatter(TTYColoredFormatter, TimezoneAware): Review Comment: done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
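For context on why the second base class matters in this change, a generic sketch (deliberately not Airflow's real formatter classes) of how Python resolves methods when a mixin is added to the bases:
```python
# Generic illustration only: the class and method names below are invented.
class ColoredFormatter:
    def format(self, record: str) -> str:
        return f"\x1b[32m{record}\x1b[0m"  # pretend colour handling


class TimezoneAwareFormatter:
    def format_time(self, record: str) -> str:
        return f"{record} (timezone-aware timestamp)"  # pretend tz handling


class CustomFormatter(ColoredFormatter, TimezoneAwareFormatter):
    """Picks up colour handling from the first base and the timezone-aware
    time formatting from the second; for names defined on both, the leftmost
    base in the MRO wins."""


f = CustomFormatter()
print(CustomFormatter.__mro__)      # ColoredFormatter comes before TimezoneAwareFormatter
print(f.format("hello"))            # from ColoredFormatter
print(f.format_time("hello"))       # from TimezoneAwareFormatter
```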
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058561504 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > So one second per run it not a big deal and if developer run all tests in single module or just one test class, it would be even unnoticeable. One second is a lot for tests that do not require the DB at all - especially if you run them from IDE with re-running single tests after the whole suite is run, The difference is - immediate green/red vs. waiting a second. I often forget to remove `--with-db-init` when I add it and this happens only for one run - the difference is really noticeable so I remove it as soon as I realise I left it. Compare those two: With db init: https://user-images.githubusercontent.com/595491/209865511-d2815c9f-0747-4445-adbe-0318705fbc1d.mov Without: https://user-images.githubusercontent.com/595491/209865517-0931689a-3491-43ec-824d-8db14100a775.mov > All others tests run all together, so if one tests from Core affect another then most probably we notice this in Core section Correct. But the same kind of problem happened on multiple occasions for all kinds of tests, not only between providers. The mechanism is the same - data created in one test (in unrelated package) might impact tests from another package. So I think if we attempt to do such isolation, it should be done across the board. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058561504 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > So one second per run it not a big deal and if developer run all tests in single module or just one test class, it would be even unnoticeable. One second is a lot for tests that do not require the DB at all - especially if you run them from IDE with re-running single tests after the whole suite is run, The difference is - immediate green/red vs. waiting a second. I often forget to remove `--with-db-init` when I add it and this happens only for one run - the difference is really noticeable so I remove it as soon as I realise I left it. Compare those two: With db reset: https://user-images.githubusercontent.com/595491/209865511-d2815c9f-0747-4445-adbe-0318705fbc1d.mov Without: https://user-images.githubusercontent.com/595491/209865517-0931689a-3491-43ec-824d-8db14100a775.mov > All others tests run all together, so if one tests from Core affect another then most probably we notice this in Core section Correct. But the same kind of problem happened on multiple occasions for all kinds of tests, not only between providers. The mechanism is the same - data created in one test (in unrelated package) might impact tests from another package. So I think if we attempt to do such isolation, it should be done across the board. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch constraints-main updated: Updating constraints. Build id:
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-main by this push: new 6d85c982c7 Updating constraints. Build id: 6d85c982c7 is described below commit 6d85c982c7cd3a1ef4f9f463e3d48023d8c50089 Author: Automated GitHub Actions commit AuthorDate: Wed Dec 28 20:01:34 2022 + Updating constraints. Build id: This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of '' in '' with commit sha . All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.10.txt | 4 ++-- constraints-3.7.txt | 4 ++-- constraints-3.8.txt | 4 ++-- constraints-3.9.txt | 4 ++-- constraints-no-providers-3.10.txt | 4 ++-- constraints-no-providers-3.7.txt | 4 ++-- constraints-no-providers-3.8.txt | 4 ++-- constraints-no-providers-3.9.txt | 4 ++-- constraints-source-providers-3.10.txt | 6 +++--- constraints-source-providers-3.7.txt | 6 +++--- constraints-source-providers-3.8.txt | 6 +++--- constraints-source-providers-3.9.txt | 6 +++--- 12 files changed, 28 insertions(+), 28 deletions(-) diff --git a/constraints-3.10.txt b/constraints-3.10.txt index dfe02ecc3f..c091006864 100644 --- a/constraints-3.10.txt +++ b/constraints-3.10.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T17:49:32Z +# This constraints file was automatically generated on 2022-12-28T20:00:53Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -423,7 +423,7 @@ pinotdb==0.4.12 pipdeptree==2.3.3 pipx==1.1.0 pkginfo==1.9.2 -platformdirs==2.6.1 +platformdirs==2.6.2 pluggy==1.0.0 ply==3.11 plyvel==1.5.0 diff --git a/constraints-3.7.txt b/constraints-3.7.txt index a0d3f9db8f..98b9ea54da 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T17:50:04Z +# This constraints file was automatically generated on 2022-12-28T20:01:31Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -425,7 +425,7 @@ pipdeptree==2.3.3 pipx==1.1.0 pkginfo==1.9.2 pkgutil_resolve_name==1.3.10 -platformdirs==2.6.1 +platformdirs==2.6.2 pluggy==1.0.0 ply==3.11 plyvel==1.5.0 diff --git a/constraints-3.8.txt b/constraints-3.8.txt index 30688906b2..717b16d20a 100644 --- a/constraints-3.8.txt +++ b/constraints-3.8.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T17:49:55Z +# This constraints file was automatically generated on 2022-12-28T20:01:21Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -426,7 +426,7 @@ pipdeptree==2.3.3 pipx==1.1.0 pkginfo==1.9.2 pkgutil_resolve_name==1.3.10 -platformdirs==2.6.1 +platformdirs==2.6.2 pluggy==1.0.0 ply==3.11 plyvel==1.5.0 diff --git a/constraints-3.9.txt b/constraints-3.9.txt index bb81259e17..4ac0d98675 100644 --- a/constraints-3.9.txt +++ b/constraints-3.9.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T17:49:54Z +# This constraints file was automatically generated on 2022-12-28T20:01:21Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -423,7 +423,7 @@ pinotdb==0.4.12 pipdeptree==2.3.3 pipx==1.1.0 pkginfo==1.9.2 -platformdirs==2.6.1 +platformdirs==2.6.2 pluggy==1.0.0 ply==3.11 plyvel==1.5.0 diff --git a/constraints-no-providers-3.10.txt b/constraints-no-providers-3.10.txt index 1ece205362..a0e7244016 100644 --- a/constraints-no-providers-3.10.txt +++ b/constraints-no-providers-3.10.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-12-28T17:46:31Z +# This constraints file was automatically generated on 2022-12-28T19:57:59Z # via "eager-upgrade" mechanism of PIP. For the "main" bra
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058561504 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > So one second per run it not a big deal and if developer run all tests in single module or just one test class, it would be even unnoticeable. One second is a lot for tests that do not require the DB at all - especially if you run them from IDE with re-running single tests after the whole suite is run, The difference is - immediate green/red vs. waiting a second. I often forget to remove `--with-db-init` when I add it and this happens only for one run - the difference is really noticeable. Compare those two: With db reset: https://user-images.githubusercontent.com/595491/209865511-d2815c9f-0747-4445-adbe-0318705fbc1d.mov Without: https://user-images.githubusercontent.com/595491/209865517-0931689a-3491-43ec-824d-8db14100a775.mov > All others tests run all together, so if one tests from Core affect another then most probably we notice this in Core section Correct. But the same kind of problem happened on multiple occasions for all kinds of tests, not only between providers. The mechanism is the same - data created in one test (in unrelated package) might impact tests from another package. So I think if we attempt to do such isolation, it should be done across the board. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058557413 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: That what I tried initially > Use separate clear tests helpers instead of airflow.utils.db.resetdb(). locally cause some errors with duplicate constraints. 
Long error traceback under the spoiler ``` console ❯ breeze --python 3.7 --backend postgres shell --db-reset root@ba342530c0bc:/opt/airflow# pytest tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated == test session starts == platform linux -- Python 3.7.16, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/local/bin/python cachedir: .pytest_cache rootdir: /opt/airflow, configfile: pytest.ini plugins: cov-4.0.0, asyncio-0.20.3, rerunfailures-9.1.1, instafail-0.4.2, anyio-3.6.2, timeouts-1.2.1, xdist-3.1.0, requests-mock-1.10.0, capture-warnings-0.0.4, httpx-0.21.2, time-machine-2.8.2 asyncio: mode=strict setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s collected 2 items tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute ERROR [ 50%] tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated ERROR [100%] ERRORS = ___ ERROR at setup of TestDockerOperator.test_execute ___ self = , dialect = constructor = > statement = 'CREATE UNIQUE INDEX idx_ab_user_username ON ab_user (lower(username))', parameters = {}, execution_options = immutabledict({'autocommit': True}) args = (,), kw = {}, branched = , yp = None conn = , context = cursor = , evt_handled = False def _execute_context( self, dialect, constructor, statement, parameters, execution_options, *args, **kw ): """Create an :class:`.ExecutionContext` and execute, returning a :class:`_engine.CursorResult`.""" branched = self if self.__branch_from: # if this is a "branched" connection, do everything in terms # of the "root" connecti
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058561504 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > So one second per run it not a big deal and if developer run all tests in single module or just one test class, it would be even unnoticeable. One second is a lot for tests that do not require the DB at all - especially if you run them from IDE with re-running single tests after the whole suite is run, The difference is - immediate green/red vs. waiting a second. I often forget to remove `--with-db-init` when I add it and this happens only for one run - the difference is really noticeable. Compare those two: With db reset: https://user-images.githubusercontent.com/595491/209865511-d2815c9f-0747-4445-adbe-0318705fbc1d.mov Without: https://user-images.githubusercontent.com/595491/209865517-0931689a-3491-43ec-824d-8db14100a775.mov > All others tests run all together, so if one tests from Core affect another then most probably we notice this in Core section But the same kind of problem happened on multiple occasions for all kinds of tests, not only between providers. The mechanism is the same - data created in one test (in unrelated package) might impact tests from another package. So I think if we attempt to do such isolation, it should be done across the board. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058557413 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: That what I tried initially > Use separate clear tests helpers instead of airflow.utils.db.resetdb(). locally cause some errors with duplicate constraints. 
``` console ❯ breeze --python 3.7 --backend postgres shell --db-reset root@ba342530c0bc:/opt/airflow# pytest tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated == test session starts == platform linux -- Python 3.7.16, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/local/bin/python cachedir: .pytest_cache rootdir: /opt/airflow, configfile: pytest.ini plugins: cov-4.0.0, asyncio-0.20.3, rerunfailures-9.1.1, instafail-0.4.2, anyio-3.6.2, timeouts-1.2.1, xdist-3.1.0, requests-mock-1.10.0, capture-warnings-0.0.4, httpx-0.21.2, time-machine-2.8.2 asyncio: mode=strict setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s collected 2 items tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute ERROR [ 50%] tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated ERROR [100%] ERRORS = ___ ERROR at setup of TestDockerOperator.test_execute ___ self = , dialect = constructor = > statement = 'CREATE UNIQUE INDEX idx_ab_user_username ON ab_user (lower(username))', parameters = {}, execution_options = immutabledict({'autocommit': True}) args = (,), kw = {}, branched = , yp = None conn = , context = cursor = , evt_handled = False def _execute_context( self, dialect, constructor, statement, parameters, execution_options, *args, **kw ): """Create an :class:`.ExecutionContext` and execute, returning a :class:`_engine.CursorResult`.""" branched = self if self.__branch_from: # if this is a "branched" connection, do everything in terms # of the "root" connection, *except* for .close(), which is #
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058554942 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: > but the way it is implemented now, it will also run the auto-fixture even if you run a single unit test. This will dramatically slow-down any iterations you do I am afraid (and it does not really matter if we do it only for Providers or for whole airflow). A lot of tests right now clear some part of the DB between runs, so I can't say it dramatically affects a single test. Single test with fixture (3 runs) on Postgres in Breeze: ```console 1 passed in 1.95s 1 passed in 1.96s 1 passed in 2.00s ``` Single test without fixture (3 runs) on Postgres in Breeze: ```console 1 passed in 1.02s 1 passed in 0.98s 1 passed in 0.99s ``` So one second per run is not a big deal, and if a developer runs all tests in a single module or just one test class, it would be even less noticeable. > Don't we want to do it for ALL packages that are "leaves" - not only for providers (including all airflow packages)? The initial problem happened because different provider packages (e.g. SSH and Oracle) can run separately from each other, so if a developer gets something wrong we would only find it in main. All other tests run all together, so if one test from Core affects another, we would most probably notice it in the Core section. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk opened a new pull request, #28633: Variables set in variables.env are automatically exported
potiuk opened a new pull request, #28633: URL: https://github.com/apache/airflow/pull/28633 The variables set in variables.env were not automatically exported. --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058533615 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: Hm. I think we can do it with simply `db.resetdb()` (from airflow.utils) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
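In fixture form, that suggestion would amount to roughly the sketch below (an assumption only — as other comments in this thread note, a bare `resetdb()` hit duplicate-constraint errors when tried):
```python
import pytest

from airflow.utils import db as airflow_db  # provides resetdb(), as suggested above

PROVIDERS_PACKAGES = set()


@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_providers_tests(request):
    """Variant of the PR's fixture that resets the whole DB once per provider
    package instead of clearing selected tables."""
    provider_name = get_test_provider_name(request.module)  # helper defined in the PR's conftest
    if provider_name and provider_name not in PROVIDERS_PACKAGES:
        PROVIDERS_PACKAGES.add(provider_name)
        airflow_db.resetdb()
```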
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058533615 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: Hm. I thinkwe can do it with simply `db.resetdb()` ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { +"alibaba", +"amazon", +"apache", +"atlassian", +"common", +"dbt", +"facebook", +"google", +"microsoft", +} +PROVIDERS_PACKAGES = set() + + +def get_test_provider_name(m): +"""Extract provider name from module full qualname.""" +_, _, name = m.__name__.partition("providers.") +for inner_provider in INNER_PROVIDERS: +if name.startswith(inner_provider): +return ".".join(name.split(".", 2)[:2]) +return name.split(".", 1)[0] + + +@pytest.fixture(scope="module", autouse=True) +def _clear_db_between_providers_tests(request): +"""Clear DB between each separate provider package test runs.""" +provider_name = get_test_provider_name(request.module) +if provider_name and provider_name not in PROVIDERS_PACKAGES: +PROVIDERS_PACKAGES.add(provider_name) +db.clear_db_runs() Review Comment: Hm. I think we can do it with simply `db.resetdb()` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: There is also one side-effect that we should take into account here. This is fine when we run the tests in CI, but the way it is implemented now, it will also run the auto-fixture even if you run a single unit test. This will dramatically slow-down any iterations you do I am afraid (and it does not really matter if we do it only for Providers or for whole airflow). I think this can be solved by only doing this when we run: 1) for the first time ever (so some kind of check if db is initialized) 2) or always when we run in CI (`CI` env variable set to "true"). The 1) would be nice for first-time users - we badly miss it now, the user needs to know that they have to run the tests with `--with-db-init` flag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: There is also one side-effect that we should take into account here. This is fine when we run the tests in CI, but the way it is implemented now, it will also run the auto-fixture even if you run a single unit test. This will dramatically slow-down any iterations you do I am afraid (and it does not really matter if we do it only for Providers or for whole airflow). I think this can be solved by only doing this when we run: 1) for the first time ever (so some kind of check if db is initialized) 2) or when we run in CI (`CI` env variable set to "true"). The 1) would be nice for first-time users - we badly miss it now, the user needs to know that they have to run the tests with `--with-db-init` flag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: There is also one side-effect that we should take into account here. This is fine when we run the tests in CI, but the way it is implemented now, it will also run the auto-fixture even if you run a single unit test. This will dramatically slow-down any iterations you do I am afraid (and it does not really matter if we do it only for Providers or for whole airflow). I think this can be solved by only doing this when we run for: 1) the first time at all (so some kind of check if db is initialized) 2) or when we run in CI (`CI` env variable set to "true"). The 1) would be nice for first-time users - we badly miss it now, the user needs to know that they have to run the tests with `--with-db-init` flag. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests
Taragolis commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058532316 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: I think we could. We just need to scan the directories until we find one of the folders: `hooks`, `operators`, `sensors`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
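A rough sketch of the directory-scan idea Taragolis suggests, assuming the `tests/providers` layout and treating the `hooks`/`operators`/`sensors` folder names as markers; none of this is code from the PR:

```python
# Rough illustration of the directory-scan approach; the tests/providers path and
# the traversal below are assumptions for the example, not code from the PR.
from __future__ import annotations

from pathlib import Path

PACKAGE_MARKERS = {"hooks", "operators", "sensors"}


def find_provider_roots(providers_dir: Path) -> list[Path]:
    """Treat any directory containing a hooks/, operators/ or sensors/ folder as a provider root."""
    return [
        candidate
        for candidate in sorted(p for p in providers_dir.rglob("*") if p.is_dir())
        if any((candidate / marker).is_dir() for marker in PACKAGE_MARKERS)
    ]


if __name__ == "__main__":
    for root in find_provider_roots(Path("tests/providers")):
        print(root)
```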
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058530854 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: Another comment here, @Taragolis. Maybe we do not need that at all. Don't we want to do it for ALL packages that are "leaves" - not only for providers (including all airflow packages)? I think that would be a nice compromise between isolation and speed of execution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
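The "leaf packages" idea could be prototyped roughly as below; the definition of a leaf (a directory with test modules but no test-bearing subdirectories) is an assumption made for the example, not something decided in the PR:

```python
# Illustrative only: one way to pick "leaf" test packages (directories that hold
# test files but whose subdirectories do not), under an assumed repo layout.
from __future__ import annotations

from pathlib import Path


def leaf_test_packages(root: Path) -> list[Path]:
    """Return directories containing test modules with no test-bearing subdirectories."""
    leaves: list[Path] = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_tests = any(directory.glob("test_*.py"))
        subdirs_have_tests = any(
            child.is_dir() and any(child.rglob("test_*.py")) for child in directory.iterdir()
        )
        if has_tests and not subdirs_have_tests:
            leaves.append(directory)
    return leaves


if __name__ == "__main__":
    for leaf in leaf_test_packages(Path("tests")):
        print(leaf)
```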
[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests
potiuk commented on code in PR #28631: URL: https://github.com/apache/airflow/pull/28631#discussion_r1058530302 ## tests/providers/conftest.py: ## @@ -0,0 +1,75 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from __future__ import annotations + +import pytest + +from tests.test_utils import db + +# Providers with subpackages +INNER_PROVIDERS = { Review Comment: We can check for the presence of `provider.yaml`. Or better - rely on `generated/provider_dependencies.json` instead. The `provider_dependencies.json` file is guaranteed to be updated by pre-commit, it is far easier to use than scanning folders, and it is intended to remain stable when we refactor and move the providers. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
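As a sketch of relying on the generated file, the snippet below reads provider ids from `generated/provider_dependencies.json` and maps them to test package paths; the assumed JSON layout (provider ids as top-level keys) and the path arithmetic would need to be checked against the real repository:

```python
# Sketch of enumerating providers from the generated file instead of scanning folders.
# The JSON layout assumed here (provider ids as top-level keys) and the relative
# path arithmetic are assumptions to be verified against the actual repository.
from __future__ import annotations

import json
from pathlib import Path

# Assumes this file lives two levels below the repo root, e.g. tests/providers/conftest.py.
AIRFLOW_ROOT = Path(__file__).resolve().parents[2]
DEPENDENCIES_FILE = AIRFLOW_ROOT / "generated" / "provider_dependencies.json"


def provider_test_packages() -> list[str]:
    """Map each provider id (e.g. 'apache.beam') to its tests sub-package path."""
    provider_ids = json.loads(DEPENDENCIES_FILE.read_text())
    return [f"tests/providers/{provider_id.replace('.', '/')}" for provider_id in sorted(provider_ids)]


if __name__ == "__main__":
    print("\n".join(provider_test_packages()))
```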
[airflow] branch main updated (8a23bf47a4 -> 69df1c5d9e)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from 8a23bf47a4 Add note for pymssql requirement (#28625) add 69df1c5d9e Speed up provider validation pre-commit (#28541) No new revisions were added by this update. Summary of changes: .pre-commit-config.yaml | 5 ++--- scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py | 6 +- scripts/in_container/run_provider_yaml_files_check.py | 5 ++--- 3 files changed, 9 insertions(+), 7 deletions(-)
[GitHub] [airflow] potiuk merged pull request #28541: Speed up provider validation pre-commit
potiuk merged PR #28541: URL: https://github.com/apache/airflow/pull/28541 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #28280: Inconsistent handling of newlines in tasks
potiuk commented on issue #28280: URL: https://github.com/apache/airflow/issues/28280#issuecomment-1366846230 Yep. Let's keep it as the kind of issue that we can send anyone to if they stumble upon a similar case :). We can even start counting. Number of people experiencing the less-than-stellar handling of newlines in both the browser and Jinja: :one: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org