[GitHub] [airflow] stamixthereal commented on pull request #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


stamixthereal commented on PR #28634:
URL: https://github.com/apache/airflow/pull/28634#issuecomment-1367133320

   Thanks for your review, @uranusjr; I fixed it!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] stamixthereal commented on a diff in pull request #28619: Fix code readability, add docstrings to json_client

2022-12-28 Thread GitBox


stamixthereal commented on code in PR #28619:
URL: https://github.com/apache/airflow/pull/28619#discussion_r1058786146


##
airflow/api/client/json_client.py:
##
@@ -24,9 +24,25 @@
 
 
 class Client(api_client.Client):
-"""Json API client implementation."""
+"""Json API client implementation.
 
-def _request(self, url, method="GET", json=None):
+This client is used to interact with a Json API server and perform various actions
+such as triggering DAG runs, deleting DAGs, interacting with pools, and getting lineage information.
+
+:param _api_base_url: The base URL for the Json API server.
+:param _session: A session object to use for making HTTP requests.
+"""
+
+def _request(self, url: str, json=None, method: str = "GET"):
+"""Make a request to the Json API server.
+
+:param url: The URL to send the request to.
+:param method: The HTTP method to use (e.g. "GET", "POST", "DELETE").
+:param json: A dictionary containing JSON data to send in the request body.
+:return: A dictionary containing the JSON response from the server.
+:rtype: dict

Review Comment:
   I've double-checked different scenarios and left the `_request` return type as `dict`, because that seems to be the only case in which it works; the `.json()` method also usually returns parsed data as a `dict`.
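A quick stdlib sketch of the point being made here (illustrative only, not the Airflow implementation): `requests`' `.json()` is a thin wrapper over `json.loads`, which yields a `dict` only when the server responds with a top-level JSON object.

```python
import json

# What Response.json() does under the hood: parse the body with json.loads.
payload = '{"dag_id": "example", "state": "queued"}'
parsed = json.loads(payload)
assert isinstance(parsed, dict)  # a JSON object parses to a dict

# A top-level JSON array parses to a list instead, so annotating
# _request as returning dict assumes the server always sends an object.
assert isinstance(json.loads('[1, 2, 3]'), list)
```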






[GitHub] [airflow] uranusjr commented on pull request #28020: Feature/#28015 default args

2022-12-28 Thread GitBox


uranusjr commented on PR #28020:
URL: https://github.com/apache/airflow/pull/28020#issuecomment-1367091823

   I don’t think we want to implement this. DAG arguments and default args are 
different things and this introduces complexity that is unnecessary at best.





[GitHub] [airflow] uranusjr commented on a diff in pull request #28627: Change Architecture class into Enum

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28627:
URL: https://github.com/apache/airflow/pull/28627#discussion_r1058744792


##
airflow/cli/commands/info_command.py:
##
@@ -155,12 +156,16 @@ class Architecture:
 ARM = "arm"
 
 @staticmethod
-def get_current():
+def get_current() -> Architecture:
 """Get architecture."""
-return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+current_architecture = _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+if current_architecture:
+return current_architecture
+else:
+raise RuntimeError(f"Unknown architecture for machine: {platform.machine()}")

Review Comment:
   Some `get_current` usages have built-in handling of the `None` case here; do we want to continue to allow that?
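A hedged sketch of how both behaviours could coexist once `Architecture` is an `Enum`: a strict `get_current` that raises, plus a hypothetical `try_get_current` that keeps the old `None`-returning contract. The mapping below is illustrative, not the actual `_MACHINE_TO_ARCHITECTURE` table.

```python
from __future__ import annotations

import platform
from enum import Enum

# Illustrative subset; the real _MACHINE_TO_ARCHITECTURE table is larger.
_MACHINE_TO_ARCHITECTURE = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm",
    "aarch64": "arm",
}


class Architecture(Enum):
    X86_64 = "x86_64"
    ARM = "arm"

    @staticmethod
    def get_current() -> Architecture:
        """Strict variant: raise on unknown machines (the PR's behaviour)."""
        machine = platform.machine().lower()
        try:
            return Architecture(_MACHINE_TO_ARCHITECTURE[machine])
        except KeyError:
            raise RuntimeError(f"Unknown architecture for machine: {machine}")

    @staticmethod
    def try_get_current() -> Architecture | None:
        """Lenient variant: keep the old None-returning contract for callers
        that already handle the unknown-machine case themselves."""
        value = _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
        return Architecture(value) if value is not None else None
```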






[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28634:
URL: https://github.com/apache/airflow/pull/28634#discussion_r1058743022


##
airflow/utils/email.py:
##
@@ -267,8 +306,13 @@ def _get_smtp_connection(host: str, port: int, timeout: int, with_ssl: bool) ->
 
 
 def _get_email_list_from_str(addresses: str) -> list[str]:
-delimiters = [",", ";"]
-for delimiter in delimiters:
-if delimiter in addresses:
-return [address.strip() for address in addresses.split(delimiter)]
-return [addresses]
+"""
+Extract a list of email addresses from a string. The string
+can contain multiple email addresses separated by
+any of the following delimiters: ',' or ';'.
+
+:param addresses: A string containing one or more email addresses.
+:return: A list of email addresses.
+"""
+pattern = r"[,;]\s*"
+return [address.strip() for address in re.split(pattern, addresses)]

Review Comment:
   Since we do `strip` afterwards, the `\s*` part in the pattern does not seem 
to be needed? (Wildcards incur a minor performance penalty.)
   
   An alternative is to do `\s*` at both sides and remove the extra `strip`.
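The variants under discussion behave identically on well-formed input; a small illustrative comparison (the sample address string is made up):

```python
import re

addresses = "a@example.com, b@example.com ;c@example.com"

# Variant in the PR: delimiter plus trailing whitespace, then strip.
v1 = [a.strip() for a in re.split(r"[,;]\s*", addresses)]

# Reviewer's first option: delimiter only; strip absorbs the whitespace.
v2 = [a.strip() for a in re.split(r"[,;]", addresses)]

# Reviewer's alternative: whitespace on both sides, no strip needed
# (assuming the whole string has no leading/trailing whitespace).
v3 = re.split(r"\s*[,;]\s*", addresses)

assert v1 == v2 == v3 == ["a@example.com", "b@example.com", "c@example.com"]
```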






[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28634:
URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742434


##
airflow/utils/email.py:
##
@@ -47,8 +48,26 @@ def send_email(
 conn_id: str | None = None,
 custom_headers: dict[str, Any] | None = None,
 **kwargs,
-):
-"""Send email using backend specified in EMAIL_BACKEND."""
+) -> None:
+"""
+Send an email using the backend specified in the `EMAIL_BACKEND` configuration option.
+
+:param to: A list or iterable of email addresses to send the email to.
+:param subject: The subject of the email.
+:param html_content: The content of the email in HTML format.
+:param files: A list of paths to files to attach to the email.
+:param dryrun: If `True`, the email will not actually be sent. Default: `False`.
+:param cc: A string or iterable of strings containing email addresses to send a copy of the email to.
+:param bcc: A string or iterable of strings containing email addresses to send a
+blind carbon copy of the email to.
+:param mime_subtype: The subtype of the MIME message. Default: "mixed".
+:param mime_charset: The charset of the email. Default: "utf-8".
+:param conn_id: The connection ID to use for the backend. If not provided, the default connection
+specified in the `EMAIL_CONN_ID` configuration option will be used.

Review Comment:
   ```suggestion
   specified in the ``EMAIL_CONN_ID`` configuration option will be used.
   ```






[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28634:
URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742360


##
airflow/utils/email.py:
##
@@ -47,8 +48,26 @@ def send_email(
 conn_id: str | None = None,
 custom_headers: dict[str, Any] | None = None,
 **kwargs,
-):
-"""Send email using backend specified in EMAIL_BACKEND."""
+) -> None:
+"""
+Send an email using the backend specified in the `EMAIL_BACKEND` configuration option.
+
+:param to: A list or iterable of email addresses to send the email to.
+:param subject: The subject of the email.
+:param html_content: The content of the email in HTML format.
+:param files: A list of paths to files to attach to the email.
+:param dryrun: If `True`, the email will not actually be sent. Default: `False`.

Review Comment:
   ```suggestion
   :param dryrun: If `True`, the email will not actually be sent. Default: `False`.
   ```
   
   Either using double backticks or star (italic, matching the style used by 
the Python stdlib documentation)






[GitHub] [airflow] uranusjr commented on a diff in pull request #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28634:
URL: https://github.com/apache/airflow/pull/28634#discussion_r1058742159


##
airflow/utils/email.py:
##
@@ -47,8 +48,26 @@ def send_email(
 conn_id: str | None = None,
 custom_headers: dict[str, Any] | None = None,
 **kwargs,
-):
-"""Send email using backend specified in EMAIL_BACKEND."""
+) -> None:
+"""
+Send an email using the backend specified in the `EMAIL_BACKEND` configuration option.

Review Comment:
   ```suggestion
   Send an email using the backend specified in the ``EMAIL_BACKEND`` configuration option.
   ```






[GitHub] [airflow] uranusjr commented on a diff in pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28635:
URL: https://github.com/apache/airflow/pull/28635#discussion_r1058741250


##
airflow/providers/common/sql/operators/sql.py:
##
@@ -198,7 +198,7 @@ class SQLExecuteQueryOperator(BaseSQLOperator):
 :param autocommit: (optional) if True, each command is automatically committed (default: False).
 :param parameters: (optional) the parameters to render the SQL query with.
 :param handler: (optional) the function that will be applied to the cursor (default: fetch_all_handler).
-:param split_statements: (optional) if split single SQL string into statements (default: False).
+:param split_statements: (optional) if split single SQL string into statements (default: see hook.run).

Review Comment:
   What does *see hook.run* mean? Perhaps we should explain a bit more here.






[GitHub] [airflow] uranusjr commented on a diff in pull request #28619: Fix code readability, add docstrings to json_client

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28619:
URL: https://github.com/apache/airflow/pull/28619#discussion_r1058737216


##
airflow/api/client/json_client.py:
##
@@ -24,9 +24,24 @@
 
 
 class Client(api_client.Client):
-"""Json API client implementation."""
+"""Json API client implementation.
 
-def _request(self, url, method="GET", json=None):
+This client is used to interact with a Json API server and perform various actions
+such as triggering DAG runs, deleting DAGs, interacting with pools, and getting lineage information.
+
+:param _api_base_url: The base URL for the Json API server.
+:param _session: A session object to use for making HTTP requests.

Review Comment:
   ```suggestion
   :param api_base_url: The base URL for the Json API server.
   :param session: A session object to use for making HTTP requests.
   ```
   
   As far as I can tell, the parameters don’t have an underscore prefix?






[GitHub] [airflow] uranusjr commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


uranusjr commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058731243


##
tests/providers/conftest.py:
##
@@ -0,0 +1,89 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from functools import lru_cache
+from pathlib import Path
+
+import pytest
+
+from tests.test_utils import db
+
+_CLEAR_DB_PROVIDERS = set()
+
+
+@lru_cache(maxsize=None)
+def providers_packages():
+"""Get providers packages full qualname."""
+
+current_dir = Path(__file__).absolute().parent
+providers = set()
+for root in current_dir.iterdir():
+if not root.is_dir():
+continue
+
+providers_dirs = set()
+for sentinel in {"hooks", "operators", "sensors"}:
+providers_dirs = providers_dirs.union({p.parent for p in root.rglob(sentinel) if p.is_dir()})
+
+if providers_dirs:
+for d in providers_dirs:
+providers.add(".".join(d.relative_to(current_dir).parts))
+else:
+providers.add(root.name)
+
+return providers
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for provider in providers_packages():
+if name.startswith(provider):
+return provider
+return None
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in _CLEAR_DB_PROVIDERS:
+_CLEAR_DB_PROVIDERS.add(provider_name)
+db.clear_db_runs()
+db.clear_db_datasets()
+db.clear_db_dags()
+db.clear_db_serialized_dags()
+db.clear_db_sla_miss()
+db.clear_db_pools()
+db.clear_db_connections()
+db.clear_db_variables()
+db.clear_db_dag_code()
+db.clear_db_callbacks()
+db.clear_rendered_ti_fields()
+db.clear_db_import_errors()
+db.clear_db_dag_warnings()
+db.clear_db_xcom()
+db.clear_db_logs()
+db.clear_db_jobs()
+db.clear_db_task_fail()
+db.clear_db_task_reschedule()
+db.clear_dag_specific_permissions()
+db.create_default_connections()
+db.set_default_pool_slots(128)
+yield

Review Comment:
   Would it be a good idea to also clean after the provider tests? I feel like it's generally a good principle for one thing to clean up one's own stuff. We could do
   
   1. Clean when entering the `providers` directory.
   2. Clean after each provider is finished.
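The suggestion relies on how pytest drives yield fixtures: code before `yield` runs at setup, code after it runs at teardown. A minimal stand-alone sketch of that shape, with a stubbed `clear_db` rather than the real `tests.test_utils.db` helpers:

```python
calls = []

def clear_db():
    """Stub for the battery of db.clear_* helpers used in the PR."""
    calls.append("clear")

def _clear_db_between_providers_tests():
    # With @pytest.fixture this generator becomes a yield fixture:
    clear_db()   # 1. clean when entering the provider's tests (setup)
    yield
    clear_db()   # 2. reviewer's suggestion: clean again afterwards (teardown)

# Drive the generator the way pytest drives a yield fixture.
fixture = _clear_db_between_providers_tests()
next(fixture)                       # setup phase runs
assert calls == ["clear"]
try:
    next(fixture)                   # resuming past yield runs teardown
except StopIteration:
    pass
assert calls == ["clear", "clear"]
```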









[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058724161


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===========================
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies,
+and extending Airflow capabilities by writing new executors, plugins, operators and providers. The
+Public Interface can be useful for building custom tools and integrations with other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Provider)
+* By using the `Stable REST API `_ (based on the OpenAPI specification)
+* By using the `Airflow Command Line Interface (CLI) `_
+
+How can you extend Apache Airflow with custom Python Code?
+==========================================================
+
+The Public Interface of Airflow consists of a number of different classes and packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built-in Airflow, each with its own unique
+   characteristics and capabilities. Executor interface was available in earlier version of Airflow but
+   only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors.
+   You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built
+   executors, and custom executors could not provide full functionality that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===========================================================
+
+Everything not mentioned in this document should be considered as non-Public Interface.

Review Comment:
   > For providers it's far easier because there is no "internal" sharing 
between multiple modules
   
   so i mean even e.g. for KPO we have a `pod_manager` module.  This should not 
be considered (ideally) user-public.  It exists only for KPO, used only in KPO. 
 It is internal helper code for KPO, just because, 

[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058722499


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===========================
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their dependencies,
+and extending Airflow capabilities by writing new executors, plugins, operators and providers. The
+Public Interface can be useful for building custom tools and integrations with other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Provider)
+* By using the `Stable REST API `_ (based on the OpenAPI specification)
+* By using the `Airflow Command Line Interface (CLI) `_
+
+How can you extend Apache Airflow with custom Python Code?
+==========================================================
+
+The Public Interface of Airflow consists of a number of different classes and packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom` which are used to access to inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement custom Custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class which provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built-in Airflow, each with its own unique
+   characteristics and capabilities. Executor interface was available in earlier version of Airflow but
+   only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors.
+   You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built
+   executors, and custom executors could not provide full functionality that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===========================================================
+
+Everything not mentioned in this document should be considered as non-Public Interface.

Review Comment:
   > you wrote yourself that you prefer to minimize backcompat surface and 
"everything that's not an operator / sensor / hook should be private".
   
   just want to clarify that, specifically with that example about "only hook / 
sensor / operator are public" i was not quite saying that's what i think it 
should be, 

[GitHub] [airflow] uranusjr commented on pull request #27710: add a new conf to wait past_deps before skipping a task

2022-12-28 Thread GitBox


uranusjr commented on PR #27710:
URL: https://github.com/apache/airflow/pull/27710#issuecomment-1367053758

   Do we still need to keep the old argument for backward compatibility?





[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2

2022-12-28 Thread GitBox


bharanidharan14 commented on code in PR #28262:
URL: https://github.com/apache/airflow/pull/28262#discussion_r1058690039


##
airflow/providers/microsoft/azure/hooks/adls_v2.py:
##
@@ -0,0 +1,307 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from typing import Any
+
+from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError
+from azure.identity import ClientSecretCredential
+from azure.storage.filedatalake import (
+DataLakeDirectoryClient,
+DataLakeFileClient,
+DataLakeServiceClient,
+DirectoryProperties,
+FileSystemClient,
+FileSystemProperties,
+)
+from airflow.hooks.base import BaseHook
+
+
+class AdlsClientHook(BaseHook):

Review Comment:
   Addressed it 






[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2

2022-12-28 Thread GitBox


bharanidharan14 commented on code in PR #28262:
URL: https://github.com/apache/airflow/pull/28262#discussion_r1058689964


##
tests/providers/microsoft/azure/hooks/test_adls_v2.py:
##
@@ -0,0 +1,108 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   Moved the test to `test_azure_data_lake` file






[GitHub] [airflow] bharanidharan14 commented on a diff in pull request #28262: Hook for managing directories and files in Azure Data Lake Storage Gen2

2022-12-28 Thread GitBox


bharanidharan14 commented on code in PR #28262:
URL: https://github.com/apache/airflow/pull/28262#discussion_r1058689769


##
airflow/providers/microsoft/azure/hooks/adls_v2.py:
##
@@ -0,0 +1,307 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from typing import Any
+
+from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError
+from azure.identity import ClientSecretCredential
+from azure.storage.filedatalake import (
+DataLakeDirectoryClient,
+DataLakeFileClient,
+DataLakeServiceClient,
+DirectoryProperties,
+FileSystemClient,
+FileSystemProperties,
+)
+from airflow.hooks.base import BaseHook
+
+
+class AdlsClientHook(BaseHook):
+"""
+This Hook interacts with an ADLS gen2 storage account; it mainly helps to create and manage
+directories and files in storage accounts that have a hierarchical namespace. It uses the Adls_v2 connection
+details to create a DataLakeServiceClient object.
+
+Since Wasb is marked as legacy and ADLS gen1 is being retired, it would be nice to
+implement an ADLS gen2 hook for interacting with the storage account.
+
+.. seealso::
+
https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python
+
+:param adls_v2_conn_id: Reference to the :ref:`adls_v2 connection 
`.

Review Comment:
   @tatiana Addressed your suggestion 
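For readers following along: the docstring under review describes building a `DataLakeServiceClient` from Adls_v2 connection details. A minimal, self-contained sketch of the lazy `get_conn` caching pattern such hooks typically use (written with a stand-in client object rather than the real `azure-storage-file-datalake` SDK; every name below, such as `AdlsStubHook`, is illustrative and not the PR's actual API):

```python
class AdlsStubHook:
    """Illustrative stand-in for an Airflow hook that lazily builds a client."""

    def __init__(self, conn_id: str):
        self.conn_id = conn_id
        self._client = None

    def get_conn(self):
        # Build the service client once and cache it; a real ADLS gen2 hook
        # would construct a DataLakeServiceClient from connection credentials.
        if self._client is None:
            self._client = {"account": self.conn_id}  # stand-in for the client
        return self._client


hook = AdlsStubHook("adls_v2_default")
assert hook.get_conn() is hook.get_conn()  # the client is created only once
```

The same shape carries over when the stand-in dict is replaced by a `DataLakeServiceClient` built from the connection's credentials.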






[GitHub] [airflow] avicol commented on pull request #28353: Fix the sql syntax in merge_data

2022-12-28 Thread GitBox


avicol commented on PR #28353:
URL: https://github.com/apache/airflow/pull/28353#issuecomment-1367014562

   > @avicol doc spellcheck 
   The doc error is from a file that I did not contribute to:
   ../../airflow/providers/cncf/kubernetes/CHANGELOG.rst:377: (didn)   and you 
didn’t use the
   Trying a rebase.





[GitHub] [airflow] hussein-awala commented on pull request #27710: add a new conf to wait past_deps before skipping a task

2022-12-28 Thread GitBox


hussein-awala commented on PR #27710:
URL: https://github.com/apache/airflow/pull/27710#issuecomment-1367003154

   @potiuk it's ready, can you check it please?





[GitHub] [airflow] github-actions[bot] closed pull request #27481: Add "user_updated_state" column in taskinstance

2022-12-28 Thread GitBox


github-actions[bot] closed pull request #27481: Add "user_updated_state" column 
in taskinstance
URL: https://github.com/apache/airflow/pull/27481





[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058656390


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===========================
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+============================================================
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not rely on a built-in set of
+   executors. You could have implemented Executors before Airflow 2.6, and a number of people succeeded in
+   doing so, but there were some hard-coded behaviours that preferred built-in executors, and custom
+   executors could not provide the full functionality that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+============================================================
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   BTW. Eventually we might be able to produce an airflow typeshed-like package: https://github.com/python/typeshed with stubs for only "externally public" APIs. This is what should be the "ultimate" goal, I think.
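The quoted document lists `BaseOperator` as the way to write custom operators. A hedged sketch of that contract (subclass, accept parameters in `__init__`, implement `execute(context)`), using a stand-in base class so the snippet runs without an Airflow installation; `HelloOperator` is an invented example, not part of this PR:

```python
class BaseOperator:
    # Stand-in for airflow.models.baseoperator.BaseOperator, kept minimal
    # so this sketch is self-contained and runnable without Airflow.
    def __init__(self, task_id: str, **kwargs):
        self.task_id = task_id


class HelloOperator(BaseOperator):
    def __init__(self, name: str, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # Airflow calls execute() with the task-instance context mapping.
        return f"Hello {self.name}!"


result = HelloOperator(task_id="hello_task", name="Airflow").execute({})
print(result)  # -> Hello Airflow!
```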




[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058655785


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===========================
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+============================================================
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way to write custom decorators.
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not rely on a built-in set of
+   executors. You could have implemented Executors before Airflow 2.6, and a number of people succeeded in
+   doing so, but there were some hard-coded behaviours that preferred built-in executors, and custom
+   executors could not provide the full functionality that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+============================================================
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   > It's also important to clarify that some methods are intentionally 
"private". E.g. in the operator, it might have helper methods. But these 
methods are more implementation detail right? If users subclass, though, they 
may use and depend on these. So what do we do? Ideally, from maintainer 
perspective, we want the 
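The comment above (truncated in this archive) raises the question of helper methods that are intentionally private. Python's usual signal is a leading underscore; a tiny illustration with invented names, not code from Airflow itself:

```python
class ExampleOperator:
    def execute(self, context):
        # Public entry point: part of the operator's advertised contract.
        return self._render_greeting(context.get("name", "world"))

    def _render_greeting(self, name):
        # Leading underscore: an implementation detail that subclasses
        # should not rely on; it may change without notice.
        return f"hello {name}"


print(ExampleOperator().execute({"name": "airflow"}))  # -> hello airflow
```

The underscore is only a convention, which is exactly the maintainability tension being discussed: subclasses can still call `_render_greeting` and then break when it changes.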


[airflow] branch constraints-main updated: Updating constraints. Build id:

2022-12-28 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new 230be22d16 Updating constraints. Build id:
230be22d16 is described below

commit 230be22d165b567d7954c88bb871c6fd7bd40ab6
Author: Automated GitHub Actions commit 
AuthorDate: Wed Dec 28 23:33:27 2022 +

Updating constraints. Build id:

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of '' in ''
with commit sha .

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for 
details.
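The diff below bumps exact pins such as `boto3==1.26.39`. As an illustration (not Airflow tooling) of what those constraint lines encode, a few lines of Python that collect `name==version` pins while skipping the generated comment header:

```python
def parse_constraints(text):
    """Collect name==version pins from a constraints-file body."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and generation-header comments
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


sample = """\
# This constraints file was automatically generated
boto3==1.26.39
botocore==1.29.39
"""
print(parse_constraints(sample))  # -> {'boto3': '1.26.39', 'botocore': '1.29.39'}
```

pip consumes these files directly, e.g. `pip install apache-airflow --constraint constraints-3.10.txt`, as described in the README linked above.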
---
 constraints-3.10.txt  |  8 
 constraints-3.7.txt   | 10 +-
 constraints-3.8.txt   | 10 +-
 constraints-3.9.txt   |  8 
 constraints-no-providers-3.10.txt |  2 +-
 constraints-no-providers-3.7.txt  |  4 ++--
 constraints-no-providers-3.8.txt  |  4 ++--
 constraints-no-providers-3.9.txt  |  2 +-
 constraints-source-providers-3.10.txt | 10 +-
 constraints-source-providers-3.7.txt  | 12 ++--
 constraints-source-providers-3.8.txt  | 12 ++--
 constraints-source-providers-3.9.txt  | 10 +-
 12 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index c091006864..aa2056dc23 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T20:00:53Z
+# This constraints file was automatically generated on 2022-12-28T23:33:09Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -173,9 +173,9 @@ billiard==3.6.4.0
 black==23.1a1
 bleach==5.0.1
 blinker==1.5
-boto3==1.26.38
+boto3==1.26.39
 boto==2.49.0
-botocore==1.29.38
+botocore==1.29.39
 bowler==0.9.0
 cachelib==0.9.0
 cachetools==4.2.2
@@ -382,7 +382,7 @@ msrestazure==0.6.4
 multi-key-dict==2.0.3
 multidict==6.0.4
 mypy-boto3-appflow==1.26.32
-mypy-boto3-rds==1.26.36
+mypy-boto3-rds==1.26.39
 mypy-boto3-redshift-data==1.26.30
 mypy-extensions==0.4.3
 mypy==0.971
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 98b9ea54da..a86f768297 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T20:01:31Z
+# This constraints file was automatically generated on 2022-12-28T23:33:24Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -173,9 +173,9 @@ billiard==3.6.4.0
 black==23.1a1
 bleach==5.0.1
 blinker==1.5
-boto3==1.26.38
+boto3==1.26.39
 boto==2.49.0
-botocore==1.29.38
+botocore==1.29.39
 bowler==0.9.0
 cached-property==1.5.2
 cachelib==0.9.0
@@ -322,7 +322,7 @@ idna==3.4
 ijson==3.1.4
 imagesize==1.4.1
 importlib-metadata==4.13.0
-importlib-resources==5.10.1
+importlib-resources==5.10.2
 incremental==22.10.0
 inflection==0.5.1
 influxdb-client==1.35.0
@@ -383,7 +383,7 @@ msrestazure==0.6.4
 multi-key-dict==2.0.3
 multidict==6.0.4
 mypy-boto3-appflow==1.26.32
-mypy-boto3-rds==1.26.36
+mypy-boto3-rds==1.26.39
 mypy-boto3-redshift-data==1.26.30
 mypy-extensions==0.4.3
 mypy==0.971
diff --git a/constraints-3.8.txt b/constraints-3.8.txt
index 717b16d20a..f390106970 100644
--- a/constraints-3.8.txt
+++ b/constraints-3.8.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T20:01:21Z
+# This constraints file was automatically generated on 2022-12-28T23:33:20Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -174,9 +174,9 @@ billiard==3.6.4.0
 black==23.1a1
 bleach==5.0.1
 blinker==1.5
-boto3==1.26.38
+boto3==1.26.39
 boto==2.49.0
-botocore==1.29.38
+botocore==1.29.39
 bowler==0.9.0
 cachelib==0.9.0
 cachetools==4.2.2
@@ -323,7 +323,7 @@ idna==3.4
 ijson==3.1.4
 imagesize==1.4.1
 importlib-metadata==5.2.0
-importlib-resources==5.10.1
+importlib-resources==5.10.2
 incremental==22.10.0
 inflection==0.5.1
 influxdb-client==1.35.0
@@ -384,7 +384,7 @@ msrestazure==0.6.4
 multi-key-dict==2.0.3
 multidict==6.0.4
 mypy-boto3-appflow==1.26.32

[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058648489


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs
+  in Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a
+  way to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write
+  custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access
+  to external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to
+  Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access
+  inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define
+  custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to
+  define custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to
+  implement custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way
+  to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which
+  provides hooks that can be implemented to respond to DAG/Task lifecycle
+  events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into
+   Airflow, each with its own unique characteristics and capabilities. The
+   Executor interface was available in earlier versions of Airflow, but only
+   as of version 2.6 are executors fully decoupled, so Airflow no longer
+   relies on a built-in set of executors. You could implement custom
+   executors before Airflow 2.6, and a number of people did so successfully,
+   but some hard-coded behaviours favoured the built-in executors, and custom
+   executors could not provide the full functionality that the built-in
+   executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   Maybe this is misunderstanding - what I wrote above was not about what 
provider packages "provides" - but what they "consume". This  document indeed 
does not specify anything about what providers provide - it's only what Airflow 
"produces" and providers "consume" as users of that. 
   
   The Community Providers have 
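The custom-operator pattern from the class list quoted above can be sketched as follows. To keep the example self-contained, BaseOperator here is a minimal stand-in (Airflow itself is not imported); a real operator would subclass airflow.models.baseoperator.BaseOperator instead, but the execute(self, context) contract shown is the same, and by default its return value is pushed to XCom.

```python
# Stand-in for airflow.models.baseoperator.BaseOperator so the sketch runs
# without Airflow installed. Only the pieces needed for the example are shown.
class BaseOperator:
    def __init__(self, *, task_id):
        self.task_id = task_id

    def execute(self, context):
        raise NotImplementedError


class HelloOperator(BaseOperator):
    """Custom operators override execute(); the return value goes to XCom."""

    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # context would normally carry runtime info (logical date, ti, etc.)
        return f"Hello {self.name}!"


if __name__ == "__main__":
    op = HelloOperator(task_id="hello_task", name="Airflow")
    print(op.execute(context={}))  # Hello Airflow!
```

The HelloOperator name and its `name` parameter are hypothetical, chosen only to illustrate the subclassing pattern.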


[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058643196



Review Comment:
   One problem with public / private is... from tool perspective ... even using 
_my_func across modules is forbidden.   But there is a difference between the 
internal "private" and the user-facing "private".  We need to be able to write 
helper code that can be used across modules (internal public), but not have it 
be 
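The two conventions discussed in this thread can be sketched in plain Python (the QueryRunner and split_statements names are hypothetical, not Airflow APIs): a leading underscore marks a name as private by convention, while Sphinx's ``:meta private:`` docstring tag hides an "internal public" helper - one that must stay importable across modules - from the generated API docs.

```python
# Hypothetical helper illustrating the naming conventions under discussion.
class QueryRunner:
    def run(self, sql):
        """Public API: execute each statement and collect the results."""
        return [self._execute_one(s) for s in split_statements(sql)]

    def _execute_one(self, statement):
        # Private by underscore convention: pure implementation detail,
        # subclasses should not rely on it.
        return f"ran: {statement}"


def split_statements(sql):
    """Split a SQL string into individual statements.

    Used by other internal modules, so it cannot carry a leading
    underscore, yet it is not part of the public interface:

    :meta private:
    """
    return [s.strip() for s in sql.split(";") if s.strip()]
```

This keeps the backwards-compatibility surface small: only ``run`` is documented, while cross-module helpers stay usable internally without being advertised.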

[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058642484



Review Comment:
   It's also important to clarify that some methods are intentionally 
"private". E.g. in the operator, it might have helper methods.  But these 
methods are more implementation detail right?  If users subclass, though, they 
may use and depend on these.  So what do we do? Ideally, from maintainer 
perspective, we want 


[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058640908


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Provider)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way write custom 
hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom` which are used to access to 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend` which are used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin` which are used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which are used to implement 
custom Custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way 
write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not 
rely on a built-in set of executors.
+   You could implement Executors 
before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   And if it's both, then how do we distinguish? I think the current convention 
is underscores OR the meta private tag.
   
   But interested in your thoughts. We should include that in this proposal, I 
think.
   
   Personally I think it is an important goal to minimize backcompat surface 
area, so I would suggest that 

[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058633670


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > In the long term we also need to consider getting rid of most of the stuff 
in `tests.test_utils` and replacing it with equivalent fixtures/context 
managers/markers, but this is better done after migrating the remaining tests 
to `pytest` and documenting them.
   > 
   > Because most of the new contributors just copy-paste some stuff from one 
test to another without any idea how it actually works.
   
   Yep. Full agreement here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
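The per-module DB cleanup under review can be sketched as below. `FakeDB` is a stand-in for the real `tests.test_utils.db` helpers (an assumption, so the sketch runs without an Airflow install), and the fixture shape, autouse and module-scoped with no `yield`, mirrors the points raised in the review:

```python
import pytest


class FakeDB:
    """Stand-in for tests.test_utils.db (assumption); records cleanup calls."""

    def __init__(self):
        self.cleared = 0

    def clear_db_runs(self):
        self.cleared += 1


db = FakeDB()


def reset_db(target):
    # All cleanup happens up front; as noted in the review, no teardown
    # (and therefore no ``yield``) is needed in the fixture.
    target.clear_db_runs()


@pytest.fixture(autouse=True, scope="module")
def clear_db_between_provider_test_modules():
    # autouse + module scope: runs once before each provider test module,
    # so state left over by one module cannot leak into the next.
    reset_db(db)
```

The real conftest would call several `clear_db_*` helpers rather than one; the point here is only the fixture shape.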



[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r105866


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   In the long term we also need to consider getting rid of most of the stuff 
in `tests.test_utils` and replacing it with equivalent fixtures/context 
managers/markers, but this is better done after migrating the remaining tests 
to `pytest` and documenting them.
   
   Because most of the new contributors just copy-paste some stuff from one 
test to another without any idea how it actually works.






[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058631912


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write 
custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement 
custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way 
to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not 
rely on a built-in set of executors.
+   You could implement Executors 
before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   Sorry... so ... specifically in the provider package... is everything 
private, public, or some private some public?




[airflow] branch main updated: Variables set in variables.env are automatically exported (#28633)

2022-12-28 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 94b3b897e2 Variables set in variables.env are automatically exported 
(#28633)
94b3b897e2 is described below

commit 94b3b897e2f94902777c4b24fb10c915279d8967
Author: Jarek Potiuk 
AuthorDate: Wed Dec 28 23:28:50 2022 +0100

Variables set in variables.env are automatically exported (#28633)

The variables set in variables.env were not automatically exported.
---
 scripts/in_container/configure_environment.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/in_container/configure_environment.sh 
b/scripts/in_container/configure_environment.sh
index d9638b3219..6d0cbae41e 100644
--- a/scripts/in_container/configure_environment.sh
+++ b/scripts/in_container/configure_environment.sh
@@ -36,8 +36,10 @@ if [[ -d "${AIRFLOW_BREEZE_CONFIG_DIR}" && \
 echo
 echo "${COLOR_BLUE}Sourcing environment variables from 
${VARIABLES_ENV_FILE} in ${AIRFLOW_BREEZE_CONFIG_DIR}${COLOR_RESET}"
 echo
+set -o allexport
  # shellcheck disable=1090
 source "${VARIABLES_ENV_FILE}"
+set +o allexport
 popd >/dev/null 2>&1 || exit 1
 fi
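Why the fix works: a plain `source` only creates shell variables, which child processes never inherit, while `set -o allexport` marks every assignment made while it is active for export. A minimal, self-contained demonstration (the temp file is a throwaway stand-in for Breeze's `variables.env`):

```shell
#!/usr/bin/env bash
set -eu

# Throwaway stand-in for the variables.env file sourced by Breeze.
VARS_FILE="$(mktemp)"
echo 'MY_BREEZE_VAR=hello' > "${VARS_FILE}"

# Without allexport: sourcing sets a shell variable, but children don't see it.
source "${VARS_FILE}"
bash -c 'echo "child sees: ${MY_BREEZE_VAR:-<unset>}"'   # prints: child sees: <unset>

# With allexport: every assignment made while it is on gets exported.
set -o allexport
source "${VARS_FILE}"
set +o allexport
bash -c 'echo "child sees: ${MY_BREEZE_VAR:-<unset>}"'   # prints: child sees: hello

rm -f "${VARS_FILE}"
```

This is exactly the sandwich the patch adds around `source "${VARIABLES_ENV_FILE}"`: turn `allexport` on, source the file, turn it back off so later assignments in the script are not silently exported.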
 



[GitHub] [airflow] potiuk merged pull request #28633: Variables set in variables.env are automatically exported

2022-12-28 Thread GitBox


potiuk merged PR #28633:
URL: https://github.com/apache/airflow/pull/28633





[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058629909


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   I could also check the `request` fixture tomorrow; I thought it should 
contain session information, such as how many tests were collected, and if it 
equals (or is less than, though I don't know how that is possible) 1, then do 
not clean up anything.
   
   > I think this is not really needed now - the overhead of using the 
test_utils classes is really low indeed. I believe we should enable it for all 
tests and compare execution time on CI, and if it is < 5% overhead in total or 
so, we can leave it on for all modules.



##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > I could check also request fixture tomorrow, I thought it should contain 
session information, such as how many tests collected and if it equal (or less, 
I don't know how it possible) 1 then do not cleanup anything.
   
   I think this is not really needed now - the overhead with using the 
test_utils classes is really low indeed. I believe we should enable it for all 
tests and compare execution time on CI and if it is < 5% overhead in total or 
so, we can leave it on for all modules.






[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058628201


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   And also remove `yield` in this fixture; we do not need any teardown, at 
least not yet.






[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058628066


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write 
custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement 
custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way 
to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not 
rely on a built-in set of executors.
+   You could implement Executors 
before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   And yes, I want to eventually add tests to make sure that all our providers 
are only using what we consider "public" and nothing else. This will be 
partially done as part of AIP-44 (for DB methods), but it can be extended to 
all Python calls made by the providers. 




[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058627698


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write 
custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement 
custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way 
to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not 
rely on a built-in set of executors.
+   You could implement Executors 
before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   And the list does not **have** to be complete now. If we forget something 
before merging it, we can always add it later; there is no harm at all in 
that. It will be "eventually complete" if we follow it for quite some time and 
decide that some things should be "publicised".



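To make the first extension path discussed in this PR concrete, a custom operator normally just subclasses `BaseOperator` and overrides `execute`. The base class below is a tiny stand-in so the sketch runs without an Airflow installation; the real one is `airflow.models.baseoperator.BaseOperator`:

```python
class BaseOperator:
    """Tiny stand-in for airflow.models.baseoperator.BaseOperator,
    so this sketch runs without Airflow installed."""

    def __init__(self, *, task_id, **kwargs):
        self.task_id = task_id


class HelloOperator(BaseOperator):
    """A custom operator: configuration in __init__, work in execute()."""

    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # In Airflow, the value returned from execute() is pushed to XCom
        # so that downstream tasks can read it.
        return f"Hello {self.name}!"


# In a real DAG file this would be instantiated inside a ``with DAG(...)`` block:
#   hello = HelloOperator(task_id="say_hello", name="Airflow")
```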

[GitHub] [airflow] potiuk commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


potiuk commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058627308


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Providers)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write 
custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement 
custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way 
to write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The Executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow does not 
rely on a built-in set of executors.
+   You could implement Executors 
before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===========================================================
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   IMHO - it should also address providers (and especially them). 
   
   Rather than adding private tags, I think we should allowlist everything 
that should be public here. If we forgot something, we should add it. All the 
rest should be private by definition, IMHO. This is the whole purpose of this 
list. And we should 

[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058626222


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   I could also check the `request` fixture tomorrow; I thought it should 
contain session information, such as how many tests were collected, and if 
that equals `1` (or is less, though I don't know how that would be possible), 
then do not clean up anything.
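The idea could be sketched roughly like this (a hypothetical helper, not part of the Airflow code base; `request.session.items` is pytest's list of collected test items):

```python
# Hypothetical sketch: skip the DB cleanup overhead when pytest collected
# only a single test (e.g. a quick Ctrl+R run from the IDE).

def should_clean_db(collected_count: int) -> bool:
    """Clean the DB only when more than one test was collected."""
    return collected_count > 1

# Inside the autouse fixture this could be used roughly as:
#
#   @pytest.fixture(scope="module", autouse=True)
#   def _clear_db_between_modules(request):
#       if should_clean_db(len(request.session.items)):
#           ...  # run the tests.test_utils.db cleanup helpers
#       yield
```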
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (69df1c5d9e -> e8657ce559)

2022-12-28 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 69df1c5d9e Speed up provider validation pre-commit (#28541)
 add e8657ce559 Improve "other" test category selection (#28630)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/ci.yml   |  6 +++--
 Dockerfile.ci  |  4 +++-
 .../src/airflow_breeze/utils/selective_checks.py   | 10 ++--
 dev/breeze/tests/test_selective_checks.py  | 25 +++-
 scripts/docker/entrypoint_ci.sh|  4 +++-
 scripts/in_container/run_extract_tests.sh  | 25 
 ...est_collection.py => test_pytest_collection.py} | 27 --
 7 files changed, 57 insertions(+), 44 deletions(-)
 delete mode 100755 scripts/in_container/run_extract_tests.sh
 rename scripts/in_container/{test_arm_pytest_collection.py => 
test_pytest_collection.py} (73%)



[GitHub] [airflow] potiuk merged pull request #28630: Improve "other" test category selection

2022-12-28 Thread GitBox


potiuk merged PR #28630:
URL: https://github.com/apache/airflow/pull/28630





[GitHub] [airflow] dstandish commented on a diff in pull request #28300: Add Public Interface description to Airflow documentation

2022-12-28 Thread GitBox


dstandish commented on code in PR #28300:
URL: https://github.com/apache/airflow/pull/28300#discussion_r1058624408


##
docs/apache-airflow/administration-and-deployment/public-airflow-interface.rst:
##
@@ -0,0 +1,87 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Public Interface of Airflow
+===========================
+
+The Public Interface of Apache Airflow is a set of programmatic interfaces 
that allow developers to interact
+with and access certain features of the Apache Airflow system. This includes 
operations such as
+creating and managing DAGs (directed acyclic graphs), managing tasks and their 
dependencies,
+and extending Airflow capabilities by writing new executors, plugins, 
operators and providers. The
+Public Interface can be useful for building custom tools and integrations with 
other systems,
+and for automating certain aspects of the Airflow workflow.
+
+You can extend Airflow in three ways:
+
+* By writing new custom Python code (via Operators, Plugins, Provider)
+* By using the `Stable REST API `_ (based on the OpenAPI 
specification)
+* By using the `Airflow Command Line Interface (CLI) 
`_
+
+How can you extend Apache Airflow with custom Python Code?
+==========================================================
+
+The Public Interface of Airflow consists of a number of different classes and 
packages that provide access
+to the core features and functionality of the system.
+
+The classes and packages that may be considered as the Public Interface 
include:
+
+* The :class:`~airflow.DAG`, which provides a way to define and manage DAGs in 
Airflow.
+* The :class:`~airflow.models.baseoperator.BaseOperator`, which provides a way 
to write custom operators.
+* The :class:`~airflow.hooks.base.BaseHook`, which provides a way to write 
custom hooks.
+* The :class:`~airflow.models.connection.Connection`, which provides access to 
external service credentials and configuration.
+* The :class:`~airflow.models.variable.Variable`, which provides access to 
Airflow configuration variables.
+* The :class:`~airflow.models.xcom.XCom`, which is used to access 
inter-task communication data.
+* The :class:`~airflow.secrets.BaseSecretsBackend`, which is used to define 
custom secret managers.
+* The :class:`~airflow.plugins_manager.AirflowPlugin`, which is used to define 
custom plugins.
+* The :class:`~airflow.triggers.base.BaseTrigger`, which is used to implement 
custom Deferrable Operators (based on ``asyncio``).
+* The :class:`~airflow.decorators.base.TaskDecorator`, which provides a way to 
write custom decorators.
+* The :class:`~airflow.listeners.listener.ListenerManager` class, which 
provides hooks that can be implemented to respond to DAG/Task lifecycle events.
+
+.. versionadded:: 2.5
+
+   Listener public interface has been added in version 2.5.
+
+* The :class:`~airflow.executors.base_executor.BaseExecutor` - the Executors 
are the components of Airflow
+  that are responsible for executing tasks.
+
+.. versionadded:: 2.6
+
+   There are a number of different executor implementations built into 
Airflow, each with its own unique
+   characteristics and capabilities. The executor interface was available in 
earlier versions of Airflow, but
+   only as of version 2.6 are executors fully decoupled, so Airflow no longer 
relies on a built-in set of executors.
+   You could implement custom executors before Airflow 2.6, and a number
+   of people succeeded in doing so, but there were some hard-coded behaviours 
that favoured the built-in
+   executors, and custom executors could not provide the full functionality 
that built-in executors had.
+
+
+What is not part of the Public Interface of Apache Airflow?
+===========================================================
+
+Everything not mentioned in this document should be considered as non-Public 
Interface.

Review Comment:
   This doesn't address providers, or how we can include "private" e.g. helper 
code in otherwise "public" modules / classes.
   
   Is it sufficient to include `:meta private:`?  Or must we prefix with 
underscore?




[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058623426


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   I think it costs us almost nothing and is a **long term** solution if we do 
it this way.






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058622999


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   You convinced me.
   
   I just tested your solution (my videos were with `--with-db-init`) and yes, 
I agree it's not noticeable in PyCharm/IntelliJ, so you are likely right that 
the Flask import and others add a lot. And yeah, having a DB reset in this 
case for every test is quite OK. It is actually likely that those Flask 
imports cause the constraint problems anyway.
   
   Try to notice which one is which (it's visible, but barely).
   
   
https://user-images.githubusercontent.com/595491/209874550-732ee92c-43dc-4fc1-b3f7-f0b365b0d281.mov
   
   
https://user-images.githubusercontent.com/595491/209874540-7a7e5611-d888-45a8-b8fd-02f8df147385.mov
   
   And it has the nice benefit that it will help users in many situations 
where they run single tests as well. This overhead is much smaller than what I 
was used to from `--with-db-init`.
   
   However, I think a few changes are needed:
   
   1) Have a "clear_all()" in test_utils.db and use it instead of keeping the 
list of methods in conftest.py - this will help with future maintainability
   
   2) What's even more ... I **really** think we should apply it to all tests, 
by moving it to the main conftest.py and doing it for all modules - I think 
with such low overhead we can kill a few birds with the same stone and the 
overall overhead will be minimal. I just did this in the main "conftest.py". I 
don't think we need to detect any providers at all - we can run it for 
**everything**
   
   ```
   @pytest.fixture(scope="module", autouse=True)
   def _clear_db_between_modules(request):
   from tests.test_utils import db
   
   """Clear DB between each separate provider package test runs."""
   print("Clearing DB")
   db.clear_db_runs()
   db.clear_db_datasets()
   db.clear_db_dags()
   db.clear_db_serialized_dags()
   db.clear_db_sla_miss()
   db.clear_db_pools()
   db.clear_db_connections()
   db.clear_db_variables()
   db.clear_db_dag_code()
   db.clear_db_callbacks()
   db.clear_rendered_ti_fields()
   db.clear_db_import_errors()
   db.clear_db_dag_warnings()
   db.clear_db_xcom()
   db.clear_db_logs()
   db.clear_db_jobs()
   db.clear_db_task_fail()
   db.clear_db_task_reschedule()
   db.clear_dag_specific_permissions()
   db.create_default_connections()
   db.set_default_pool_slots(128)
   yield
   ```
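The hard-coded list of `clear_db_*` calls above is what a `clear_all()` helper in `test_utils.db` could replace. A minimal sketch of such a helper, demonstrated on a stand-in namespace since the real `tests.test_utils.db` module is only assumed here:

```python
# Hypothetical sketch of a clear_all() helper: discover every clear_* function
# on the db utility module and call each one, so conftest.py no longer needs
# to hard-code the list. A SimpleNamespace stands in for tests.test_utils.db.
from types import SimpleNamespace

calls = []
fake_db = SimpleNamespace(
    clear_db_runs=lambda: calls.append("clear_db_runs"),
    clear_db_pools=lambda: calls.append("clear_db_pools"),
    create_default_connections=lambda: calls.append("create_default_connections"),
)

def clear_all(db_module) -> None:
    """Call every clear_* helper, then restore the default connections."""
    for name in sorted(vars(db_module)):
        if name.startswith("clear_"):
            getattr(db_module, name)()
    db_module.create_default_connections()

clear_all(fake_db)
```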
   
   And run it with -s 
   
   ```
   pytest -s tests/cli/ tests/core/
   ```
   
   And it looks like it works as expected:
   
   ```
   
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_path
 Clearing DB
   PASSED
   
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres:airflow@postgres/airflow-postgresql+psycopg2://p...s:PASSWORD@postgres/airflow]
 PASSED
   
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres@postgres/airflow-postgresql+psycopg2://p...s@postgres/airflow]
 PASSED
   
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://:airflow@postgres/airflow-postgresql+psycopg2://:PASSWORD@postgres/airflow]
 PASSED
   
tests/cli/commands/test_info_command.py::TestPiiAnonymizer::test_should_remove_pii_from_url[postgresql+psycopg2://postgres/airflow-postgresql+psycopg2://postgres/airflow]
 PASSED
   tests/cli/commands/test_info_command.py::TestAirflowInfo::test_airflow_info 
PASSED
   tests/cli/commands/test_info_command.py::TestAirflowInfo::test_system_info 
PASSED
   tests/cli/commands/test_info_command.py::TestAirflowInfo::test_paths_info 
PASSED
   tests/cli/commands/test_info_command.py::TestAirflowInfo::test_tools_info 
PASSED
   

[GitHub] [airflow] dstandish commented on a diff in pull request #28336: Fixed hanged KubernetesPodOperator

2022-12-28 Thread GitBox


dstandish commented on code in PR #28336:
URL: https://github.com/apache/airflow/pull/28336#discussion_r1058615241


##
airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py:
##
@@ -168,6 +168,7 @@ class KubernetesPodOperator(BaseOperator):
 :param labels: labels to apply to the Pod. (templated)
 :param startup_timeout_seconds: timeout in seconds to startup the pod.
 :param get_logs: get the stdout of the container as logs of the tasks.
+:param logs_timeout: timeout in seconds to read logs after container 
termination.

Review Comment:
   I am not sure this needs to be configurable.
   
   It reads like a setting of "if no logs for `logs_timeout` seconds then 
quit", but really this timeout is applied only after the pod completes. And in 
this case 2 minutes seems fine; it's a potentially confusing param, and I 
don't think we need to make it configurable.



##
airflow/providers/cncf/kubernetes/utils/pod_manager.py:
##
@@ -91,6 +99,61 @@ def get_container_termination_message(pod: V1Pod, 
container_name: str):
 return container_status.state.terminated.message if container_status 
else None
 
 
+class PodLogsConsumer:
+"""
+PodLogsConsumer is responsible for pulling pod logs from a stream with 
checking a container status before
+reading data.
+This class is a workaround for the issue 
https://github.com/apache/airflow/issues/23497
+"""
+
+def __init__(
+self,
+response: HTTPResponse,
+pod: V1Pod,
+pod_manager: PodManager,
+container_name: str,
+timeout: int = 120,
+):
+self.response = response
+self.pod = pod
+self.pod_manager = pod_manager
+self.container_name = container_name
+self.timeout = timeout
+
+def __iter__(self) -> Generator[bytes, None, None]:
+messages: list[bytes] = []
+if self.logs_available():
+for chunk in self.response.stream(amt=None, decode_content=True):

Review Comment:
   does this work when follow=False?
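For reference, the chunk-to-line buffering that `__iter__` performs over the response stream can be sketched in isolation like this (an illustrative stand-in, not the PR's exact code):

```python
# Illustrative sketch of the newline buffering done over the response stream:
# partial chunks are accumulated until a newline arrives, then whole lines
# are yielded. Not the PR's exact implementation.
def iter_lines(chunks):
    buffered = b""
    for chunk in chunks:
        buffered += chunk
        while b"\n" in buffered:
            line, _, buffered = buffered.partition(b"\n")
            yield line + b"\n"
    if buffered:
        yield buffered  # trailing data without a final newline
```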



##
airflow/providers/cncf/kubernetes/utils/pod_manager.py:
##
@@ -91,6 +99,61 @@ def get_container_termination_message(pod: V1Pod, 
container_name: str):
 return container_status.state.terminated.message if container_status 
else None
 
 
+class PodLogsConsumer:
+"""
+PodLogsConsumer is responsible for pulling pod logs from a stream with 
checking a container status before
+reading data.
+This class is a workaround for the issue 
https://github.com/apache/airflow/issues/23497
+"""
+
+def __init__(
+self,
+response: HTTPResponse,
+pod: V1Pod,
+pod_manager: PodManager,
+container_name: str,
+timeout: int = 120,
+):
+self.response = response
+self.pod = pod
+self.pod_manager = pod_manager
+self.container_name = container_name
+self.timeout = timeout
+
+def __iter__(self) -> Generator[bytes, None, None]:
+messages: list[bytes] = []
+if self.logs_available():
+for chunk in self.response.stream(amt=None, decode_content=True):
+if b"\n" in chunk:
+chunks = chunk.split(b"\n")
+yield b"".join(messages) + chunks[0] + b"\n"
+for x in chunks[1:-1]:
+yield x + b"\n"
+if chunks[-1]:
+messages = [chunks[-1]]
+else:
+messages = []
+else:
+messages.append(chunk)
+if not self.logs_available():

Review Comment:
   This would seem to result in a _lot_ of calls to the kube API. It may be 
wise to limit such checking to every couple of minutes? wdyt? @jedcunningham ?
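The throttling idea could look roughly like this (a hedged sketch with an injectable clock; `ThrottledCheck` is an illustrative name, not anything from the PR):

```python
# Illustrative throttle: re-run an expensive check (e.g. a Kubernetes API
# call for container status) at most once per `interval` seconds, returning
# the cached result in between.
import time

class ThrottledCheck:
    def __init__(self, check, interval=120.0, clock=time.monotonic):
        self._check = check
        self._interval = interval
        self._clock = clock
        self._last_time = None
        self._cached = None

    def __call__(self):
        now = self._clock()
        if self._last_time is None or now - self._last_time >= self._interval:
            self._cached = self._check()
            self._last_time = now
        return self._cached
```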






[GitHub] [airflow] maggesssss commented on issue #23666: Add transfers operator S3 to SQL / SQL to SQL

2022-12-28 Thread GitBox


magges commented on issue #23666:
URL: https://github.com/apache/airflow/issues/23666#issuecomment-1366930217

   @thinhnd2104 are you working on this topic?





[GitHub] [airflow] pgagnon commented on pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator

2022-12-28 Thread GitBox


pgagnon commented on PR #28635:
URL: https://github.com/apache/airflow/pull/28635#issuecomment-1366926674

   > I don't think this breaks any backcompat because for most providers this 
is no change and, for Snowflake, if you try to submit a multi-statement query 
right now with defaults it will just fail
   
   We should be very explicit about this in the release notes though if we 
merge this change (which I am in favor of), because it potentially alters DAG 
behavior.





[GitHub] [airflow] dstandish commented on a diff in pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator

2022-12-28 Thread GitBox


dstandish commented on code in PR #28635:
URL: https://github.com/apache/airflow/pull/28635#discussion_r1058602568


##
airflow/providers/common/sql/operators/sql.py:
##
@@ -26,6 +26,7 @@
 from airflow.hooks.base import BaseHook
 from airflow.models import BaseOperator, SkipMixin
 from airflow.providers.common.sql.hooks.sql import DbApiHook, 
fetch_all_handler, return_single_query_results
+from airflow.utils.types import NOTSET

Review Comment:
   ```suggestion
   ```






[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058602023


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   We could do it any time. This PR is just a temporary solution, but 
unfortunately some temporary solutions work for ages and become permanent





[GitHub] [airflow] Adityamalik123 opened a new pull request, #28636: Adding base changes for adding updated_at filter for task_instance_en…

2022-12-28 Thread GitBox


Adityamalik123 opened a new pull request, #28636:
URL: https://github.com/apache/airflow/pull/28636

   …dpoint
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058601195


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > I hate having to wait even 0.5 s to run a single test with Ctrl+R. Plus 
some people have auto-rerunnable tests (there are plugins that run tests 
automatically after saving). We cannot make that experience so much worse.
   
   This could be a nice debate, for many reasons.
   
   First of all, `pytest` itself takes about 1 second to start, loading all 
its plugins, and producing the output might take a lot of resources as well.
   
   ```console
   ❯ time pytest --help 
   ...
   pytest --help  1.21s user 0.21s system 76% cpu 1.851 total
   
   ❯ time pytest --help
   ...
   pytest --help  0.58s user 0.11s system 98% cpu 0.697 total
   
   ❯ time pytest --help
   ...
   pytest --help  0.59s user 0.12s system 87% cpu 0.816 total
   ```
   
   PyCharm also could add latency. Hopefully we do not enable any coverage 
there, because PyCharm/IDEA could add an additional 5-10 seconds per run in 
that case.
   
   Personally, I spend most of the time in the debugger, trying to check 
whether something works as expected or to figure out why it is not working. 
And I usually rerun all of the module's tests just to be sure the last change 
did not break anything new.
   
   Also, if we consider **X** to be the time to create something new, then the 
time to cover it with tests plus fixing becomes **X * 5**, with most of the 
time spent sitting with a hand on the face saying "Why on Earth don't you work 
as expected" or "This concept is garbage, I need to rewrite my code because 
the tests show me that not everything works well".
   
   But this is personal, and all these timings relate only to my setup; 
someone with an old laptop could wait considerably more than 0.2 seconds.






[GitHub] [airflow] dstandish commented on pull request #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator

2022-12-28 Thread GitBox


dstandish commented on PR #28635:
URL: https://github.com/apache/airflow/pull/28635#issuecomment-1366922185

   I don't think this breaks any backcompat because for most providers this is 
no change and, for Snowflake, if you try to submit a multi-statement query 
right now with defaults it will just fail





[GitHub] [airflow] dstandish opened a new pull request, #28635: Defer to hook setting for split_statements in SQLExecuteQueryOperator

2022-12-28 Thread GitBox


dstandish opened a new pull request, #28635:
URL: https://github.com/apache/airflow/pull/28635

   Some databases, such as Snowflake, require you to split statements in order 
to submit multi-statement SQL. For such databases, splitting is the natural 
default, and we should defer to the hook to control that.
   
   cc @potiuk @eladkal @kazanzhy @pgagnon
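The "defer to the hook" idea can be sketched with a sentinel default (a hedged illustration with stand-in classes, not the actual operator or hook code):

```python
# Illustrative sketch: the operator uses a sentinel default for
# split_statements and only forwards the argument when the user set it,
# letting each hook keep its own natural default. The classes below are
# stand-ins, not the real Airflow/Snowflake implementations; ArgNotSet
# mirrors airflow.utils.types but is redefined to stay self-contained.
class ArgNotSet:
    """Sentinel type marking 'argument was not provided'."""

NOTSET = ArgNotSet()

class FakeSnowflakeHook:
    def run(self, sql, split_statements=True):  # hook's natural default
        return ("ran", split_statements)

class SQLExecuteQueryOperatorSketch:
    def __init__(self, sql, split_statements=NOTSET):
        self.sql = sql
        self.split_statements = split_statements

    def execute(self, hook):
        kwargs = {}
        # Only override the hook's default when the user passed a value.
        if not isinstance(self.split_statements, ArgNotSet):
            kwargs["split_statements"] = self.split_statements
        return hook.run(self.sql, **kwargs)
```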





[GitHub] [airflow] stamixthereal opened a new pull request, #28634: Add doc-strings and small improvement to email util

2022-12-28 Thread GitBox


stamixthereal opened a new pull request, #28634:
URL: https://github.com/apache/airflow/pull/28634

   Docstrings were added to the email sending util, some type hints were 
changed, and a regular expression was introduced in the 
`_get_email_list_from_str` function





[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058578673


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > See the videos above (that's the native env).. It is noticeable, a lot.
   
   I think this happens because init DB requires loading Flask and its 
dependencies.
   All other tests right now do not use reset db / init db to clean up their 
state. Instead they just run direct queries via this module: 
https://github.com/apache/airflow/blob/main/tests/test_utils/db.py which runs 
pretty fast.
   
   I agree that the current solution is a bit dumb, but the main target right now 
is to reduce the chance of side effects between different providers tests, and 
the cost is about 0.2 sec on a single test or +0.2 sec (plus a little extra for 
the inexpensive logic) per provider package.
   
   There is a good chance this solution will not be required once we change the 
structure of the providers tests.
   
   For all tests we should consider using a different approach.
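
   For reference, the per-provider detection that gates the cleanup in the diff 
above can be exercised standalone; the `types.SimpleNamespace` stub below merely 
mimics pytest's `request.module`, everything else is copied from the diff:

```python
import types

# Providers with subpackages (copied from the diff under review)
INNER_PROVIDERS = {
    "alibaba", "amazon", "apache", "atlassian",
    "common", "dbt", "facebook", "google", "microsoft",
}


def get_test_provider_name(m):
    """Extract provider name from module full qualname."""
    _, _, name = m.__name__.partition("providers.")
    for inner_provider in INNER_PROVIDERS:
        if name.startswith(inner_provider):
            return ".".join(name.split(".", 2)[:2])
    return name.split(".", 1)[0]


# Stub standing in for pytest's request.module
mod = types.SimpleNamespace(__name__="tests.providers.apache.hive.test_x")
print(get_test_provider_name(mod))  # → apache.hive

mod = types.SimpleNamespace(__name__="tests.providers.sqlite.test_y")
print(get_test_provider_name(mod))  # → sqlite
```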






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576658


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   If you want I can take over (but not just yet - I need catch-up with other 
stuff :D )






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576453


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   (BTW, now you know why I have not tried it **just yet** - I kinda 
subconsciously expected it wouldn't be super easy.)






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058576111


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   > Also tried.
   
   For some reason it works when we run `--with-db-init`. I think this is a 
**real** bug that we need to trace if it does not work here. I recall similar 
errors raised by our users.






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058575510


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   > We also have a `reset_db` fixture which is not used anywhere; at least I 
can't find any usage of it
   
   Yep, noticed it - I think we can remove it while we're at it.






[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574945


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   We also have a `reset_db` fixture which is not used anywhere; at least I 
can't find any usage of it:
   
   
https://github.com/apache/airflow/blob/69df1c5d9e27764daa3584d3da6b0647cda9093d/tests/conftest.py#L91-L100






[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574499


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   Also tried.






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574439


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   I think some of those imports are needed to make it work (and no, it was 
not me who implemented it, and yes, I also hate it).






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058574076


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   This one works when we run `--with-db-init`, so maybe you should simply use 
this command @Taragolis (from our `conftest.py`):
   
   ```python
   def initial_db_init():
       from flask import Flask

       from airflow.configuration import conf
       from airflow.utils import db
       from airflow.www.app import sync_appbuilder_roles
       from airflow.www.extensions.init_appbuilder import init_appbuilder

       db.resetdb()
       db.bootstrap_dagbag()
       # minimal app to add roles
       flask_app = Flask(__name__)
       flask_app.config["SQLALCHEMY_DATABASE_URI"] = conf.get("database", "SQL_ALCHEMY_CONN")
       init_appbuilder(flask_app)
       sync_appbuilder_roles(flask_app)
   ```
   
   






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058569528


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   I think this is because of some import sequence @ephraimbuddy - does it ring 
a bell? It works for sure in the `airflow db init` command, so we should be able 
to implement it here as well. I think we **must** sort it out, because otherwise 
it's not future-proof in case we add new stuff.
   
   I can help sort it out once we solve the other comments.






[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058572636


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > But the same kind of problem happened on multiple occasions for all kinds 
of tests, not only between providers. The mechanism is the same - data created 
in one test (in unrelated package) might impact tests from another package. So 
I think if we attempt to do such isolation, it should be done across the board.
   
   Then I suggest doing it in a different way, because such a side effect 
outside of providers could happen only when two PRs are merged and together 
create this situation:
   
   1. Migrate all tests to pytest (there are fewer than 100 modules remaining)
   2. Create a separate pytest plugin with fixtures/context managers
   3. Migrate all current tests to use these fixtures; most of them already 
exist in 
[conftest.py](https://github.com/apache/airflow/blob/main/tests/conftest.py)
   4. Restrict manually creating DAGs, TIs, etc. and allow creation only via 
fixtures/context managers (the trickier part)
   5. All core functionality should have setup and teardown that clean up (most 
already do)
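
   A minimal sketch of the fixture/context-manager idea from steps 2-4, with a 
plain `FakeDB` standing in for the Airflow metadata database (the names and 
shapes here are illustrative, not an Airflow API):

```python
import contextlib


class FakeDB:
    """Stand-in for the metadata DB - illustration only."""

    def __init__(self):
        self.dag_runs = []


@contextlib.contextmanager
def dag_run(db, run_id):
    # Creation and cleanup live in one place, so a test cannot
    # leave a run behind even if it fails halfway through.
    db.dag_runs.append(run_id)
    try:
        yield run_id
    finally:
        db.dag_runs.remove(run_id)


db = FakeDB()
with dag_run(db, "manual__2022-12-28") as rid:
    assert rid in db.dag_runs
assert db.dag_runs == []  # teardown ran automatically
```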






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058571297


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   See the videos above (that's the native env) - it is noticeable, a lot. I 
hate having to wait even 0.5 s to run a single test with Ctrl+R. Plus some 
people have auto-rerunnable tests (there are plugins that run tests 
automatically after saving). We cannot make that experience so much worse.






[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058571297


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   See the videos above (that's the native env). It is noticeable, a lot. I 
hate having to wait even 0.5 s to run a single test with Ctrl+R. Plus some 
people have auto-rerunnable tests (there are plugins that run tests 
automatically after saving). We cannot make that experience so much worse.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058569528


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   I think this is because of some import sequence @ephraimbuddy - does it ring 
a bell? It works for now, but I think we **must** sort it out, because 
otherwise it's not future-proof in case we add new stuff. 
   
   I can help sort it out when we solve the other comments.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
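The `get_test_provider_name` helper quoted above is small enough to exercise on its own. Below is a stand-alone re-implementation for illustration: it takes the module's qualified name as a plain string, whereas the PR's version takes the module object and reads `__name__`.

```python
# Re-implementation of the helper quoted above, operating on a plain
# string instead of a module object (adapted for illustration).
INNER_PROVIDERS = {
    "alibaba", "amazon", "apache", "atlassian",
    "common", "dbt", "facebook", "google", "microsoft",
}


def get_test_provider_name(qualname: str) -> str:
    """Extract the provider name from a test module's qualified name."""
    _, _, name = qualname.partition("providers.")
    for inner_provider in INNER_PROVIDERS:
        if name.startswith(inner_provider):
            # Providers with subpackages keep two components, e.g. "apache.hive".
            return ".".join(name.split(".", 2)[:2])
    return name.split(".", 1)[0]


print(get_test_provider_name("tests.providers.apache.hive.hooks.test_hive"))   # apache.hive
print(get_test_provider_name("tests.providers.docker.operators.test_docker"))  # docker
```

One quirk worth noting: the `startswith` check matches prefixes, so a hypothetical provider named, say, `commonx` would also match `common`; the real provider module names avoid this collision.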



[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058568341


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   The difference with this approach is only about 1 second, and only in Breeze 
on macOS; in a native environment the difference is even smaller.
   
   ```console
   
   1 passed in 0.21s
   1 passed in 0.23s
   1 passed in 0.24s
   ```
   
   ```console
   
   1 passed in 0.03s
   1 passed in 0.03s
   1 passed in 0.03s
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] dstandish commented on a diff in pull request #28439: CustomTTYColoredFormatter should inherit TimezoneAware formatter

2022-12-28 Thread GitBox


dstandish commented on code in PR #28439:
URL: https://github.com/apache/airflow/pull/28439#discussion_r1058564087


##
airflow/utils/log/colored_log.py:
##
@@ -38,7 +40,7 @@
 BOLD_OFF = esc("22")
 
 
-class CustomTTYColoredFormatter(TTYColoredFormatter):
+class CustomTTYColoredFormatter(TTYColoredFormatter, TimezoneAware):

Review Comment:
   oh forgot to add... sec



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] dstandish commented on a diff in pull request #28439: CustomTTYColoredFormatter should inherit TimezoneAware formatter

2022-12-28 Thread GitBox


dstandish commented on code in PR #28439:
URL: https://github.com/apache/airflow/pull/28439#discussion_r1058563809


##
airflow/utils/log/colored_log.py:
##
@@ -38,7 +40,7 @@
 BOLD_OFF = esc("22")
 
 
-class CustomTTYColoredFormatter(TTYColoredFormatter):
+class CustomTTYColoredFormatter(TTYColoredFormatter, TimezoneAware):

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
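For readers following along: the point of adding `TimezoneAware` as a second base is that Python's MRO lets the colored formatter keep its own `format` override while picking up the timezone-aware `formatTime`. A minimal stdlib-only sketch of that mechanism - these are simplified stand-ins, not Airflow's or colorlog's actual classes:

```python
import logging
from datetime import datetime, timezone


class TimezoneAwareFormatter(logging.Formatter):
    """Simplified stand-in for Airflow's TimezoneAware formatter:
    renders timestamps in UTC with an explicit offset."""

    def formatTime(self, record, datefmt=None):
        return datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat()


class ColoredFormatter(logging.Formatter):
    """Simplified stand-in for colorlog's TTYColoredFormatter: wraps the
    formatted line in ANSI color codes, but does not touch formatTime."""

    def format(self, record):
        return f"\x1b[32m{super().format(record)}\x1b[0m"


# With both bases, format() comes from ColoredFormatter, and formatTime()
# falls through the MRO to TimezoneAwareFormatter - the same effect the
# PR is after for CustomTTYColoredFormatter.
class CustomColoredFormatter(ColoredFormatter, TimezoneAwareFormatter):
    pass


record = logging.LogRecord("demo", logging.INFO, __file__, 1, "hello", None, None)
out = CustomColoredFormatter("%(asctime)s %(message)s").format(record)
print(out)  # green line whose timestamp carries a "+00:00" offset
```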



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058561504


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > So one second per run it not a big deal and if developer run all tests in 
single module or just one test class, it would be even unnoticeable.
   
   One second is a lot for tests that do not require the DB at all - especially 
if you run them from the IDE, re-running single tests after the whole suite is 
run. The difference is immediate green/red vs. waiting a second. I often forget 
to remove `--with-db-init` when I add it (it is needed only for one run) - the 
difference is really noticeable, so I remove it as soon as I realise I left it.
   
   Compare those two: 
   
   With db reset:
   
   
https://user-images.githubusercontent.com/595491/209865511-d2815c9f-0747-4445-adbe-0318705fbc1d.mov
   
   Without:
   
   
https://user-images.githubusercontent.com/595491/209865517-0931689a-3491-43ec-824d-8db14100a775.mov
   
   > All others tests run all together, so if one tests from Core affect 
another then most probably we notice this in Core section
   
   Correct. But the same kind of problem happened on multiple occasions for all 
kinds of tests, not only between providers. The mechanism is the same - data 
created in one test (in an unrelated package) might impact tests from another 
package. So I think if we attempt to do such isolation, it should be done 
across the board.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
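The pattern under discussion - a module-scoped autouse fixture that pays the reset cost only once per provider package - can be sketched independently of Airflow. In the sketch below the fixture body delegates to a plain function so the deduplication logic is visible on its own; `fake_clear_db`, `SEEN`, and `CLEARED` are illustrative names, not the PR's, and `pytest` must be installed:

```python
import pytest

SEEN: set[str] = set()      # packages already reset (cf. PROVIDERS_PACKAGES)
CLEARED: list[str] = []     # records each reset, for demonstration


def fake_clear_db() -> None:
    # Stand-in for the PR's db.clear_db_runs() and related helpers.
    CLEARED.append("cleared")


def clear_once(package: str, seen: set[str] = SEEN) -> None:
    """Run the expensive reset only the first time a package is seen."""
    if package and package not in seen:
        seen.add(package)
        fake_clear_db()


@pytest.fixture(scope="module", autouse=True)
def _clear_db_between_packages(request):
    # Runs once per test module; clear_once() makes it once per package.
    clear_once(request.module.__name__.rsplit(".", 1)[0])


clear_once("providers.docker")
clear_once("providers.docker")  # same package again: no second reset
clear_once("providers.slack")
print(CLEARED)                  # two resets in total
```

This keeps the per-test overhead at a dictionary lookup once a package has been seen, which is the trade-off the thread is weighing against a full DB reset per run.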



[airflow] branch constraints-main updated: Updating constraints. Build id:

2022-12-28 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new 6d85c982c7 Updating constraints. Build id:
6d85c982c7 is described below

commit 6d85c982c7cd3a1ef4f9f463e3d48023d8c50089
Author: Automated GitHub Actions commit 
AuthorDate: Wed Dec 28 20:01:34 2022 +

Updating constraints. Build id:

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of '' in ''
with commit sha .

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for 
details.
---
 constraints-3.10.txt  | 4 ++--
 constraints-3.7.txt   | 4 ++--
 constraints-3.8.txt   | 4 ++--
 constraints-3.9.txt   | 4 ++--
 constraints-no-providers-3.10.txt | 4 ++--
 constraints-no-providers-3.7.txt  | 4 ++--
 constraints-no-providers-3.8.txt  | 4 ++--
 constraints-no-providers-3.9.txt  | 4 ++--
 constraints-source-providers-3.10.txt | 6 +++---
 constraints-source-providers-3.7.txt  | 6 +++---
 constraints-source-providers-3.8.txt  | 6 +++---
 constraints-source-providers-3.9.txt  | 6 +++---
 12 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index dfe02ecc3f..c091006864 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T17:49:32Z
+# This constraints file was automatically generated on 2022-12-28T20:00:53Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -423,7 +423,7 @@ pinotdb==0.4.12
 pipdeptree==2.3.3
 pipx==1.1.0
 pkginfo==1.9.2
-platformdirs==2.6.1
+platformdirs==2.6.2
 pluggy==1.0.0
 ply==3.11
 plyvel==1.5.0
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index a0d3f9db8f..98b9ea54da 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T17:50:04Z
+# This constraints file was automatically generated on 2022-12-28T20:01:31Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -425,7 +425,7 @@ pipdeptree==2.3.3
 pipx==1.1.0
 pkginfo==1.9.2
 pkgutil_resolve_name==1.3.10
-platformdirs==2.6.1
+platformdirs==2.6.2
 pluggy==1.0.0
 ply==3.11
 plyvel==1.5.0
diff --git a/constraints-3.8.txt b/constraints-3.8.txt
index 30688906b2..717b16d20a 100644
--- a/constraints-3.8.txt
+++ b/constraints-3.8.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T17:49:55Z
+# This constraints file was automatically generated on 2022-12-28T20:01:21Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -426,7 +426,7 @@ pipdeptree==2.3.3
 pipx==1.1.0
 pkginfo==1.9.2
 pkgutil_resolve_name==1.3.10
-platformdirs==2.6.1
+platformdirs==2.6.2
 pluggy==1.0.0
 ply==3.11
 plyvel==1.5.0
diff --git a/constraints-3.9.txt b/constraints-3.9.txt
index bb81259e17..4ac0d98675 100644
--- a/constraints-3.9.txt
+++ b/constraints-3.9.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T17:49:54Z
+# This constraints file was automatically generated on 2022-12-28T20:01:21Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -423,7 +423,7 @@ pinotdb==0.4.12
 pipdeptree==2.3.3
 pipx==1.1.0
 pkginfo==1.9.2
-platformdirs==2.6.1
+platformdirs==2.6.2
 pluggy==1.0.0
 ply==3.11
 plyvel==1.5.0
diff --git a/constraints-no-providers-3.10.txt 
b/constraints-no-providers-3.10.txt
index 1ece205362..a0e7244016 100644
--- a/constraints-no-providers-3.10.txt
+++ b/constraints-no-providers-3.10.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-12-28T17:46:31Z
+# This constraints file was automatically generated on 2022-12-28T19:57:59Z
 # via "eager-upgrade" mechanism of PIP. For the "main" 

[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058557413


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   That is what I tried initially:
   
   > Use separate clear tests helpers instead of airflow.utils.db.resetdb(), 
which locally causes some errors with duplicate constraints.
   
   
   Long error traceback under the spoiler:
   
   ``` console
   ❯ breeze --python 3.7 --backend postgres shell --db-reset 
   
   root@ba342530c0bc:/opt/airflow# pytest 
tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute
 
tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated
 
   
==
 test session starts 
==
   platform linux -- Python 3.7.16, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- 
/usr/local/bin/python
   cachedir: .pytest_cache
   rootdir: /opt/airflow, configfile: pytest.ini
   plugins: cov-4.0.0, asyncio-0.20.3, rerunfailures-9.1.1, instafail-0.4.2, 
anyio-3.6.2, timeouts-1.2.1, xdist-3.1.0, requests-mock-1.10.0, 
capture-warnings-0.0.4, httpx-0.21.2, time-machine-2.8.2
   asyncio: mode=strict
   setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
   collected 2 items

   
   
   
tests/providers/docker/operators/test_docker.py::TestDockerOperator::test_execute
 ERROR  
 [ 50%]
   
tests/providers/slack/hooks/test_slack.py::TestSlackHook::test_token_property_deprecated
 ERROR  
  [100%]
   
   

 ERRORS 
=
   ___ ERROR at 
setup of TestDockerOperator.test_execute 
___
   
   self = , dialect 
= 
   constructor = >
   statement = 'CREATE UNIQUE INDEX idx_ab_user_username ON ab_user 
(lower(username))', parameters = {}, execution_options = 
immutabledict({'autocommit': True})
   args = (,), kw = {}, branched = , yp = None
   conn = , 
context = 
   cursor = , evt_handled = False
   
   def _execute_context(
   self,
   dialect,
   constructor,
   statement,
   parameters,
   execution_options,
   *args,
   **kw
   ):
   """Create an :class:`.ExecutionContext` and execute, returning
   a :class:`_engine.CursorResult`."""
   
   branched = self
   if self.__branch_from:
   # if this is a "branched" connection, do everything in terms
   # of the "root" 

[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058554942


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   > but the way it is implemented now, it will also run the auto-fixture even 
if you run a single unit test. This will dramatically slow-down any iterations 
you do I am afraid (and it does not really matter if we do it only for 
Providers or for whole airflow).
   
   A lot of tests right now clear some part of the DB between runs, so I can't say it dramatically affects a single test.
   
   
   Single test with fixture (3 runs) on Postgres in breeze
   ```console
   
   1 passed in 1.95s
   1 passed in 1.96s
   1 passed in 2.00s
   ```
   
   Single test without fixture (3 runs) on Postgres in breeze
   ```console
   
   1 passed in 1.02s
   1 passed in 0.98s
   1 passed in 0.99s
   ```
   
   So one second per run is not a big deal, and if a developer runs all the tests in a single module or just one test class, it would be barely noticeable.
   
   > Don't we want to do it for ALL packages that are "leaves" - not only for 
providers (including all airflow packages)?
   
   The initial problem happened because different provider packages can run separately from each other (e.g. SSH and Oracle), so if a developer does something wrong we might only find out in main.
   
   All other tests run together, so if one test from Core affects another, we would most probably notice it in the Core section.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk opened a new pull request, #28633: Variables set in variables.env are automatically exported

2022-12-28 Thread GitBox


potiuk opened a new pull request, #28633:
URL: https://github.com/apache/airflow/pull/28633

   The variables set in variables.env were not automatically exported.
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058533615


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   Hm. I think we can do it simply with `db.resetdb()` (from airflow.utils)
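
   For illustration, the name-parsing helper from the diff above can be exercised standalone like this (a sketch: `SimpleNamespace` stands in for a real test module object and is my substitution, not part of the PR):

   ```python
   from types import SimpleNamespace

   # Providers with subpackages, as listed in the proposed conftest.py
   INNER_PROVIDERS = {"alibaba", "amazon", "apache", "atlassian", "common",
                      "dbt", "facebook", "google", "microsoft"}

   def get_test_provider_name(m):
       """Extract the provider name from a test module's qualified name."""
       _, _, name = m.__name__.partition("providers.")
       for inner_provider in INNER_PROVIDERS:
           if name.startswith(inner_provider):
               # Providers with subpackages keep two components, e.g. apache.hive
               return ".".join(name.split(".", 2)[:2])
       return name.split(".", 1)[0]

   # A provider with subpackages keeps two name components...
   mod = SimpleNamespace(__name__="tests.providers.apache.hive.hooks.test_hive")
   print(get_test_provider_name(mod))  # apache.hive

   # ...while a flat provider keeps one.
   mod = SimpleNamespace(__name__="tests.providers.docker.operators.test_docker")
   print(get_test_provider_name(mod))  # docker
   ```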



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058533615


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {
+"alibaba",
+"amazon",
+"apache",
+"atlassian",
+"common",
+"dbt",
+"facebook",
+"google",
+"microsoft",
+}
+PROVIDERS_PACKAGES = set()
+
+
+def get_test_provider_name(m):
+"""Extract provider name from module full qualname."""
+_, _, name = m.__name__.partition("providers.")
+for inner_provider in INNER_PROVIDERS:
+if name.startswith(inner_provider):
+return ".".join(name.split(".", 2)[:2])
+return name.split(".", 1)[0]
+
+
+@pytest.fixture(scope="module", autouse=True)
+def _clear_db_between_providers_tests(request):
+"""Clear DB between each separate provider package test runs."""
+provider_name = get_test_provider_name(request.module)
+if provider_name and provider_name not in PROVIDERS_PACKAGES:
+PROVIDERS_PACKAGES.add(provider_name)
+db.clear_db_runs()

Review Comment:
   Hm. I think we can do it simply with `db.resetdb()`





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   There is also one side-effect that we should take into account here.  This 
is fine when we run the tests in CI, but the way it is implemented now, it will 
also run the auto-fixture even if you run a single unit test. This will 
dramatically slow-down any iterations you do I am afraid (and it does not 
really matter if we do it only for Providers or for whole airflow).
   
   I think this can be solved by only doing this when we run:
   
   1) for the first time ever  (so some kind of check if db is initialized) 
   2) or always when we run in CI (`CI` env variable set to "true").
   
   
   The 1) would be nice for first-time users - we badly miss it now, the user 
needs to know that they have to run the tests with `--with-db-init` flag. 
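   
   The gating described above could be sketched as a small predicate (all names here are hypothetical — the real check would ask Airflow whether the metadata DB is initialized rather than consult a sentinel file):
   
   ```python
   import os
   from pathlib import Path

   def should_reset_db(sentinel: Path, env=None) -> bool:
       """Decide whether to clear the test DB before a provider package's tests:
       always in CI, and once locally when the sentinel file (a stand-in for a
       real 'is the DB initialized?' check) does not exist yet."""
       env = os.environ if env is None else env
       if env.get("CI") == "true":
           return True  # always isolate packages in CI runs
       if not sentinel.exists():
           sentinel.touch()  # first local run: initialize once
           return True
       return False  # later local iterations: skip the ~1 s reset
   ```
   
   An autouse module-scoped fixture could then clear the DB only when this returns `True`, keeping local single-test iterations fast.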
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   There is also one side-effect that we should take into account here.  This 
is fine when we run the tests in CI, but the way it is implemented now, it will 
also run the auto-fixture even if you run a single unit test. This will 
dramatically slow-down any iterations you do I am afraid (and it does not 
really matter if we do it only for Providers or for whole airflow).
   
   I think this can be solved by only doing this when we run:
   
   1) for the first time ever  (so some kind of check if db is initialized) 
   2) or when we run in CI (`CI` env variable set to "true").
   
   
   The 1) would be nice for first-time users - we badly miss it now, the user 
needs to know that they have to run the tests with `--with-db-init` flag. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   There is also one side-effect that we should take into account here.  This 
is fine when we run the tests in CI, but the way it is implemented now, it will 
also run the auto-fixture even if you run a single unit test. This will 
dramatically slow-down any iterations you do I am afraid (and it does not 
really matter if we do it only for Providers or for whole airflow).
   
   I think this can be solved by only doing this when we run for:
   
   1) the first time at all (so some kind of check if db is initialized) 
   2) or when we run in CI (`CI` env variable set to "true").
   
   
   The 1) would be nice for first-time users - we badly miss it now, the user 
needs to know that they have to run the tests with `--with-db-init` flag. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058532316


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   I think we could. We just need to scan the directory until we find one of these folders: `hooks`, `operators`, `sensors`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   There is also one side-effect that we should take into account here.  This 
is fine when we run the tests in CI, but the way it is implemented now, it will 
also run the auto-fixture even if you run a single unit test. This will 
dramatically slow-down any iterations you do I am afraid (and it does not 
really matter if we do it only for Providers or for whole airflow).
   
   I think this can be solved by only doing this when we run for:
   
   1) the first time at all (so some kind of check if db is initialized) 
   2) or when we run in CI (`CI` env variable set to "true").
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] Taragolis commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


Taragolis commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058532316


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   I think we could; we just need to scan the directory until we find one of these folders: `hooks`, `operators`, `sensors`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058531880


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   There is also one side-effect that we should take into account here.  This 
is fine when we run the tests in CI, but the way it is implemented now, it will 
also run the auto-fixture even if you run a single unit test. This will 
dramatically slow-down any iterations you do I am afraid (and it does not 
really matter if we do it only for Providers or for whole airflow).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058530854


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   Another comment here @Taragolis . Maybe we do not need that at all. Don't we 
want to do it for ALL packages that are "leaves" - not only for providers 
(including all airflow packages)? I think that would be a nice compromise 
between isolation and speed of execution



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058530854


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   Another comment here @Taragolis . Maybe we do not need that at all. Don't we 
want to do it for ALL packages that are "leaves" - not only for providers? I think that would be a nice compromise between isolation and speed of execution



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on a diff in pull request #28631: Clear DB between each separate providers tests

2022-12-28 Thread GitBox


potiuk commented on code in PR #28631:
URL: https://github.com/apache/airflow/pull/28631#discussion_r1058530302


##
tests/providers/conftest.py:
##
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+import pytest
+
+from tests.test_utils import db
+
+# Providers with subpackages
+INNER_PROVIDERS = {

Review Comment:
   We can check for presence of provider.yaml. Or better - rely on 
`generated/provider_dependencies.json`  instead. the provider_dependencies.json 
is guaranteed to be updated by pre-commit and it is far easier to use than 
scanning folders, as well as it's intention is to remain stable when we 
refactor and move the providers.
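
   A rough sketch of that approach, assuming the keys of `generated/provider_dependencies.json` are the provider ids (e.g. `amazon`, `apache.hive`); the helper names below are mine, not Airflow's:

   ```python
   import json
   from pathlib import Path
   from typing import Optional, Set

   def known_providers(repo_root: Path) -> Set[str]:
       # The pre-commit-maintained file maps provider ids to dependency metadata,
       # so its keys give the full provider list without scanning folders.
       deps_file = repo_root / "generated" / "provider_dependencies.json"
       return set(json.loads(deps_file.read_text()))

   def provider_for_module(module_qualname: str, providers: Set[str]) -> Optional[str]:
       """Match the longest provider id that prefixes the dotted module path."""
       _, _, rest = module_qualname.partition("providers.")
       candidates = [p for p in providers if rest == p or rest.startswith(p + ".")]
       return max(candidates, key=len) if candidates else None
   ```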



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[airflow] branch main updated (8a23bf47a4 -> 69df1c5d9e)

2022-12-28 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 8a23bf47a4 Add note for pymssql requirement (#28625)
 add 69df1c5d9e Speed up provider validation pre-commit (#28541)

No new revisions were added by this update.

Summary of changes:
 .pre-commit-config.yaml   | 5 ++---
 scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py | 6 +-
 scripts/in_container/run_provider_yaml_files_check.py | 5 ++---
 3 files changed, 9 insertions(+), 7 deletions(-)



[GitHub] [airflow] potiuk merged pull request #28541: Speed up provider validation pre-commit

2022-12-28 Thread GitBox


potiuk merged PR #28541:
URL: https://github.com/apache/airflow/pull/28541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #28280: Inconsistent handling of newlines in tasks

2022-12-28 Thread GitBox


potiuk commented on issue #28280:
URL: https://github.com/apache/airflow/issues/28280#issuecomment-1366846230

   Yep. Let's keep it as the kind of issue that we can send anyone to if they stumble upon a similar case :).
   
   We can even start counting.
   
Number of people experiencing the less-than-stellar handling of newlines in both the browser and Jinja: :one: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


