jason810496 commented on code in PR #64209:
URL: https://github.com/apache/airflow/pull/64209#discussion_r2998790109
##########
shared/providers_discovery/src/airflow_shared/providers_discovery/providers_discovery.py:
##########
@@ -29,12 +30,11 @@
from time import perf_counter
from typing import Any, NamedTuple, ParamSpec, Protocol, cast
-import structlog
from packaging.utils import canonicalize_name
from ..module_loading import entry_points_with_dist
-log = structlog.getLogger(__name__)
+log = logging.getLogger(__name__)
Review Comment:
I changed it intentionally. I found that `structlog.getLogger(__name__)`
does not respect the `AIRFLOW__LOGGING__LOGGING_LEVEL` log level in this case;
switching from `structlog` to `logging` makes the logging respect
`AIRFLOW__LOGGING__LOGGING_LEVEL` again.
Otherwise I get debug logging from `airflow config list --default` even when I
set `AIRFLOW__LOGGING__LOGGING_LEVEL=WARNING`, which caused `prek run
check-default-configuration --all-files` to fail in the previous CI run.
The output of `airflow config list --default` before this change: it shows
debug logging that does not respect `AIRFLOW__LOGGING__LOGGING_LEVEL` and
causes a parsing error in the subsequent `airflow config lint`.
```
Generated default config for debugging:
2026-03-27T03:41:39.306693Z [debug ] Setting up DB connection pool (PID 21) [airflow.settings] loc=settings.py:401
2026-03-27T03:41:39.497327Z [debug ] Initializing Provider Manager[provider_configs] [airflow.sdk._shared.providers_discovery.providers_discovery] loc=providers_discovery.py:285
2026-03-27T03:41:39.497432Z [debug ] Initializing Provider Manager[list] [airflow.sdk._shared.providers_discovery.providers_discovery] loc=providers_discovery.py:285
2026-03-27T03:41:39.508459Z [debug ] Loading EntryPoint(name='provider_info', value='airflow.providers.ssh.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ssh [airflow.sdk._shared.providers_discovery.providers_discovery] loc=providers_discovery.py:322
2026-03-27T03:41:39.509265Z [debug ] Loading EntryPoint(name='provider_info', value='airflow.providers.papermill.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-papermill [airflow.sdk._shared.providers_discovery.providers_discovery] loc=providers_discovery.py:322
# ...
[core]
# The folder where your airflow pipelines live, most likely a
# subfolder in a code repository. This path must be absolute.
#
# Variable: AIRFLOW__CORE__DAGS_FOLDER
#
# dags_folder =
```
The output of `airflow config list --default` after this change: no more
debug logging, and `AIRFLOW__LOGGING__LOGGING_LEVEL` is respected.
```
[core]
# The folder where your airflow pipelines live, most likely a
# subfolder in a code repository. This path must be absolute.
#
# Variable: AIRFLOW__CORE__DAGS_FOLDER
#
# dags_folder =
```
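To illustrate why the stray log lines break the follow-up step, here is a
minimal sketch that assumes the consumer parses the output as INI. The stdlib
`configparser` is only a stand-in here (it is not necessarily what
`airflow config lint` uses internally), and `parse_sections` is a hypothetical
helper. The samples are abbreviated from the two outputs above.

```python
import configparser

# Abbreviated from the "before" output: a log line precedes the first section.
POLLUTED = """\
2026-03-27T03:41:39.306693Z [debug ] Setting up DB connection pool (PID 21) [airflow.settings] loc=settings.py:401
[core]
# dags_folder =
"""

# Abbreviated from the "after" output: only config content on stdout.
CLEAN = """\
[core]
# dags_folder =
"""


def parse_sections(text):
    """Hypothetical helper: return section names, or None if parsing fails."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.MissingSectionHeaderError:
        # Any non-comment line before the first [section] header is rejected.
        return None
    return parser.sections()


clean_sections = parse_sections(CLEAN)
polluted_sections = parse_sections(POLLUTED)
```

In this sketch the clean text parses into a single `core` section, while the
polluted text is rejected because content appears before any section header.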
---
Root cause: `providers_discovery.py` used `structlog.getLogger()` directly.
Before `structlog.configure()` is called (which happens later, at
`settings.py:726`), structlog's default `PrintLogger` writes to stdout with no
level filtering. So debug logs emitted during early provider discovery
pollute the stdout of `airflow config list --default`, corrupting the
generated config file.
Fix: Switched to `logging.getLogger()` (stdlib). With no handlers configured,
stdlib logging falls back to WARNING level and writes to stderr, so debug logs
are suppressed and stdout stays clean. This is also the correct pattern for
shared library code: structlog configuration is the application's
responsibility.
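The stdlib behaviour described above can be checked with a small sketch. It
runs a child interpreter with no logging configuration at all and captures
both streams separately. The logger name `hypothetical.shared.lib` and the
printed `[core]` line are illustrative only and are not taken from Airflow.

```python
import subprocess
import sys

# Child script: stdlib logging only, no handlers or levels configured.
CHILD = """
import logging

log = logging.getLogger("hypothetical.shared.lib")
log.debug("debug detail")               # below the default WARNING level
log.warning("something worth noticing")  # handled by the last-resort handler
print("[core]")                          # the real payload, written to stdout
"""


def run_child():
    """Run the child script and return its (stdout, stderr) as text."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout, result.stderr


stdout, stderr = run_child()
```

Here stdout should carry only the `[core]` line, the warning should land on
stderr, and the debug message should not appear on either stream.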