jason810496 commented on code in PR #64209:
URL: https://github.com/apache/airflow/pull/64209#discussion_r2998790109


##########
shared/providers_discovery/src/airflow_shared/providers_discovery/providers_discovery.py:
##########
@@ -29,12 +30,11 @@
 from time import perf_counter
 from typing import Any, NamedTuple, ParamSpec, Protocol, cast
 
-import structlog
 from packaging.utils import canonicalize_name
 
 from ..module_loading import entry_points_with_dist
 
-log = structlog.getLogger(__name__)
+log = logging.getLogger(__name__)

Review Comment:
   I changed it intentionally. I found that `structlog.getLogger(__name__)` in this case does not respect the `AIRFLOW__LOGGING__LOGGING_LEVEL` log level; switching from `structlog` to `logging` makes the logging respect `AIRFLOW__LOGGING__LOGGING_LEVEL` again.
   
   Otherwise I get debug logging from `airflow config list --default` even when I set `AIRFLOW__LOGGING__LOGGING_LEVEL=WARNING`, which caused the `prek run check-default-configuration --all-files` failure in the previous CI run.
   
   The output of `airflow config list --default` before this change shows debug logging that does not respect `AIRFLOW__LOGGING__LOGGING_LEVEL` and causes a parsing error in the subsequent `airflow config lint`:
   
   ```
   Generated default config for debugging:
   
   2026-03-27T03:41:39.306693Z [debug    ] Setting up DB connection pool (PID 
21) [airflow.settings] loc=settings.py:401
   2026-03-27T03:41:39.497327Z [debug    ] Initializing Provider 
Manager[provider_configs] 
[airflow.sdk._shared.providers_discovery.providers_discovery] 
loc=providers_discovery.py:285
   2026-03-27T03:41:39.497432Z [debug    ] Initializing Provider Manager[list] 
[airflow.sdk._shared.providers_discovery.providers_discovery] 
loc=providers_discovery.py:285
   2026-03-27T03:41:39.508459Z [debug    ] Loading 
EntryPoint(name='provider_info', 
value='airflow.providers.ssh.get_provider_info:get_provider_info', 
group='apache_airflow_provider') from package apache-airflow-providers-ssh 
[airflow.sdk._shared.providers_discovery.providers_discovery] 
loc=providers_discovery.py:322
   2026-03-27T03:41:39.509265Z [debug    ] Loading 
EntryPoint(name='provider_info', 
value='airflow.providers.papermill.get_provider_info:get_provider_info', 
group='apache_airflow_provider') from package 
apache-airflow-providers-papermill 
[airflow.sdk._shared.providers_discovery.providers_discovery] 
loc=providers_discovery.py:322
   # ...
   
   [core]
   # The folder where your airflow pipelines live, most likely a
   # subfolder in a code repository. This path must be absolute.
   #
   # Variable: AIRFLOW__CORE__DAGS_FOLDER
   #
   # dags_folder = 
   ```
   
   The output of `airflow config list --default` after this change: no more debug logging, and `AIRFLOW__LOGGING__LOGGING_LEVEL` is respected.
   
   ```
   [core]
   # The folder where your airflow pipelines live, most likely a
   # subfolder in a code repository. This path must be absolute.
   #
   # Variable: AIRFLOW__CORE__DAGS_FOLDER
   #
   # dags_folder = 
   ```
   
   
   ---
   
   Root cause: `providers_discovery.py` used `structlog.getLogger()` directly. Before `structlog.configure()` is called (which happens later, in `settings.py:726`), structlog's default `PrintLogger` writes to stdout with no level filtering, so debug logs emitted during early provider discovery pollute the stdout of `airflow config list --default` and corrupt the generated config file.
   
   Fix: switched to the stdlib `logging.getLogger()`. Stdlib logging defaults to the WARNING level and writes to stderr, so debug logs are suppressed and stdout stays clean. This is also the correct pattern for shared library code: structlog configuration is the application's responsibility.
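   
   A minimal stdlib-only sketch of why the switch keeps stdout clean. With no configuration, an unconfigured stdlib logger drops records below WARNING, and anything that does get through goes to stderr via the "last resort" handler rather than stdout. The logger name mirrors the module path from the logs above; the messages are illustrative:
   
   ```python
   import io
   import logging
   from contextlib import redirect_stderr, redirect_stdout
   
   # Module-level logger, same pattern as in providers_discovery.py after the change.
   log = logging.getLogger("airflow.sdk._shared.providers_discovery")
   
   out, err = io.StringIO(), io.StringIO()
   with redirect_stdout(out), redirect_stderr(err):
       # Below the default WARNING level: dropped before reaching any handler.
       log.debug("Initializing Provider Manager[list]")
       # At WARNING: emitted via logging.lastResort, which writes to sys.stderr.
       log.warning("example warning message")
   
   print("stdout:", repr(out.getvalue()))  # -> stdout: ''
   print("stderr:", repr(err.getvalue()))  # -> stderr: 'example warning message\n'
   ```
   
   Since `airflow config list --default` only consumes stdout, the generated config stays parseable even if a warning fires.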



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
