xBis7 opened a new pull request, #68675:
URL: https://github.com/apache/airflow/pull/68675
This gap, and the need for a fix, surfaced in a review discussion on a PR
that adds new stats:
https://github.com/apache/airflow/pull/68213#discussion_r3372689365
The metrics registry YAML file is used by all the functions that apply the
legacy metric emission logic, and also for generating the metrics docs. The
YAML must therefore always stay up to date, which is why a pre-commit hook
enforces it.
Since the `stats` refactoring, all methods that delegate to a
metrics-emitting backend have been converted from class methods to
module-level functions.
The hook scanner recognizes calls of the form `<obj>.<method>`. Now that all
the methods live at the module level, they are directly importable, and it
will be very common for someone to call them directly without going through
the `stats` namespace. As a result, the hook won't identify these calls as
metric calls that need to be validated.
Here are some examples that illustrate the issue:
* Namespace access, which the scanner recognizes
```python
from airflow._shared.observability.metrics import stats
stats.incr(...)
```
* Direct access to the module-level function, which the scanner ignores
```python
from airflow._shared.observability.metrics.stats import incr
incr(...)
```
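To illustrate why the second form slips through, here is a minimal sketch of an attribute-only scanner built on Python's `ast` module. It is not the actual hook code: `METRIC_METHODS` and `find_attribute_metric_calls` are hypothetical names chosen for this example, and the set of method names is an assumption.

```python
import ast

# Assumed set of metric method names, for illustration only.
METRIC_METHODS = {"incr", "decr", "gauge", "timing", "timer"}


def find_attribute_metric_calls(source: str) -> list[str]:
    """Return the metric calls written as `<obj>.<method>(...)`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            # Only attribute access such as `stats.incr` matches here;
            # a bare `incr(...)` has an `ast.Name` as its func instead.
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in METRIC_METHODS
        ):
            found.append(ast.unparse(node.func))
    return found


namespace_source = (
    "from airflow._shared.observability.metrics import stats\n"
    "stats.incr('my_metric')\n"
)
direct_source = (
    "from airflow._shared.observability.metrics.stats import incr\n"
    "incr('my_metric')\n"
)

print(find_attribute_metric_calls(namespace_source))  # ['stats.incr']
print(find_attribute_metric_calls(direct_source))     # []
```

The source is only parsed, never imported, so the sketch runs without Airflow installed. A scanner of this shape sees nothing to validate in the second snippet, which is the gap this PR closes.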
The easiest and most efficient approach is to ban direct imports. In this
patch, the pre-commit hook script has been updated to enforce this. A direct
import is allowed only when it is wrapped in a `try-except` whose caught error
is `ImportError`, `ModuleNotFoundError`, or both. In any other case the
import is flagged.
The import inside such a `try-except` is allowed because providers use it
for back-compat checks. For example,
```python
try:
    # Check whether a module-level function from stats is importable.
    from airflow._shared.observability.metrics.stats import gauge  # noqa: F401

    stats_reference = "airflow._shared.observability.metrics.stats"
    _executor_name_tag_key = "executor_class_name"
except ImportError:
    stats_reference = "airflow.executors.base_executor.Stats"
    _executor_name_tag_key = "name"
```
https://github.com/xBis7/airflow/blob/main/providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py#L71-L79
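The exemption rule described above can be sketched with an `ast.NodeVisitor`. This is an illustrative sketch, not the hook's implementation: the names `STATS_MODULE_SUFFIX`, `DirectStatsImportFinder`, and `find_direct_stats_imports` are hypothetical, and it assumes a `try` block is exempt when at least one handler catches only `ImportError` and/or `ModuleNotFoundError`.

```python
import ast

# Hypothetical constants for this sketch.
STATS_MODULE_SUFFIX = ".observability.metrics.stats"
ALLOWED_EXCEPTIONS = {"ImportError", "ModuleNotFoundError"}


def _is_import_guard(handler: ast.ExceptHandler) -> bool:
    """True if the handler catches only ImportError/ModuleNotFoundError."""
    if handler.type is None:  # bare `except:` is not treated as a guard
        return False
    if isinstance(handler.type, ast.Tuple):
        elts = handler.type.elts
    else:
        elts = [handler.type]
    names = {e.id for e in elts if isinstance(e, ast.Name)}
    return len(names) == len(elts) and names <= ALLOWED_EXCEPTIONS


class DirectStatsImportFinder(ast.NodeVisitor):
    def __init__(self) -> None:
        self.violations: list[int] = []
        self._exempt_depth = 0

    def visit_Try(self, node: ast.Try) -> None:
        exempt = any(_is_import_guard(h) for h in node.handlers)
        if exempt:
            self._exempt_depth += 1
        # Only the `try` body is covered by the exemption.
        for child in node.body:
            self.visit(child)
        if exempt:
            self._exempt_depth -= 1
        for part in (*node.handlers, *node.orelse, *node.finalbody):
            self.visit(part)

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        is_stats = bool(node.module) and node.module.endswith(STATS_MODULE_SUFFIX)
        if is_stats and self._exempt_depth == 0:
            self.violations.append(node.lineno)


def find_direct_stats_imports(source: str) -> list[int]:
    """Return line numbers of unguarded direct imports from the stats module."""
    finder = DirectStatsImportFinder()
    finder.visit(ast.parse(source))
    return finder.violations


unguarded = "from airflow._shared.observability.metrics.stats import incr\n"
guarded = (
    "try:\n"
    "    from airflow._shared.observability.metrics.stats import gauge\n"
    "except ImportError:\n"
    "    gauge = None\n"
)
wrong_guard = guarded.replace("ImportError", "ValueError")

print(find_direct_stats_imports(unguarded))    # [1]
print(find_direct_stats_imports(guarded))      # []
print(find_direct_stats_imports(wrong_guard))  # [2]
```

Matching on the module suffix rather than the full path is a choice made for this sketch, so that the same check would apply under any parent package; the namespace import `from <parent>.observability.metrics import stats` does not match the suffix and is never flagged.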
## Testing
I added unit tests that cover all the new functionality. I also added some
violating imports to a few files and ran the hook. This is what the error
output looks like:
```bash
> prek run check-metrics-synced-with-registry --all-files
Check that metrics in the codebase are in sync with the metrics registry YAML file........................Failed
- hook id: check-metrics-synced-with-registry
- exit code: 1
Found 1 violation(s).
-> 1 direct import(s) of stats functions (use the `stats` namespace instead):
airflow-core/src/airflow/jobs/scheduler_job_runner.py line 59:
from airflow._shared.observability.metrics.stats import incr
Replace direct imports with namespace access: `from <parent>.observability.metrics import stats` and call `stats.<method>(...)`.
Imports inside a `try` block that catches `ImportError` are exempt (back-compat checks).
```
---
##### Was generative AI tooling used to co-author this PR?
- [X] Yes (please specify the tool below)
Claude code, Opus 4.7