[
https://issues.apache.org/jira/browse/HADOOP-19920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089029#comment-18089029
]
ASF GitHub Bot commented on HADOOP-19920:
-----------------------------------------
pan3793 opened a new pull request, #8549:
URL: https://github.com/apache/hadoop/pull/8549
### Description of PR
Fix flaky tests, such as the failures in this CI run:
https://github.com/kokonguyen191/hadoop/actions/runs/27467848524/job/81194100665
```
Error: Errors:
Error: org.apache.hadoop.metrics2.sink.TestPrometheusMetricsSink.testPublish
Error: Run 1: TestPrometheusMetricsSink.testPublish:60 » Metrics Metrics source TestMetrics already exists!
Error: Run 2: TestPrometheusMetricsSink.testPublish:60 » Metrics Metrics source TestMetrics already exists!
Error: Run 3: TestPrometheusMetricsSink.testPublish:60 » Metrics Metrics source TestMetrics already exists!
[INFO]
Error: org.apache.hadoop.metrics2.sink.TestPrometheusMetricsSink.testPublishFlush
Error: Run 1: TestPrometheusMetricsSink.testPublishFlush:159 The first metric should not exist after flushing ==> expected: <false> but was: <true>
Error: Run 2: TestPrometheusMetricsSink.testPublishFlush:137 » Metrics Metrics source TestMetrics already exists!
Error: Run 3: TestPrometheusMetricsSink.testPublishFlush:137 » Metrics Metrics source TestMetrics already exists!
[INFO]
Error: org.apache.hadoop.metrics2.sink.TestPrometheusMetricsSink.testPublishMultiple
Error: Run 1: TestPrometheusMetricsSink.testPublishMultiple:112 The expected first metric line is missing from prometheus metrics output ==> expected: <true> but was: <false>
Error: Run 2: TestPrometheusMetricsSink.testPublishMultiple:95 » Metrics Metrics source TestMetrics1 already exists!
Error: Run 3: TestPrometheusMetricsSink.testPublishMultiple:95 » Metrics Metrics source TestMetrics1 already exists!
[INFO]
[INFO]
Error: Tests run: 5385, Failures: 0, Errors: 3, Skipped: 201
```
The root cause is the refCount-based global singleton:
- `DefaultMetricsSystem.instance()` returns a JVM-global singleton shared by
all tests in hadoop-common.
- `init()` short-circuits (returns early without incrementing `refCount`) if
`monitoring` is already `true`.
- `shutdown()` only clears `allSources`/`allSinks` when `refCount` hits 0.
So if any test (one of this class's own methods, or any other metrics test in
the module) throws before reaching its inline `stop()`/`shutdown()`, the
singleton stays at `monitoring=true` with leaked sources. Every subsequent
test's `init()` then becomes a no-op, the sources never get cleared, and the
result is cascading "Metrics source TestMetrics already exists!" errors plus
stale-data assertion failures. That is exactly what the CI shows, including
on the surefire reruns.
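The failure mode can be reproduced with a small stand-alone model. The sketch below is an illustration only: `MetricsLeakDemo`, `ToyMetricsSystem`, and the two "test" methods are hypothetical names, and the class merely imitates the `init()`/`shutdown()` behavior described above rather than using Hadoop's real metrics classes. It contrasts inline cleanup, which is skipped when the test body throws, with cleanup in a `finally` block.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: names mirror the description above, but this is NOT
// Hadoop's metrics implementation and it omits sinks, timers, and JMX.
class MetricsLeakDemo {

  static final class ToyMetricsSystem {
    private static final ToyMetricsSystem INSTANCE = new ToyMetricsSystem();
    private final Map<String, Object> allSources = new HashMap<>();
    private boolean monitoring;
    private int refCount;

    static ToyMetricsSystem instance() { return INSTANCE; }

    synchronized ToyMetricsSystem init() {
      if (monitoring) {
        return this; // short-circuit: refCount is not incremented
      }
      refCount++;
      monitoring = true;
      return this;
    }

    synchronized void register(String name, Object source) {
      if (allSources.containsKey(name)) {
        throw new IllegalStateException(
            "Metrics source " + name + " already exists!");
      }
      allSources.put(name, source);
    }

    synchronized void shutdown() {
      if (refCount <= 0) {
        return; // redundant shutdown
      }
      if (--refCount > 0) {
        return; // still referenced, sources are kept
      }
      monitoring = false;
      allSources.clear();
    }

    synchronized int sourceCount() { return allSources.size(); }
  }

  // Cleanup is inline, so it is skipped whenever the body throws first.
  static void inlineCleanupTest(boolean fail) {
    ToyMetricsSystem ms = ToyMetricsSystem.instance().init();
    ms.register("TestMetrics", new Object());
    if (fail) {
      throw new AssertionError("simulated assertion failure");
    }
    ms.shutdown();
  }

  // Cleanup sits in finally, so it runs whether or not the body throws.
  static void guardedCleanupTest(boolean fail) {
    ToyMetricsSystem ms = ToyMetricsSystem.instance().init();
    try {
      ms.register("TestMetrics", new Object());
      if (fail) {
        throw new AssertionError("simulated assertion failure");
      }
    } finally {
      ms.shutdown();
    }
  }

  // Runs a test body and returns its failure message, or "ok".
  static String run(Runnable test) {
    try {
      test.run();
      return "ok";
    } catch (Throwable t) {
      return t.getMessage();
    }
  }

  public static void main(String[] args) {
    // Guarded cleanup: a failure does not poison the next test.
    System.out.println(run(() -> guardedCleanupTest(true)));  // simulated assertion failure
    System.out.println(run(() -> guardedCleanupTest(false))); // ok
    // Inline cleanup: one failure leaks the source into the next test.
    System.out.println(run(() -> inlineCleanupTest(true)));   // simulated assertion failure
    System.out.println(run(() -> inlineCleanupTest(false)));  // Metrics source TestMetrics already exists!
  }
}
```

In a JUnit test class the same guard is commonly expressed as a teardown method that runs after every test instead of a `finally` in each method. Whether this PR takes that route is not stated in the description above.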
### How was this patch tested?
Verified by passing the GitHub Actions (GHA) CI run.
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(HADOOP-19920)?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
### AI Tooling
Contains content generated by Claude Opus 4.8
> Fix flaky TestPrometheusMetricsSink
> -----------------------------------
>
> Key: HADOOP-19920
> URL: https://issues.apache.org/jira/browse/HADOOP-19920
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: test
> Reporter: Cheng Pan
> Priority: Major
>