xiangfu0 opened a new pull request, #18253: URL: https://github.com/apache/pinot/pull/18253
## Summary

- Fixes flaky `MergeRollupMinionClusterIntegrationTest.testRealtimeTableProcessAllModeMultiLevelConcat`, which occasionally fails at `assertTrue(MetricValueUtils.gaugeExists(...))` for `mergeRollupTaskNumBucketsToProcess.myTable6_REALTIME.100days`.
- Root cause: the `mergeRollupTaskNumBucketsToProcess.*` gauges are (re)registered and updated only when `PinotTaskManager.scheduleTasks` runs for a merge level with no in-flight task. The per-iteration check in the test races with (a) the in-flight task's Helix `COMPLETED` transition and (b) the segment-lineage commit that follows, producing either a stale value or, more rarely, a missed gauge registration.
- Fix: extract the gauge check into `waitForExpectedNumBucketsToProcess`, which polls via `TestUtils.waitForCondition` until both gauges exist and their values match the expected tuple. This absorbs the short race window and replaces the duplicated inline assertions in both for-loops.

## Test plan

- [x] `./mvnw test-compile -pl pinot-integration-tests` passes.
- [x] `./mvnw spotless:apply checkstyle:check license:format license:check -pl pinot-integration-tests` clean.
- [ ] CI runs `MergeRollupMinionClusterIntegrationTest.testRealtimeTableProcessAllModeMultiLevelConcat` successfully across multiple runs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
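
For reviewers, the polling approach described in the Summary can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `GaugePollingSketch`, the `GAUGES` map (a stand-in for the metrics registry), and the local `waitForCondition` (a stand-in for `TestUtils.waitForCondition`) are all hypothetical names chosen for illustration. The idea is to retry "gauge exists and has the expected value" until a deadline instead of asserting once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Illustrative only: every name in this class is an assumption, not Pinot code.
final class GaugePollingSketch {
  // Hypothetical stand-in for a metrics registry: gauge name -> current value.
  static final Map<String, Long> GAUGES = new ConcurrentHashMap<>();

  // Simplified stand-in for a wait-for-condition test utility.
  // Re-evaluates the condition every intervalMs until it holds or timeoutMs elapses.
  static boolean waitForCondition(BooleanSupplier condition, long intervalMs, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One final check at the deadline.
    return condition.getAsBoolean();
  }

  // True only if every expected gauge is registered AND holds the expected value.
  // A missing gauge and a stale value are both treated as "not ready yet".
  static boolean gaugesMatch(Map<String, Long> expected) {
    for (Map.Entry<String, Long> entry : expected.entrySet()) {
      Long actual = GAUGES.get(entry.getKey());
      if (actual == null || !actual.equals(entry.getValue())) {
        return false;
      }
    }
    return true;
  }

  // Polls until all expected gauges match, absorbing a short race window.
  static boolean waitForExpectedGauges(Map<String, Long> expected, long timeoutMs) {
    return waitForCondition(() -> gaugesMatch(expected), 50L, timeoutMs);
  }
}
```

Checking existence and value inside the same polled condition matters here: a one-shot existence assertion followed by a value assertion can still fail if the gauge is re-registered between the two reads, whereas a combined predicate simply retries.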
