xiangfu0 opened a new pull request, #18253:
URL: https://github.com/apache/pinot/pull/18253

   ## Summary
   
   - Fixes flaky 
`MergeRollupMinionClusterIntegrationTest.testRealtimeTableProcessAllModeMultiLevelConcat`,
 which occasionally fails at `assertTrue(MetricValueUtils.gaugeExists(...))` 
for `mergeRollupTaskNumBucketsToProcess.myTable6_REALTIME.100days`.
   - Root cause: the `mergeRollupTaskNumBucketsToProcess.*` gauges are 
(re)registered and updated only when `PinotTaskManager.scheduleTasks` runs for 
a merge level with no in-flight task. The per-iteration check in the test races 
with (a) the in-flight task's Helix `COMPLETED` transition and (b) the 
segment-lineage commit that follows — producing either a stale value or, more 
rarely, a missed gauge registration.
   - Fix: extract the gauge check into `waitForExpectedNumBucketsToProcess`, 
which polls via `TestUtils.waitForCondition` until both gauges exist and their 
values match the expected tuple. This absorbs the short race window and 
replaces the duplicated inline assertions in both for-loops.
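
The polling approach described in the fix can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in this PR: `GaugePollingSketch`, the in-memory `GAUGES` map, `gaugeHasValue`, `publishAfterDelay`, the `exampleLevel` gauge suffix, and the helper's parameter list are all hypothetical, and the local `waitForCondition` only approximates the poll-until-true-or-timeout behavior of a helper such as Pinot's `TestUtils.waitForCondition`, whose actual signature is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

class GaugePollingSketch {
  // Hypothetical stand-in for the controller's gauge registry (gauge name -> latest value).
  static final Map<String, Long> GAUGES = new ConcurrentHashMap<>();

  // Re-evaluates the condition every intervalMs until it holds or timeoutMs elapses.
  static void waitForCondition(BooleanSupplier condition, long intervalMs, long timeoutMs, String message) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return;
      }
      if (System.nanoTime() >= deadline) {
        throw new AssertionError(message);
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new AssertionError("Interrupted: " + message, e);
      }
    }
  }

  // True only if the gauge is registered AND currently holds the expected value.
  static boolean gaugeHasValue(String name, long expected) {
    Long actual = GAUGES.get(name);
    return actual != null && actual == expected;
  }

  // Assumed shape of the extracted helper: wait until both gauges exist and match.
  static void waitForExpectedNumBucketsToProcess(String gaugeA, long expectedA, String gaugeB, long expectedB,
      long timeoutMs) {
    waitForCondition(() -> gaugeHasValue(gaugeA, expectedA) && gaugeHasValue(gaugeB, expectedB), 50L, timeoutMs,
        "Gauges did not reach expected values: " + gaugeA + "=" + expectedA + ", " + gaugeB + "=" + expectedB);
  }

  // Simulates the scheduler (re)registering a gauge some time after the test starts checking.
  static Thread publishAfterDelay(String name, long value, long delayMs) {
    Thread t = new Thread(() -> {
      try {
        Thread.sleep(delayMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      GAUGES.put(name, value);
    });
    t.setDaemon(true);
    t.start();
    return t;
  }

  public static void main(String[] args) {
    String prefix = "mergeRollupTaskNumBucketsToProcess.myTable6_REALTIME.";
    // "exampleLevel" is a made-up merge level used only for this illustration.
    publishAfterDelay(prefix + "100days", 1L, 100L);
    publishAfterDelay(prefix + "exampleLevel", 2L, 200L);
    waitForExpectedNumBucketsToProcess(prefix + "100days", 1L, prefix + "exampleLevel", 2L, 5_000L);
    System.out.println("both gauges reached expected values");
  }
}
```

Checking existence and value inside a single polled condition means a gauge that is registered late, or that still briefly holds a stale value, simply triggers another poll instead of failing a one-shot `assertTrue`.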
   
   ## Test plan
   
   - [x] `./mvnw test-compile -pl pinot-integration-tests` passes.
   - [x] `./mvnw spotless:apply checkstyle:check license:format license:check 
-pl pinot-integration-tests` clean.
   - [ ] CI runs 
`MergeRollupMinionClusterIntegrationTest.testRealtimeTableProcessAllModeMultiLevelConcat`
 successfully across multiple runs.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

