[
https://issues.apache.org/jira/browse/IGNITE-28830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anton Vinogradov updated IGNITE-28830:
--------------------------------------
Ignite Flags: (was: Release Notes Required)
> Flaky CacheGroupsMetricsRebalanceTest#testRebalanceProgressUnderLoad:
> EVT_CACHE_REBALANCE_STOPPED listener is registered after the joining node
> starts rebalancing
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-28830
> URL: https://issues.apache.org/jira/browse/IGNITE-28830
> Project: Ignite
> Issue Type: Bug
> Reporter: Anton Vinogradov
> Priority: Major
> Fix For: 2.19
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> {{testRebalanceProgressUnderLoad()}} registers an
> {{EVT_CACHE_REBALANCE_STOPPED}} listener via
> {{ignite.events().localListen(...)}} *after* {{startGrid(4)}}. The joining
> node begins rebalancing during {{start()}}, so the event can fire before the
> listener is registered. The event is then missed, the {{CountDownLatch}} is
> never counted down, and the unbounded {{latch.await()}} hangs until the 600s
> test timeout.
> Reproduces in ~7% of runs; a thread dump shows the test runner parked in
> {{latch.await()}}.
> h3. Fix
> Register the listener through node configuration
> ({{IgniteConfiguration.setLocalEventListeners}}) so it is active before the
> node joins and starts rebalancing. Both paths funnel into
> {{GridEventStorageManager.registerListener}} and neither enables the event
> type, so delivery is identical; the only difference is that the listener is
> active earlier.
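
The ordering problem described above can be illustrated without Ignite. The following is a minimal sketch, not the actual test or Ignite code: {{ToyNode}}, {{RebalanceListenerRace}} and their methods are hypothetical names invented for the illustration, and the toy node deterministically fires its single "rebalance stopped" event inside {{start()}} (the worst case for a late listener). It also uses a bounded {{latch.await(timeout, unit)}}, which is an assumption beyond the fix described here, so that a missed event shows up as {{false}} instead of a hang.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a joining node. Hypothetical; this is NOT Ignite code. */
class ToyNode {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    /** Listeners passed here play the role of listeners supplied via node configuration. */
    ToyNode(Runnable... configuredListeners) {
        listeners.addAll(Arrays.asList(configuredListeners));
    }

    /** Plays the role of registering a listener on an already started node. */
    void listen(Runnable listener) {
        listeners.add(listener);
    }

    /** Fires the single "rebalance stopped" event before returning. */
    void start() {
        for (Runnable listener : listeners)
            listener.run();
    }
}

class RebalanceListenerRace {
    /** Bounded wait: returns false on timeout instead of blocking forever. */
    private static boolean awaitBriefly(CountDownLatch latch) {
        try {
            return latch.await(200, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Flaky ordering: the node starts first, the listener is attached afterwards. */
    static boolean lateRegistration() {
        CountDownLatch latch = new CountDownLatch(1);
        ToyNode node = new ToyNode();

        node.start();                  // event fires here, nobody is listening
        node.listen(latch::countDown); // too late, the event is already gone

        return awaitBriefly(latch);
    }

    /** Fixed ordering: the listener is part of the configuration the node starts with. */
    static boolean configuredRegistration() {
        CountDownLatch latch = new CountDownLatch(1);
        ToyNode node = new ToyNode(latch::countDown);

        node.start();                  // event fires here and counts the latch down

        return awaitBriefly(latch);
    }

    public static void main(String[] args) {
        System.out.println("late listener saw event: " + lateRegistration());
        System.out.println("configured listener saw event: " + configuredRegistration());
    }
}
```

In the real test the event is only sometimes missed, because rebalancing runs concurrently with the test thread; the toy makes the miss deterministic to keep the ordering visible. Only the moment of registration differs between the two methods, mirroring the note that delivery itself is unchanged.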