[ https://issues.apache.org/jira/browse/IGNITE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-21194:
----------------------------------
    Description:
The scenario of this test includes altering the distribution zone. But the subsequent notification about stable assignments at the end of rebalance happens twice on the same node, with the same assignments. As a result, redundant partitions are stopped and their storages are deleted while handling the first event, and they are not found while handling the second one, which causes exceptions.

We should investigate why there are 2 records in meta storage about the same stable assignments with different revisions.

It seems that the second stable assignments change is triggered by the rebalance raft configuration listener ( RebalanceRaftGroupEventsListener#doOnNewPeersConfigurationApplied ), which is triggered when the configuration is changed by a new leader election:
{code:java}
[2024-01-05T19:18:36,891][INFO ][%iinrt_dosor_1%rebalance-scheduler-0][RebalanceRaftGroupEventsListener] New leader elected. Going to apply new configuration [tablePartitionId=6_part_0, peers=[iinrt_dosor_1], learners=[]]{code}

  was:
The scenario of this test includes altering the distribution zone. But the subsequent notification about stable assignments at the end of rebalance happens twice on the same node, with the same assignments. As a result, redundant partitions are stopped and their storages are deleted while handling the first event, and they are not found while handling the second one, which causes exceptions.

We should investigate why there are 2 records in meta storage about the same stable assignments with different revisions.

> StorageException in ItIgniteNodeRestartTest#destroyObsoleteStoragesOnRestart
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-21194
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21194
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>
> The scenario of this test includes altering the distribution zone. But the
> subsequent notification about stable assignments at the end of rebalance
> happens twice on the same node, with the same assignments. As a result,
> redundant partitions are stopped and their storages are deleted while
> handling the first event, and they are not found while handling the second
> one, which causes exceptions.
> We should investigate why there are 2 records in meta storage about the same
> stable assignments with different revisions.
> It seems that the second stable assignments change is triggered by the
> rebalance raft configuration listener (
> RebalanceRaftGroupEventsListener#doOnNewPeersConfigurationApplied ), which is
> triggered when the configuration is changed by a new leader election:
> {code:java}
> [2024-01-05T19:18:36,891][INFO ][%iinrt_dosor_1%rebalance-scheduler-0][RebalanceRaftGroupEventsListener] New leader elected. Going to apply new configuration [tablePartitionId=6_part_0, peers=[iinrt_dosor_1], learners=[]]{code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
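
The failure mode described in this issue (the same stable assignments delivered twice under different revisions, with the second delivery failing because the storage is already gone) can be modelled in a few lines. The sketch below is not Ignite code: the class `StableAssignmentsSketch`, its methods, and the node name `iinrt_dosor_0` are hypothetical names chosen for illustration. It only shows one possible defensive measure, comparing the incoming assignments with the last ones handled and skipping a repeat, and it does not address the root cause the issue asks to investigate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model (not Ignite code) of handling a repeated stable-assignments event. */
class StableAssignmentsSketch {
    /** Partition id -> node names from the last stable assignments handled for it. */
    private final Map<String, Set<String>> lastHandled = new HashMap<>();

    /** Partition ids whose storage currently exists on this node. */
    private final Set<String> storages = new HashSet<>();

    /** Name of the node this handler runs on (assumed value supplied by the caller). */
    private final String localNode;

    StableAssignmentsSketch(String localNode) {
        this.localNode = localNode;
    }

    void createStorage(String partId) {
        storages.add(partId);
    }

    boolean hasStorage(String partId) {
        return storages.contains(partId);
    }

    /**
     * Handles a stable-assignments update for one partition.
     *
     * The revision is deliberately left out of the comparison: the two records
     * described in the issue differ only in revision, so comparing it would not
     * detect the duplicate.
     *
     * @return true if the local storage was destroyed by this call.
     */
    boolean onStableAssignments(String partId, Set<String> assignments, long revision) {
        Set<String> previous = lastHandled.put(partId, new HashSet<>(assignments));

        if (assignments.equals(previous)) {
            // Same assignments as last time: nothing new to apply, skip the event.
            return false;
        }

        if (!assignments.contains(localNode) && storages.contains(partId)) {
            // This node is no longer assigned to the partition: drop its storage.
            storages.remove(partId);
            return true;
        }

        return false;
    }
}
```

With this guard, a second call carrying identical assignments returns without touching the storage, so the "storage not found" path is never reached. Whether such a check belongs in the handler, or whether the duplicate meta storage record should be prevented at its source, is left open here.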