[
https://issues.apache.org/jira/browse/IGNITE-24513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Pligin reassigned IGNITE-24513:
----------------------------------------
Assignee: Mirza Aliev
> HA: stable is not expected after recovered availability and node restarts
> --------------------------------------------------------------------------
>
> Key: IGNITE-24513
> URL: https://issues.apache.org/jira/browse/IGNITE-24513
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Assignee: Mirza Aliev
> Priority: Major
> Labels: MakeTeamcityGreenAgain, ignite-3
>
> See
> {{ItHighAvailablePartitionsRecoveryByFilterUpdateTest#testSeveralHaResetsAndSomeNodeRestart}}
> - the test that covers this scenario.
> *Precondition*
> * Create a zone in HA mode (7 nodes: A, B, C, D, E, F, G) - phase 1
> * Insert data and wait for replication to all nodes.
> * Stop a majority of nodes (4 nodes: A, B, C, D).
> * Wait for the partition to become available on (E, F, G), no new writes - phase 2
> * Stop a majority of nodes once again (E, F).
> * Wait for the partition to become available on (G), no new writes - phase 3
> * Stop the last node, G.
> * Start one node from phase 1: A.
> * Start one node from phase 3: G.
> * Start one node from phase 2: E.
> * No data should be lost (reads from the partition on A and E must be consistent with G).
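The scenario above can be summarised as a small self-contained sketch (plain Java, not Ignite code; the phase sets and helper names are hypothetical, taken from the step list in this ticket). The key point it encodes: the three restarted nodes each come from a different phase, and G is the only one that survived both HA resets, so A and E must converge to G's state.

```java
import java.util.List;
import java.util.Set;
import java.util.LinkedHashSet;

public class HaPhasesSketch {
    // Nodes alive in each phase of the scenario (names from the ticket;
    // this models the test steps, it is not Ignite code).
    static final Set<String> PHASE_1 = new LinkedHashSet<>(List.of("A", "B", "C", "D", "E", "F", "G"));
    static final Set<String> PHASE_2 = new LinkedHashSet<>(List.of("E", "F", "G"));
    static final Set<String> PHASE_3 = new LinkedHashSet<>(List.of("G"));

    // The restart order in the test: one node from each phase.
    static Set<String> restarted() {
        return new LinkedHashSet<>(List.of("A", "G", "E"));
    }

    // G is the only node that survived both HA resets, so it holds the most
    // recent partition state; reads on A and E must agree with it.
    static boolean spansAllPhases(Set<String> nodes) {
        return nodes.stream().anyMatch(PHASE_1::contains)
            && nodes.stream().anyMatch(PHASE_2::contains)
            && nodes.stream().anyMatch(PHASE_3::contains);
    }

    public static void main(String[] args) {
        System.out.println(restarted());                 // expected stable after the restarts
        System.out.println(spansAllPhases(restarted())); // true
    }
}
```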
> *Result*
> Before the last step we check that the stable assignment is (A, G, E), but
> the check times out with stable equal to (G).
>
> *Expected result*
> Stable is (A, G, E) after A, G, and E are restarted.
> h3. Implementation notes
> First of all, for debug purposes, I would simplify the test to restart only
> A and G, and assert that stable is (A, G).
> The second thought is to check whether scale-up is scheduled after A and G
> are restarted. Also check that there are no redundant partition reset
> actions; I suspect we perform a reset after the nodes are restarted, because
> we check the majority against the replica factor rather than the actual
> stable size.
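The hypothesised bug can be illustrated with a minimal sketch (plain Java, assumed logic; the method names are hypothetical, not Ignite internals). With the zone created on 7 nodes but stable already shrunk to (G) by the earlier HA resets, a majority check against the replica factor sees 1 < 4 and triggers a redundant reset, while a check against the actual stable size sees 1 >= 1 and leaves the assignment alone.

```java
public class MajorityCheckSketch {
    // Suspected buggy check: majority derived from the zone's replica factor.
    static boolean hasMajorityByReplicaFactor(int aliveInStable, int replicaFactor) {
        return aliveInStable >= replicaFactor / 2 + 1;
    }

    // Check per the ticket's hypothesis: majority derived from the actual
    // stable assignment size left after the earlier HA resets.
    static boolean hasMajorityByStableSize(int aliveInStable, int stableSize) {
        return aliveInStable >= stableSize / 2 + 1;
    }

    public static void main(String[] args) {
        int replicaFactor = 7;  // zone was created with 7 nodes
        int stableSize = 1;     // after the second reset, stable is (G)
        int aliveInStable = 1;  // G is restarted

        System.out.println(hasMajorityByReplicaFactor(aliveInStable, replicaFactor)); // false -> redundant reset
        System.out.println(hasMajorityByStableSize(aliveInStable, stableSize));       // true  -> no reset needed
    }
}
```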
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)