[ 
https://issues.apache.org/jira/browse/IGNITE-20996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov updated IGNITE-20996:
----------------------------------
    Description: 
A rebalance is considered successful after a successful raft configuration 
change and a meta storage invoke that replaces the stable assignments with the 
pending ones. The following line also appears in the log:

 
{code:java}
[2023-11-30T09:20:25,028][INFO ][%iinrt_tcg_0%rebalance-scheduler-1][RebalanceRaftGroupEventsListener] Rebalance finished [tablePartitionId=9_part_1, appliedPeers=[Assignment [consistentId=iinrt_tcg_0, isPeer=true], Assignment [consistentId=iinrt_tcg_1, isPeer=true], Assignment [consistentId=iinrt_tcg_2, isPeer=true], Assignment [consistentId=iinrt_tcg_3, isPeer=true]]]{code}
 
However, if errors occurred in the FSM during replication, the result of the 
rebalance looks the same, even though the data was not actually replicated.

There are probably problems in raft that allow the configuration change event 
to be triggered in spite of errors in the state machine.

  was:
The successful rebalance happens after successful raft configuration change and 
invoke to meta storage with replacing stable assignments with pending ones. 
Also there is the following line in the logs:

 
{code:java}
[2023-11-30T09:20:25,028][INFO 
][%iinrt_tcg_0%rebalance-scheduler-1][RebalanceRaftGroupEventsListener] 
Rebalance finished [tablePartitionId=9_part_1, appliedPeers=[Assignment 
[consistentId=iinrt_tcg_0, isPeer=true], Assignment [consistentId=iinrt_tcg_1, 
isPeer=true], Assignment [consistentId=iinrt_tcg_2, isPeer=true], Assignment 
[consistentId=iinrt_tcg_3, isPeer=true]]]{code}
 
But in case when there were errors on FSM on replication, the result of the 
rebalance looks the same, in spite the data was not actually replicated.

Probably there are some problems in raft that allow triggering the 
configuration change event in spite of errors in state machine.


> Rebalance can be considered as successful while actual data replication failed
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-20996
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20996
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>
> A rebalance is considered successful after a successful raft configuration 
> change and a meta storage invoke that replaces the stable assignments with 
> the pending ones. The following line also appears in the log:
>  
> {code:java}
> [2023-11-30T09:20:25,028][INFO ][%iinrt_tcg_0%rebalance-scheduler-1][RebalanceRaftGroupEventsListener] Rebalance finished [tablePartitionId=9_part_1, appliedPeers=[Assignment [consistentId=iinrt_tcg_0, isPeer=true], Assignment [consistentId=iinrt_tcg_1, isPeer=true], Assignment [consistentId=iinrt_tcg_2, isPeer=true], Assignment [consistentId=iinrt_tcg_3, isPeer=true]]]{code}
>  
> However, if errors occurred in the FSM during replication, the result of the 
> rebalance looks the same, even though the data was not actually replicated.
> There are probably problems in raft that allow the configuration change 
> event to be triggered in spite of errors in the state machine.
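
The distinction described in the issue can be illustrated with a small, 
self-contained sketch. It is not Ignite or raft code: the class, the PeerState 
type and both check methods are hypothetical names introduced only for 
illustration. It assumes each peer can report whether the configuration change 
was applied and whether its state machine hit an error during replication, and 
contrasts a completion check that looks only at the configuration change with 
one that also requires healthy state machines.

```java
/** Hypothetical model of a rebalance completion check; not Ignite's actual API. */
final class RebalanceCompletionSketch {
    /** Per-peer outcome as modelled by this sketch (assumed fields, for illustration only). */
    static final class PeerState {
        final String consistentId;
        final boolean configApplied;      // the raft configuration change reached this peer
        final boolean stateMachineFailed; // the FSM reported an error while replicating data

        PeerState(String consistentId, boolean configApplied, boolean stateMachineFailed) {
            this.consistentId = consistentId;
            this.configApplied = configApplied;
            this.stateMachineFailed = stateMachineFailed;
        }
    }

    /** Check that considers only the configuration change, as in the reported behaviour. */
    static boolean finishedByConfigOnly(PeerState... peers) {
        for (PeerState peer : peers) {
            if (!peer.configApplied) {
                return false;
            }
        }
        return true;
    }

    /** Stricter check: every peer applied the configuration and no state machine failed. */
    static boolean finishedWithHealthyStateMachines(PeerState... peers) {
        for (PeerState peer : peers) {
            if (!peer.configApplied || peer.stateMachineFailed) {
                return false;
            }
        }
        return true;
    }
}
```

With one peer whose state machine failed, the first check still reports the 
rebalance as finished while the second does not, which is the gap this issue 
describes. Where such a check would actually belong (in raft itself or in the 
rebalance listener) is left open here.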



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
