Kirill Tkalenko created IGNITE-25501:
----------------------------------------
Summary: Incorrect partition state when entering node with index
greater than current majority after snapshot
Key: IGNITE-25501
URL: https://issues.apache.org/jira/browse/IGNITE-25501
Project: Ignite
Issue Type: Bug
Reporter: Kirill Tkalenko
When analyzing IGNITE-24802, it was discovered that if a snapshot is taken
before stopping the partition leader, raft log suffix truncation is no longer
possible on that node. When the node then returns, the logs show the message
"FATAL ERROR: Can't truncate logs before appliedId=LogId [index=26, term=2],
lastIndexKept=0", yet the partition is reported as healthy and it is possible
to read records from it that should not be there. This needs to be fixed.
One option is simply to put the partition into an error state so that the user
can then resolve the situation using the disaster recovery mechanism.
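The idea can be sketched roughly as follows. This is only an illustration, not
Ignite or JRaft code: the class, the LogId record, the PartitionState enum and
the method name are all hypothetical, and the condition assumes the message
quoted above is logged when lastIndexKept is lower than the applied index.

```java
/** Illustration only: every name here is hypothetical, not Ignite or JRaft API. */
final class TruncationGuardSketch {
    /** Stand-in for a raft log identifier (index and term). */
    record LogId(long index, long term) {}

    /** Stand-in for a local partition state. */
    enum PartitionState { HEALTHY, BROKEN }

    /**
     * Decides which state to report for a returning node.
     *
     * Assumption: suffix truncation is refused when it would cut below the
     * already applied index, i.e. when lastIndexKept < appliedId.index().
     * In that case the partition is reported as BROKEN instead of HEALTHY.
     */
    static PartitionState stateAfterRestart(LogId appliedId, long lastIndexKept) {
        boolean truncationRefused = lastIndexKept < appliedId.index();

        return truncationRefused ? PartitionState.BROKEN : PartitionState.HEALTHY;
    }

    public static void main(String[] args) {
        // Values taken from the log message in this issue.
        System.out.println(stateAfterRestart(new LogId(26, 2), 0));
    }
}
```

With the values from the log message (index=26, lastIndexKept=0) the sketch
reports the partition as broken, leaving the user to repair it through disaster
recovery.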
To reproduce this, take a raft snapshot in
*org.apache.ignite.internal.ItTruncateRaftLogAndRestartNodesTest#enterNodeWithIndexGreaterThanCurrentMajority*
before stopping the node with index "2", for example via
*org.apache.ignite.internal.replicator.Replica#createSnapshotOn*.
It is suggested to add a new test for this behavior, since the current test
has another problem. The partition state can be obtained through
*org.apache.ignite.internal.table.distributed.disaster.DisasterRecoveryManager#localTablePartitionStates*.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)