[
https://issues.apache.org/jira/browse/IGNITE-27097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Puchkovskiy updated IGNITE-27097:
---------------------------------------
Description:
When a replica starts accepting a Raft snapshot, we set lastAppliedIndex on
all storages of the replica to -1 (aka REBALANCE_IN_PROGRESS).
When recovering the corresponding table in TableManager, we check whether any
of its storages has lastAppliedIndex set to REBALANCE_IN_PROGRESS. If so, we
conclude that a rebalance was started but did not complete, so we clear the
storages.
In per-zone mode (aka colocation mode), however, we don't check for
REBALANCE_IN_PROGRESS in the tx state storage of the zone partition replica
that is being started. This should be fixed.
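As an illustration only, the intended check could look roughly like the sketch
below. The class and interface names are hypothetical stand-ins, not the actual
Ignite storage API; the only thing taken from this ticket is that
REBALANCE_IN_PROGRESS corresponds to lastAppliedIndex == -1 and that the tx
state storage should be inspected along with the MV storages.

```java
import java.util.List;

/** Hypothetical sketch; none of these types are Ignite's real API. */
public class RebalanceRecoverySketch {
    /** From the ticket: lastAppliedIndex of -1 marks a rebalance in progress. */
    static final long REBALANCE_IN_PROGRESS = -1L;

    /** Assumed minimal view of a storage (MV storage or tx state storage). */
    interface PartitionStorageView {
        long lastAppliedIndex();
    }

    /**
     * Returns true if any storage of the replica still carries the marker.
     * The caller is expected to pass the tx state storage of the zone
     * partition together with the MV storages, so that none is skipped.
     */
    static boolean needsCleanup(List<? extends PartitionStorageView> storages) {
        return storages.stream()
                .anyMatch(s -> s.lastAppliedIndex() == REBALANCE_IN_PROGRESS);
    }
}
```

The point of the sketch is that a marker on the tx state storage alone is
enough to trigger cleanup of the replica's storages.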
There is another potential problem. Suppose we find that the MV storage's
index is REBALANCE_IN_PROGRESS while the tx state storage's is not. We start
cleaning both storages. The MV storage gets cleaned and the result is
persisted, but the tx state storage does not; then the node loses power. After
restart, the tx state storage is still not cleaned, but the marker telling us
that the replica's storages need to be cleaned up is gone.
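One possible way to avoid losing the marker, given here purely as an
assumption and not as something this ticket prescribes, is to persist
REBALANCE_IN_PROGRESS on every storage of the replica before cleaning any of
them. The sketch below simulates this with a hypothetical in-memory FakeStorage
(not an Ignite class); the value 0 written by clear() is also an assumption.

```java
import java.util.List;

/** Hypothetical simulation; FakeStorage is not an Ignite class. */
public class CrashSafeCleanupSketch {
    static final long REBALANCE_IN_PROGRESS = -1L;

    /** In-memory stand-in for a persisted MV or tx state storage. */
    static final class FakeStorage {
        long lastAppliedIndex;
        boolean hasData = true;

        FakeStorage(long lastAppliedIndex) {
            this.lastAppliedIndex = lastAppliedIndex;
        }

        void markRebalanceInProgress() {
            lastAppliedIndex = REBALANCE_IN_PROGRESS;
        }

        /** Assumption: a cleaned storage ends up with index 0. */
        void clear() {
            hasData = false;
            lastAppliedIndex = 0L;
        }
    }

    static boolean needsCleanup(List<FakeStorage> storages) {
        return storages.stream()
                .anyMatch(s -> s.lastAppliedIndex == REBALANCE_IN_PROGRESS);
    }

    /**
     * Marks all storages first, then clears them one by one.
     * crashAfter simulates a power loss after that many storages were cleared.
     */
    static void cleanup(List<FakeStorage> storages, int crashAfter) {
        // Step 1: make sure every storage carries the marker.
        storages.forEach(FakeStorage::markRebalanceInProgress);

        // Step 2: clear storages; a crash here leaves the marker on the rest.
        int cleared = 0;
        for (FakeStorage storage : storages) {
            if (cleared == crashAfter) {
                return; // simulated power loss
            }
            storage.clear();
            cleared++;
        }
    }
}
```

With this ordering, a crash after the MV storage is cleaned still leaves the
marker on the tx state storage, so the next recovery can detect the incomplete
cleanup and repeat it.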
Please note that non-colocation mode is currently being removed, so the
corresponding code may have disappeared from TableManager by the time this
ticket is picked up.
> Fully support REBALANCE_IN_PROGRESS case with colocation
> --------------------------------------------------------
>
> Key: IGNITE-27097
> URL: https://issues.apache.org/jira/browse/IGNITE-27097
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3