[
https://issues.apache.org/jira/browse/IGNITE-19692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Chugunov updated IGNITE-19692:
-------------------------------------
Epic Link: IGNITE-18733
> Design Resilient Distributed Operations mechanism
> -------------------------------------------------
>
> Key: IGNITE-19692
> URL: https://issues.apache.org/jira/browse/IGNITE-19692
> Project: Ignite
> Issue Type: Task
> Reporter: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We need a mechanism that would allow us to do the following:
> # Execute an operation on all (or some of) partitions of a table
> # The whole operation is split into sub-operations (each of which operates on
> a single partition)
> # Each sub-operation must be resilient: that is, if the node that hosts it
> restarts or the partition moves to another node, the operation should proceed
> # When a sub-operation ends, it notifies the operation tracker/coordinator
> # When all sub-operations end, the tracker might take some action (like
> starting a subsequent operation)
> # The tracker is also resilient
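To make the coordination steps above concrete, here is a minimal in-memory sketch of the tracker side only. The names OperationTracker, partitionCompleted and onAllCompleted are hypothetical and do not come from the Ignite code base. The sketch does not model resilience: it assumes the completion state would be persisted elsewhere, and it only shows that duplicate notifications (for example, after a sub-operation is retried on another node) should be harmless.

```java
import java.util.BitSet;

/** Hypothetical in-memory tracker; not an Ignite API. Persistence and failover are out of scope. */
final class OperationTracker {
    private final int partitionCount;
    private final BitSet completed;
    private final Runnable onAllCompleted;
    private boolean finished;

    OperationTracker(int partitionCount, Runnable onAllCompleted) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
        this.completed = new BitSet(partitionCount);
        this.onAllCompleted = onAllCompleted;
    }

    /**
     * Called by a sub-operation when its partition is done.
     * Idempotent: repeated notifications for the same partition are ignored.
     *
     * @return true only for the call that completes the whole operation.
     */
    synchronized boolean partitionCompleted(int partitionId) {
        if (partitionId < 0 || partitionId >= partitionCount) {
            throw new IllegalArgumentException("unknown partition: " + partitionId);
        }
        completed.set(partitionId);
        if (!finished && completed.cardinality() == partitionCount) {
            finished = true;
            // Follow-up action, e.g. starting a subsequent operation.
            onAllCompleted.run();
            return true;
        }
        return false;
    }

    synchronized boolean isFinished() {
        return finished;
    }
}

final class OperationTrackerDemo {
    public static void main(String[] args) {
        OperationTracker tracker =
                new OperationTracker(3, () -> System.out.println("all partitions done"));
        tracker.partitionCompleted(0);
        tracker.partitionCompleted(0); // duplicate notification, ignored
        tracker.partitionCompleted(2);
        tracker.partitionCompleted(1); // last partition, triggers the follow-up action
    }
}
```

Making the notification idempotent is a design choice of this sketch: it lets a restarted sub-operation report again without the tracker having to distinguish retries from first reports.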
> We need such a mechanism in a few places in the system:
> # Transaction cleanup?
> # Index build
> # Table data validation as part of a schema change that requires
> validation (like a narrowing type change)
> Probably, more applications of the mechanism will emerge.
>
> On the possible implementation: the tracker could be collocated with the table's
> primary replica (that would guarantee that at most one tracker exists at all
> times). We could store the data needed to track the operation in the
> Meta-Storage under a prefix corresponding to the table, like
> 'ops.<tableId>.<opType>.<opKey>'. We could store the completion status for
> each of the partitions there along with some operation-wide status.
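As an illustration of the key layout proposed above, the following sketch builds keys under 'ops.<tableId>.<opType>.<opKey>' and checks per-partition completion with a prefix scan. It is a sketch under stated assumptions, not the actual design: a TreeMap stands in for the Meta-Storage (no real Meta-Storage API is used), tableId is assumed to be an int, and the '.status' and '.part.<partitionId>' suffixes as well as the "DONE" value are made up for the example.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical key helpers; only the 'ops.<tableId>.<opType>.<opKey>' prefix comes from the issue text. */
final class OpKeys {
    static String opPrefix(int tableId, String opType, String opKey) {
        return "ops." + tableId + "." + opType + "." + opKey;
    }

    /** Operation-wide status key (the '.status' suffix is an assumption). */
    static String statusKey(int tableId, String opType, String opKey) {
        return opPrefix(tableId, opType, opKey) + ".status";
    }

    /** Per-partition completion key (the '.part.<id>' suffix is an assumption). */
    static String partitionKey(int tableId, String opType, String opKey, int partitionId) {
        return opPrefix(tableId, opType, opKey) + ".part." + partitionId;
    }
}

/** In-memory stand-in for the Meta-Storage: a sorted key-value map with prefix scans. */
final class OpStateStore {
    private final SortedMap<String, String> kv = new TreeMap<>();

    void put(String key, String value) {
        kv.put(key, value);
    }

    /** Returns all entries whose key starts with the given prefix. */
    SortedMap<String, String> scan(String prefix) {
        return kv.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    /** True when every one of the partitionCount partitions has a "DONE" entry. */
    boolean allPartitionsDone(int tableId, String opType, String opKey, int partitionCount) {
        // The trailing '.' keeps opKey "idx1" from matching keys of opKey "idx10".
        String prefix = OpKeys.opPrefix(tableId, opType, opKey) + ".part.";
        long done = scan(prefix).values().stream().filter("DONE"::equals).count();
        return done == partitionCount;
    }
}

final class OpKeysDemo {
    public static void main(String[] args) {
        OpStateStore store = new OpStateStore();
        store.put(OpKeys.statusKey(42, "indexBuild", "idx1"), "IN_PROGRESS");
        store.put(OpKeys.partitionKey(42, "indexBuild", "idx1", 0), "DONE");
        store.put(OpKeys.partitionKey(42, "indexBuild", "idx1", 1), "DONE");
        System.out.println(store.allPartitionsDone(42, "indexBuild", "idx1", 2));
    }
}
```

Keeping all state for one operation under a single prefix means the tracker can rebuild its view after a restart with one range scan, which fits the requirement that the tracker itself be resilient.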