[jira] [Updated] (IGNITE-19692) Design Resilient Distributed Operations mechanism

Sergey Chugunov (Jira) Fri, 09 Jun 2023 01:22:06 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-19692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sergey Chugunov updated IGNITE-19692:
-------------------------------------
    Epic Link: IGNITE-18733

> Design Resilient Distributed Operations mechanism
> -------------------------------------------------
>
>                 Key: IGNITE-19692
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19692
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> We need a mechanism that would allow us to do the following:
>  # Execute an operation on all (or some) of the partitions of a table
>  # The whole operation is split into sub-operations (each of which operates 
> on a single partition)
>  # Each sub-operation must be resilient: that is, if the node that hosts it 
> restarts or the partition moves to another node, the operation should proceed
>  # When a sub-operation ends, it notifies the operation tracker/coordinator
>  # When all sub-operations end, the tracker might take some action (like 
> starting a subsequent operation)
>  # The tracker is also resilient
> We need such a mechanism in a few places in the system:
>  # Transaction cleanup?
>  # Index build
>  # Table data validation as a part of a schema change that requires a 
> validation (like a narrowing type change)
> Probably, more applications of the mechanism will emerge.
>  
> On the possible implementation: the tracker could be collocated with the table's 
> primary replica (that would guarantee that at most one tracker exists at all 
> times). We could store the data needed to track the operation in the 
> Meta-Storage under a prefix corresponding to the table, like 
> 'ops.<tableId>.<opType>.<opKey>'. We could store the completion status for 
> each of the partitions there along with some operation-wide status.
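
The key layout and completion tracking proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Meta-Storage is replaced by an in-memory sorted map, and the class name, the '.part.<partitionId>' and '.status' key suffixes, and the status values are all hypothetical. Only the 'ops.<tableId>.<opType>.<opKey>' prefix comes from the description.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Illustrative sketch only. Nothing here is part of the Ignite API;
 * the in-memory map merely stands in for the Meta-Storage.
 */
final class OperationTrackerSketch {
    /** Hypothetical stand-in for the Meta-Storage: a sorted key-value map. */
    private final Map<String, String> metaStorage = new TreeMap<>();

    /** Builds the prefix 'ops.<tableId>.<opType>.<opKey>' from the proposal. */
    static String operationPrefix(String tableId, String opType, String opKey) {
        return "ops." + tableId + "." + opType + "." + opKey;
    }

    /** Assumed layout: per-partition status is stored under '<prefix>.part.<partitionId>'. */
    static String partitionKey(String prefix, int partitionId) {
        return prefix + ".part." + partitionId;
    }

    /** Registers an operation and marks every partition as pending. */
    void start(String prefix, int partitionCount) {
        metaStorage.put(prefix + ".status", "IN_PROGRESS");
        for (int p = 0; p < partitionCount; p++) {
            metaStorage.put(partitionKey(prefix, p), "PENDING");
        }
    }

    /**
     * Records that one sub-operation has ended. Returns true only on the call
     * that completes the whole operation, so a notification repeated after a
     * restart does not trigger the follow-up action twice.
     */
    boolean completePartition(String prefix, int partitionId) {
        String key = partitionKey(prefix, partitionId);
        if (!metaStorage.containsKey(key)) {
            throw new IllegalArgumentException("Unknown partition key: " + key);
        }
        metaStorage.put(key, "DONE");

        // Check whether every partition entry of this operation is done.
        String partPrefix = prefix + ".part.";
        boolean allDone = metaStorage.entrySet().stream()
                .filter(e -> e.getKey().startsWith(partPrefix))
                .allMatch(e -> "DONE".equals(e.getValue()));

        boolean firstCompletion =
                allDone && !"COMPLETED".equals(metaStorage.get(prefix + ".status"));
        if (firstCompletion) {
            metaStorage.put(prefix + ".status", "COMPLETED");
        }
        return firstCompletion;
    }

    /** Returns the operation-wide status, or null if the operation is unknown. */
    String status(String prefix) {
        return metaStorage.get(prefix + ".status");
    }
}
```

With two partitions, the first completePartition call leaves the operation-wide status at IN_PROGRESS, and the second call returns true and flips it to COMPLETED. In a real implementation the updates would presumably have to be conditional Meta-Storage writes, so that a restarted tracker sees a consistent state. That part is not modelled here.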



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
