[ 
https://issues.apache.org/jira/browse/IGNITE-20187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20187:
-------------------------------------
    Description: 
h3. Motivation

Prior to the implementation of meta storage compaction and the related node 
restart updates, a node restored its volatile state in terms of assignments 
through ms.watches starting from APPLIED_REVISION + 1, meaning that after a 
restart the node was notified about the missing state through {*}the events{*}. 
This is no longer true: the new logic assumes that the node registers its 
ms.watch starting from APPLIED_REVISION + X + 1 and manually reads the local 
meta storage state up to APPLIED_REVISION + X, along with the related 
processing. Implementing that manual read and processing is the essence of 
this ticket.
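For illustration, a minimal sketch of this restart flow follows. All type and 
method names here ({{MetaStorageFacade}}, {{readLocally}}, {{registerWatch}}, 
etc.) are hypothetical placeholders, not the actual Ignite 3 API:
{code:java}
import java.util.function.Consumer;

// Hypothetical placeholder types; the real Ignite 3 meta storage API differs.
interface MetaStorageFacade {
    long appliedRevision();   // last revision applied before restart
    long recoveryRevision();  // APPLIED_REVISION + X from the description
    Iterable<Entry> readLocally(long fromRevision, long toRevision);
    void registerWatch(long startRevision, Consumer<Entry> listener);
}

record Entry(byte[] key, byte[] value, long revision) {}

class RestartRecoverySketch {
    void recover(MetaStorageFacade metaStorage) {
        long recoveryRevision = metaStorage.recoveryRevision();

        // Manually read and process the range the watch will no longer replay:
        // (APPLIED_REVISION, APPLIED_REVISION + X].
        for (Entry entry : metaStorage.readLocally(metaStorage.appliedRevision() + 1, recoveryRevision)) {
            process(entry);
        }

        // Register the watch strictly after the manually processed range,
        // i.e. starting from APPLIED_REVISION + X + 1.
        metaStorage.registerWatch(recoveryRevision + 1, this::process);
    }

    void process(Entry entry) {
        // Placeholder: schedule rebalance or other assignment processing here.
    }
}
{code}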
h3. Definition of Done

As part of the node restart process, TableManager (or a similar component) 
should manually read the local assignments pending keys (reading the 
assignments stable keys will be covered in a separate ticket) and schedule the 
corresponding rebalance.
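A rough sketch of what this could look like, reusing the {{Entry}} placeholder 
from the sketch above; the {{assignments.pending.}} key layout and the 
{{readLocallyByPrefix}} helper are assumptions, not the real API:
{code:java}
// Hypothetical sketch: on restart, scan the local assignments pending keys
// and reschedule a rebalance for every partition with a pending change.
class TableManagerRestartSketch {
    interface LocalMetaStorageReader {
        // Local, non-distributed read of all entries under a key prefix
        // as of the given revision.
        Iterable<Entry> readLocallyByPrefix(String prefix, long revision);
    }

    private static final String PENDING_PREFIX = "assignments.pending."; // assumed layout

    void onRestart(LocalMetaStorageReader reader, long recoveryRevision) {
        for (Entry entry : reader.readLocallyByPrefix(PENDING_PREFIX, recoveryRevision)) {
            if (entry.value() != null) {
                // A non-empty pending key means a rebalance was in flight when
                // the node stopped, so it must be rescheduled on restart.
                scheduleRebalance(entry);
            }
        }
    }

    void scheduleRebalance(Entry entry) {
        // Placeholder: start the catch-up rebalance from the Implementation
        // Notes, keeping entry.revision() for the staleness check in step 3.
    }
}
{code}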
h3. Implementation Notes

It's possible that the assignments.pending keys will be stale at the moment of 
processing, so in order to overcome this issue the following steps, common for 
the current rebalance flow, are proposed (a sketch follows the list):
 # Start all newly required nodes from {{partition.assignments.pending}} / 
{{partition.assignments.stable}}.
 # After the starts succeed, check whether the current node is the leader of 
the raft group (the leader response must carry the current term); if it is,
 # Read the distributed {{partition.assignments.pending}} key and, if the 
retrieved revision is less than or equal to the one retrieved within the 
initial local read, run {{RaftGroupService#changePeersAsync(leaderTerm, 
peers)}}. {{RaftGroupService#changePeersAsync}} calls from old terms must be 
skipped.
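A condensed sketch of the three steps, where only 
{{RaftGroupService#changePeersAsync(leaderTerm, peers)}} comes from this 
ticket and every other type and helper is an illustrative placeholder:
{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch of steps 1-3 above; helper names are assumptions.
class CatchUpRebalanceSketch {
    interface RaftGroupService {
        CompletableFuture<Void> changePeersAsync(long leaderTerm, List<String> peers);
    }

    record LeaderWithTerm(String leader, long term) {}
    record PendingAssignments(List<String> peers, long revision) {}

    void catchUpRebalance(RaftGroupService raftGroup, long localReadRevision) {
        // Step 1: start all nodes required by partition.assignments.pending
        // and partition.assignments.stable.
        startPartitionNodes();

        // Step 2: only the raft group leader proceeds; the leader response
        // must carry the current term so that stale leaders can be fenced off.
        LeaderWithTerm leaderWithTerm = refreshAndGetLeaderWithTerm();
        if (!isLocalNode(leaderWithTerm.leader())) {
            return;
        }

        // Step 3: re-read the distributed partition.assignments.pending key.
        // If its revision is still <= the revision from the initial local
        // read, the pending state has not moved on, so the peer change is
        // still valid; passing the term lets the raft group skip
        // changePeersAsync calls from old terms.
        PendingAssignments pending = readDistributedPending();
        if (pending.revision() <= localReadRevision) {
            raftGroup.changePeersAsync(leaderWithTerm.term(), pending.peers());
        }
    }

    // Placeholder helpers; real implementations would live in TableManager et al.
    void startPartitionNodes() {}
    LeaderWithTerm refreshAndGetLeaderWithTerm() { return new LeaderWithTerm("local", 1L); }
    boolean isLocalNode(String leader) { return true; }
    PendingAssignments readDistributedPending() { return new PendingAssignments(List.of(), 0L); }
}
{code}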

It seems that 
https://github.com/apache/ignite-3/blob/main/modules/table/tech-notes/rebalance.md
 should also be updated a bit.

> Catch-up rebalance on node restart: assignments keys
> ----------------------------------------------------
>
>                 Key: IGNITE-20187
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20187
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>



