[ 
https://issues.apache.org/jira/browse/RATIS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Shao Hong updated RATIS-1582:
--------------------------------
    Summary: Add notify install snapshot finished event to inform the finish 
stage  (was: Add notify install snapshot finished event to trigger cleanup of 
snapshot)

> Add notify install snapshot finished event to inform the finish stage
> ---------------------------------------------------------------------
>
>                 Key: RATIS-1582
>                 URL: https://issues.apache.org/jira/browse/RATIS-1582
>             Project: Ratis
>          Issue Type: Improvement
>          Components: snapshot
>            Reporter: Xu Shao Hong
>            Assignee: Xu Shao Hong
>            Priority: Major
>
> Currently, the notify install snapshot mechanism does not signal when the 
> whole process is done.
> On the Ozone side, the state machine's notifyInstallSnapshotFromLeader 
> handles a single request and process. This was fine until we found that 
> snapshot installation can get stuck because the whole RocksDB is replaced 
> each time: the leader may purge the Raft log while the snapshot is being 
> transferred, which triggers another snapshot installation as soon as the 
> previous install request is done. To solve this, we came up with the 
> incremental snapshot idea: the next install request transfers only the 
> incremental part of RocksDB, which requires preserving the checkpoints.  
> The incremental snapshot needs to compare checkpoints, so the checkpoints 
> cannot be deleted after the first request to install a snapshot.
> The right time to clean up these checkpoints is hard to determine. It is 
> difficult for the follower to tell whether the latest installed snapshot is 
> the last one, and so whether it can start applying logs immediately. The 
> cleanup time depends on the leader's state: only the leader knows whether it 
> is time to notify another snapshot installation or just send append entries. 
> Cleanup should be triggered only once the leader considers that the follower 
> has caught up (error cases are not covered here).
> Thus, we should have an event that triggers cleanup of the checkpoints for 
> Ozone or, more generally, signals that snapshot installation is complete, 
> which means no more install snapshot requests will be sent and the follower 
> has caught up.
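
The retention behaviour described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: CheckpointRetention,
onSnapshotInstalled and onInstallSnapshotFinished are hypothetical names and
are not part of the Ratis or Ozone API. Checkpoints are represented by their
snapshot index only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical follower-side bookkeeping; not part of the Ratis or Ozone API. */
final class CheckpointRetention {
  // Snapshot indices whose checkpoints are still kept so that the next
  // incremental install request can compare against them.
  private final List<Long> retained = new ArrayList<>();

  /** Assumed hook: called after each install snapshot request completes. */
  synchronized void onSnapshotInstalled(long snapshotIndex) {
    // Keep the checkpoint; the follower cannot know whether this is the last one.
    retained.add(snapshotIndex);
  }

  /**
   * Assumed hook for the proposed "install snapshot finished" event, meaning
   * the leader will send no more install snapshot requests.
   * Returns the checkpoints that may now be deleted and forgets them.
   */
  synchronized List<Long> onInstallSnapshotFinished() {
    List<Long> toDelete = new ArrayList<>(retained);
    retained.clear();
    return toDelete;
  }

  /** Returns a copy of the snapshot indices that are currently retained. */
  synchronized List<Long> retained() {
    return new ArrayList<>(retained);
  }
}
```

In this sketch nothing is deleted while install requests are still arriving.
Cleanup happens only in the single callback driven by the leader's decision
that the follower has caught up.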



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
