Xushaohong opened a new pull request, #647: URL: https://github.com/apache/ratis/pull/647
## What changes were proposed in this pull request?

Currently, the install-snapshot notification path never signals when the whole process is complete.

On the Ozone side, the statemachine's `notifyInstallSnapshotFromLeader` handles a single request as a single process. This was fine until we found that snapshot installation can get stuck, because the whole RocksDB is replaced each time: the leader may purge its raft log while the snapshot is being transferred, which triggers another snapshot installation as soon as the previous install request is done.

To solve this, we came up with the incremental snapshot idea: the next install request transfers only the incremental part of RocksDB, which requires preserving the checkpoints. An incremental snapshot has to compare checkpoints, so they cannot be deleted after the first request to install a snapshot.

The right time to clean up these checkpoints is hard to determine. The follower cannot easily tell whether the latest installed snapshot is the last one, after which it could apply the logs immediately. The cleanup time depends on the leader's state: only the leader knows whether it is time to notify the snapshot again or just send append entries. Cleanup can be triggered only once the leader considers that the follower has caught up (error cases are not covered here).

Thus, we need an event that triggers the cleanup of the checkpoints for Ozone or, more generally, signals that the install snapshot process is complete, meaning no more install snapshot requests will be sent and the follower has caught up.

We trigger this event for both the leader and the follower.

As for the leader:
1. when the leader receives the snapshot result `SNAPSHOT_INSTALLED`
2. when the leader receives the snapshot result `SNAPSHOT_UNAVAILABLE`

As for the follower:
1. the first time the follower tries appending new entries after successfully installing a snapshot
2. when the follower knows the statemachine's snapshot is unavailable

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-1582

## How was this patch tested?

Manual test for ozone.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
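The follower-side trigger conditions listed in the description can be sketched as below. This is only an illustration under stated assumptions, not the code in this pull request: `CompletionListener`, `FollowerTracker`, `CheckpointStore`, and the `NO_INDEX` placeholder are hypothetical names invented for the sketch, and only the two result names come from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are part of the Ratis API.
final class SnapshotCompletionSketch {

  // Placeholder index used when no snapshot was installed (an assumption).
  static final long NO_INDEX = -1L;

  // The two snapshot results named in the description.
  enum Result { SNAPSHOT_INSTALLED, SNAPSHOT_UNAVAILABLE }

  // Hypothetical "install snapshot is complete" event.
  interface CompletionListener {
    void onInstallComplete(Result result, long snapshotIndex);
  }

  // Stands in for the state machine: keeps checkpoints until the event fires.
  static final class CheckpointStore implements CompletionListener {
    private final List<String> checkpoints = new ArrayList<>();
    private Result lastResult;
    private long lastIndex = NO_INDEX;

    void retain(String name) { checkpoints.add(name); }
    int retainedCount() { return checkpoints.size(); }
    Result lastResult() { return lastResult; }
    long lastIndex() { return lastIndex; }

    @Override
    public void onInstallComplete(Result result, long snapshotIndex) {
      lastResult = result;
      lastIndex = snapshotIndex;
      checkpoints.clear(); // no further incremental install will need them
    }
  }

  // Follower-side bookkeeping that decides when to fire the event.
  static final class FollowerTracker {
    private final CompletionListener listener;
    private long pendingIndex = NO_INDEX; // installed but not yet reported

    FollowerTracker(CompletionListener listener) { this.listener = listener; }

    // A snapshot was installed; more install requests may still follow.
    void onSnapshotInstalled(long snapshotIndex) { pendingIndex = snapshotIndex; }

    // Follower case 2: the statemachine's snapshot is unavailable.
    void onSnapshotUnavailable() {
      pendingIndex = NO_INDEX;
      listener.onInstallComplete(Result.SNAPSHOT_UNAVAILABLE, NO_INDEX);
    }

    // Follower case 1: first append after a successful install.
    void onAppendEntries() {
      if (pendingIndex != NO_INDEX) {
        long index = pendingIndex;
        pendingIndex = NO_INDEX; // fire once per completed install sequence
        listener.onInstallComplete(Result.SNAPSHOT_INSTALLED, index);
      }
    }
  }

  public static void main(String[] args) {
    CheckpointStore store = new CheckpointStore();
    FollowerTracker follower = new FollowerTracker(store);
    store.retain("checkpoint-1");
    follower.onSnapshotInstalled(100L);
    store.retain("checkpoint-2");
    follower.onSnapshotInstalled(200L); // incremental install, checkpoints kept
    follower.onAppendEntries();         // leader moved on, so cleanup runs
    System.out.println("retained checkpoints: " + store.retainedCount());
  }
}
```

In this sketch the checkpoints survive any number of consecutive installs and are dropped only when an append arrives, which is how the follower learns that the leader has stopped sending install snapshot requests.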
