ivanandika98 opened a new pull request, #4371:
URL: https://github.com/apache/ozone/pull/4371

   ## What changes were proposed in this pull request?
   
   Currently, Ozone Manager (OM) enables 
`raft.server.log.purge.upto.snapshot.index` by default.
   However, in an OM cluster with a large metadata store, the OM leader may 
purge its Ratis logs before a slow follower has replicated them to its own 
log. The follower then has to download the whole metadata store from the OM 
leader, which can be problematic if the leader's metadata store is very 
large.
   
   We should add two configurations in OM to control the Ratis purge 
parameters:
   
   - `raft.server.log.purge.upto.snapshot.index`
     - Disabling this would guarantee that the OM leader will not purge its 
Ratis log unless all the logs have been replicated to all the followers 
(through commitIndex).
     - This effectively means that a slow follower should never need to 
download the full metadata from the leader, so no snapshot is downloaded by 
the follower. Note, however, that for small OM metadata it can be faster for 
a follower to download the leader's metadata snapshot than to replicate and 
apply the outstanding logs normally.
     - For a very slow or downed follower, the OM leader cannot purge the log 
until the follower catches up. This might increase disk space usage on the 
OM leader.
     - Default would be `true` to preserve the current OM snapshot behavior
   - `raft.server.log.purge.preservation.log.num`
     - [RATIS-1626](https://issues.apache.org/jira/browse/RATIS-1626) 
introduces logic to keep the latest n log entries from being purged.
     - Setting n > 0 while still enabling 
`raft.server.log.purge.upto.snapshot.index` should strike a balance between 
the cost of preserving and transferring logs and the cost of transferring a 
snapshot.
     - Default would be 0 to preserve the current OM snapshot behavior
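
   To make the interaction between the two parameters concrete, here is a 
minimal Java sketch of how a purge index could be derived from them. It is a 
simplified model based only on the description above, not the actual Ratis 
implementation; the class `PurgeIndexSketch`, the method `purgeIndex`, and 
all of its parameter names are hypothetical.

```java
/**
 * Simplified, hypothetical model of the purge decision described above.
 * This is NOT Ratis code; names and structure are assumptions for illustration.
 */
final class PurgeIndexSketch {

  /**
   * Returns the highest log index that may be purged under this model.
   *
   * @param snapshotIndex          index covered by the latest snapshot
   * @param lastLogIndex           index of the newest entry in the leader's log
   * @param peerCommitIndices      commit index reported by each peer
   * @param purgeUptoSnapshotIndex models raft.server.log.purge.upto.snapshot.index
   * @param preservationLogNum     models raft.server.log.purge.preservation.log.num
   */
  static long purgeIndex(long snapshotIndex, long lastLogIndex,
      long[] peerCommitIndices, boolean purgeUptoSnapshotIndex,
      long preservationLogNum) {
    long suggested = snapshotIndex;

    if (!purgeUptoSnapshotIndex) {
      // Flag disabled: never purge past what the slowest peer has committed.
      for (long commitIndex : peerCommitIndices) {
        suggested = Math.min(suggested, commitIndex);
      }
    }

    if (preservationLogNum > 0) {
      // Always keep the latest n entries, even if the snapshot covers them.
      suggested = Math.min(suggested, lastLogIndex - preservationLogNum);
    }
    return suggested;
  }

  public static void main(String[] args) {
    // Made-up numbers: one follower is far behind at commit index 400.
    long[] peers = {1200, 1150, 400};
    System.out.println(purgeIndex(1000, 1200, peers, true, 0));
    System.out.println(purgeIndex(1000, 1200, peers, false, 0));
    System.out.println(purgeIndex(1000, 1200, peers, true, 500));
    System.out.println(purgeIndex(1000, 1200, peers, false, 500));
  }
}
```

   With these made-up numbers, the model purges up to index 1000 with the 
defaults (`true`, 0), only up to 400 once the flag is disabled, and up to 700 
when the flag stays enabled with n = 500, so the slow follower at index 400 
would still need a snapshot in the first and third cases.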
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-8131
   
   ## How was this patch tested?
   
   Should already be covered by the Ratis tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

