Alex Parvulescu created OAK-3603:
------------------------------------

             Summary: Evaluate skipping cleanup of a subset of tar files
                 Key: OAK-3603
                 URL: https://issues.apache.org/jira/browse/OAK-3603
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: segmentmk
            Reporter: Alex Parvulescu
            Assignee: Alex Parvulescu


Given that tar readers are immutable (we only create a new generation of one
once it reaches a certain threshold of garbage), we can consider coming up
with a heuristic for skipping cleanup entirely on consecutive cleanup calls
that are based on the same referenced id set (provided we can make this set
more stable, see OAK-2849).

Example: for a specific input set, a cleanup call on a tar reader might decide
that there's not enough garbage (this involves some IO, as it reads through
all existing entries). If the following cleanup cycle has the exact same
input, it doesn't make sense to recheck the tar file: we already know cleanup
can be skipped. Moreover, we can skip the older tar files too, as their input
would not change either. The gains increase with the number of tar files.
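
A minimal sketch of the heuristic described above, not the actual Oak
implementation. The class CleanupSkipCache and its methods are hypothetical
names invented for illustration. It assumes the referenced id set can be
compared for equality between cleanup cycles, which depends on OAK-2849.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper: remembers, per tar file, the referenced id set for
// which the last cleanup call found too little garbage to be worth a rewrite.
class CleanupSkipCache {

    private final Map<String, Set<UUID>> lastSkippedInput = new HashMap<>();

    // True if this tar file was already checked with exactly this input
    // and cleanup was skipped, so the IO-heavy entry scan can be avoided.
    boolean canSkip(String tarFileName, Set<UUID> referencedIds) {
        return referencedIds.equals(lastSkippedInput.get(tarFileName));
    }

    // Called after a full check concluded there is not enough garbage.
    void recordSkipped(String tarFileName, Set<UUID> referencedIds) {
        // Defensive copy, so later changes to the caller's set don't leak in.
        lastSkippedInput.put(tarFileName, new HashSet<>(referencedIds));
    }

    // Called when the tar file is rewritten into a new generation or removed.
    void invalidate(String tarFileName) {
        lastSkippedInput.remove(tarFileName);
    }
}
```

A cleanup loop would consult canSkip before reading a tar file's entries and
call recordSkipped whenever the full check finds too little garbage. Because
tar readers are immutable, a cached decision stays valid until the input set
changes or the file is replaced by a new generation.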






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
