[ 
https://issues.apache.org/jira/browse/OAK-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996622#comment-14996622
 ] 

Alex Parvulescu commented on OAK-3603:
--------------------------------------

This depends on a stable set of referenced ids, so we'd need to fix OAK-3602 
first.

> Evaluate skipping cleanup of a subset of tar files
> --------------------------------------------------
>
>                 Key: OAK-3603
>                 URL: https://issues.apache.org/jira/browse/OAK-3603
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>
> Given that tar readers are immutable (we only create new generations of 
> them once they reach a certain threshold of garbage), we can consider a 
> heuristic for skipping cleanup entirely on consecutive cleanup calls that 
> are based on the same referenced id set (provided we can make this set 
> more stable, see OAK-2849).
> Example: for a specific input set, a cleanup call on a tar reader might 
> decide that there is not enough garbage (at the cost of some IO for reading 
> through all existing entries). If the following cleanup cycle has the exact 
> same input, it doesn't make sense to recheck the tar file: we already know 
> cleanup can be skipped. Moreover, we can skip the older tar files too, as 
> their input would also not change. The gains increase with the number of 
> tar files.
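
The heuristic in the description could be sketched roughly as below. This is
only an illustration under stated assumptions, not Oak code: the class
CleanupSkipCache and its methods canSkip and record are hypothetical names,
referenced ids are modelled as java.util.UUID values, and one instance is
assumed per tar reader.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical helper, one per tar reader; not part of Oak's API. */
final class CleanupSkipCache {

    // Referenced-id set of the last cleanup call that scanned the tar file
    // and found too little garbage; null if there is no such call.
    private Set<UUID> lastNoOpInput;

    /** True if a previous call with exactly this input decided not to clean up. */
    boolean canSkip(Set<UUID> referencedIds) {
        return lastNoOpInput != null && lastNoOpInput.equals(referencedIds);
    }

    /** Records the outcome of a cleanup call that actually scanned the tar file. */
    void record(Set<UUID> referencedIds, boolean cleanedUp) {
        // Keep a defensive copy so later changes by the caller do not alter the cache.
        lastNoOpInput = cleanedUp ? null : new HashSet<>(referencedIds);
    }
}
```

A caller would check canSkip before reading any tar entries and call record
only after a full scan. Comparing whole sets is the simplest choice here; it
only pays off if the referenced id set is stable between cleanup cycles, which
is the dependency noted in the comment.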



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)