[ https://issues.apache.org/jira/browse/OAK-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Dürig updated OAK-3603:
-------------------------------
    Labels: gc  (was: )

> Evaluate skipping cleanup of a subset of tar files
> --------------------------------------------------
>
>                 Key: OAK-3603
>                 URL: https://issues.apache.org/jira/browse/OAK-3603
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>              Labels: cleanup, gc
>
> Given that tar readers are immutable (we only create new generations
> of them once they reach a certain threshold of garbage), we can consider
> a heuristic for skipping cleanup entirely on subsequent cleanup calls
> that are based on the same referenced id set (provided we can make this
> set more stable, i.e. OAK-2849).
> Example: for a specific input set, a cleanup call on a tar reader might
> decide that there is not enough garbage (a decision that involves some IO,
> since it reads through all existing entries). If the following cleanup
> cycle has the exact same input, it doesn't make sense to recheck the tar
> file: we already know cleanup can be skipped. Moreover, we can skip the
> older tar files too, as their input would not change either. The gains
> increase with the number of tar files.
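
The heuristic described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Oak code: the class CleanupSkipHeuristic and its methods canSkip, recordNoOp and forget are hypothetical names, tar files are identified by plain file name, and the referenced id set is modelled as a Set<UUID>. The sketch remembers, per tar file, the input set for which the last cleanup pass found too little garbage, and reports that the file can be skipped when the same set shows up again.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/**
 * Hypothetical helper (not part of Oak): tracks, per tar file, the
 * referenced-id set for which the previous cleanup pass was a no-op.
 */
class CleanupSkipHeuristic {

    // tar file name -> referenced ids seen when cleanup last found too little garbage
    private final Map<String, Set<UUID>> lastNoOpInput = new HashMap<>();

    /** True if the last pass over this file was a no-op for exactly this input. */
    boolean canSkip(String tarFileName, Set<UUID> referencedIds) {
        return referencedIds.equals(lastNoOpInput.get(tarFileName));
    }

    /** Record that cleanup of this file with this input found too little garbage. */
    void recordNoOp(String tarFileName, Set<UUID> referencedIds) {
        // Defensive copy, so later changes to the caller's set do not leak in.
        lastNoOpInput.put(tarFileName, new HashSet<>(referencedIds));
    }

    /** Drop the record, e.g. once a new generation of the file has been written. */
    void forget(String tarFileName) {
        lastNoOpInput.remove(tarFileName);
    }

    public static void main(String[] args) {
        CleanupSkipHeuristic heuristic = new CleanupSkipHeuristic();
        Set<UUID> ids = new HashSet<>();
        ids.add(new UUID(0L, 1L));

        // First cycle: nothing is known yet, so the file must be checked.
        System.out.println(heuristic.canSkip("data00001a.tar", ids));
        heuristic.recordNoOp("data00001a.tar", ids);

        // Second cycle with the same input: the check can be skipped.
        System.out.println(heuristic.canSkip("data00001a.tar", ids));
    }
}
```

A per-file map is the most conservative reading of the idea. The description also suggests that once one file is skipped for an unchanged input, all older files can be skipped as well, which this sketch does not attempt to model.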