[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874915#comment-16874915 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
trunk: (1.7.0) [r1792051|http://svn.apache.org/r1792051] [r1791681|http://svn.apache.org/r1791681] [r1790796|http://svn.apache.org/r1790796]

> VersionGarbageCollector should be able to run incrementally
> -----------------------------------------------------------
>
> Key: OAK-4780
> URL: https://issues.apache.org/jira/browse/OAK-4780
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: documentmk
> Reporter: Julian Reschke
> Priority: Major
> Fix For: 1.7.0, 1.8.0
>
> Attachments: OAK-4780-core-1.patch, OAK-4780-core-2.diff,
> OAK-4780-core-4.diff, OAK-4780-core-5.diff, OAK-4780-core.diff,
> OAK-4780-rdb.diff, leafnodes-v2.diff, leafnodes-v3.diff, leafnodes.diff
>
>
> Right now, the documentmk's version garbage collection runs in several phases.
> It first collects the paths of candidate nodes, and only once this has been
> successfully finished, starts actually deleting nodes.
> This can be a problem when the regularly scheduled garbage collection is
> interrupted during the path collection phase, maybe due to other maintenance
> tasks. On the next run, the number of paths to be collected will be even
> bigger, thus making it even more likely to fail.
> We should think about a change in the logic that would allow the GC to run in
> chunks; maybe by partitioning the path space by top level directory.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976627#comment-15976627 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
Fixed a potential division by zero: http://svn.apache.org/r1792051
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976459#comment-15976459 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
Created OAK-6109 for the revisions command in oak-run proposed by [~stefan.eissing].
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970990#comment-15970990 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
(moved the RDB extensions to OAK-6083 so we can close this one)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970979#comment-15970979 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
trunk: [r1791681|http://svn.apache.org/r1791681] [r1790796|http://svn.apache.org/r1790796]
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962693#comment-15962693 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
[~reschke], I am leaving this issue open because I did not apply the RDB-specific patch. Please apply it if you think it is ready, then resolve this issue.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962685#comment-15962685 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
Looks good to me. I applied a slightly modified patch to trunk: http://svn.apache.org/r1790796

- Committed fixes for VersionGCDeletionTest separately
- Use the MongoDB count() method with a read preference, to avoid using a method that is only available with the 3.4 driver
- Add serialVersionUID to LimitExceededException
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946925#comment-15946925 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
[~stefan.eissing] - could you please bring your fork up-to-date with trunk, produce a patch, and attach it over here?
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932275#comment-15932275 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
bq. You mean the class itself or special VersionGarbageCollector tests? I assume the former.

The former. Tests for the new TimeInterval class also help to better understand its contract. E.g. are bounds inclusive?
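The question about bounds can be pinned down with a few boundary checks. The following is only a sketch: `TimeIntervalSketch` is a hypothetical stand-in, not the class from the patch, and the choice of inclusive bounds on both ends is an assumption made for illustration.

```java
/** Hypothetical stand-in for the TimeInterval class under discussion; not the Oak implementation. */
final class TimeIntervalSketch {
    final long fromMs;
    final long toMs;

    TimeIntervalSketch(long fromMs, long toMs) {
        if (fromMs > toMs) {
            throw new IllegalArgumentException("fromMs must not exceed toMs");
        }
        this.fromMs = fromMs;
        this.toMs = toMs;
    }

    /** Assumption for this sketch: both bounds are inclusive. */
    boolean contains(long timeMs) {
        return fromMs <= timeMs && timeMs <= toMs;
    }

    long durationMs() {
        return toMs - fromMs;
    }

    public static void main(String[] args) {
        TimeIntervalSketch interval = new TimeIntervalSketch(1000, 2000);
        // Boundary checks are what document the contract.
        System.out.println("lower bound included: " + interval.contains(1000));
        System.out.println("upper bound included: " + interval.contains(2000));
        System.out.println("just outside: " + interval.contains(2001));
    }
}
```

Tests of this kind make the contract explicit whichever way the real class decides it.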
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930041#comment-15930041 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
bq. At the end of all iterations, the VersionGarbageCollector returns a VersionGCStats that accumulates data from all iterations made. Only the stopwatches are currently not, as Stopwatch lacks methods to add time from other watches. And re-using the same instances in each iteration will show the incorrect data for each iteration.

The code sort of works:
{noformat}
package org.apache.jackrabbit.oak.plugins.document;

import java.util.concurrent.TimeUnit;

import com.google.common.base.Stopwatch;
import com.google.common.base.Ticker;

public class StopWatchTest {

    public static void main(String[] args) throws InterruptedException {
        final Stopwatch w1 = Stopwatch.createStarted();
        Thread.sleep(123);
        w1.stop();
        System.out.println(w1);

        final Stopwatch w2 = Stopwatch.createStarted();
        Thread.sleep(456);
        w2.stop();
        System.out.println(w2);

        Stopwatch sum = Stopwatch.createStarted(new Ticker() {
            private boolean calledOnce = false;

            @Override
            public long read() {
                if (calledOnce) {
                    return w1.elapsed(TimeUnit.NANOSECONDS) + w2.elapsed(TimeUnit.NANOSECONDS);
                } else {
                    calledOnce = true;
                    return 0;
                }
            }
        });
        sum.stop();
        System.out.println(sum);
    }
}
{noformat}
output:
{noformat}
123,2 ms
456,5 ms
579,7 ms
{noformat}
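As an alternative to the Ticker trick, the per-iteration timings could simply be summed as nanosecond counts. This is a sketch under assumptions: `AccumulatedTime` is a hypothetical helper, not part of Oak or Guava, and it uses only the JDK. With Guava, the value passed to `add()` could be `stopwatch.elapsed(TimeUnit.NANOSECONDS)`.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Oak or Guava): sums elapsed time across iterations. */
final class AccumulatedTime {
    private long totalNanos;

    /** Adds the elapsed nanoseconds measured for one iteration. */
    void add(long elapsedNanos) {
        if (elapsedNanos < 0) {
            throw new IllegalArgumentException("elapsed time must not be negative");
        }
        totalNanos += elapsedNanos;
    }

    /** Returns the accumulated time in the requested unit (truncating). */
    long elapsed(TimeUnit unit) {
        return unit.convert(totalNanos, TimeUnit.NANOSECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AccumulatedTime total = new AccumulatedTime();
        for (int i = 0; i < 3; i++) {
            long start = System.nanoTime();
            Thread.sleep(10); // stands in for one GC iteration
            total.add(System.nanoTime() - start);
        }
        System.out.println(total.elapsed(TimeUnit.MILLISECONDS) + " ms in total");
    }
}
```

Keeping a plain counter per phase avoids sharing Stopwatch instances between iterations, so per-iteration figures stay correct while the totals still add up.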
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930035#comment-15930035 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
bq. At the end of all iterations, the VersionGarbageCollector returns a VersionGCStats that accumulates data from all iterations made. Only the stopwatches are currently not, as Stopwatch lacks methods to add time from other watches. And re-using the same instances in each iteration will show the incorrect data for each iteration.

The code below works (for some value of "works"):
{noformat}
import java.util.concurrent.TimeUnit;

import com.google.common.base.Stopwatch;
import com.google.common.base.Ticker;

public class StopWatchTest {

    public static void main(String[] args) throws InterruptedException {
        final Stopwatch w1 = Stopwatch.createStarted();
        Thread.sleep(123);
        w1.stop();
        System.out.println(w1);

        final Stopwatch w2 = Stopwatch.createStarted();
        Thread.sleep(456);
        w2.stop();
        System.out.println(w2);

        // The ticker returns 0 on the first read (start) and the sum on the second read (stop).
        Stopwatch sum = Stopwatch.createStarted(new Ticker() {
            private boolean calledOnce = false;

            @Override
            public long read() {
                if (calledOnce) {
                    // Caveat: Ticker.read() is defined in nanoseconds, so returning
                    // milliseconds here makes the sum print in the wrong unit.
                    return w1.elapsed(TimeUnit.MILLISECONDS) + w2.elapsed(TimeUnit.MILLISECONDS);
                } else {
                    calledOnce = true;
                    return 0;
                }
            }
        });
        sum.stop();
        System.out.println(sum);
    }
}
{noformat}
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929996#comment-15929996 ]

Stefan Eissing commented on OAK-4780:
-------------------------------------
And I will make two patches, one for oak-core and one for oak-run on top of that. The second one will get its own ticket.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929993#comment-15929993 ]

Stefan Eissing commented on OAK-4780:
-------------------------------------
bq. VersionGarbageCollector.reset() can be simplified with just the remove() call. It will be a noop if the document doesn't exist.

ack.

bq. Can you please add tests for TimeInterval?

You mean the class itself or special VersionGarbageCollector tests? I assume the former.

bq. Did you consider moving the new set methods on VersionGarbageCollector to a new class (e.g. VersionGCOptions) and pass it as an argument to gc()? I think with the current patch it is possible to influence a running GC by calling one of those set methods.

That is true. Will add some such immutable options object that is part of an active run.

bq. What is the TODO about in VersionGCStats.addRun()?

At the end of all iterations, the VersionGarbageCollector returns a VersionGCStats that accumulates data from all iterations made. Only the stopwatches are currently not, as Stopwatch lacks methods to add time from other watches. And re-using the same instances in each iteration will show the incorrect data for each iteration.

bq. Usage of LimitExceededException from javax.naming is a bit funky but I guess you didn't want to invent yet another exception class ;-)

Will add a new internal exception class.

bq. VersionGarbageCollector.delayOnModification() should use Clock.waitUntil(). This allows to write efficient tests with a virtual clock.

ack.

bq. Only minor: the diff for VersionGarbageCollector also contains a couple of indentation changes for anonymous inner classes, which are unrelated to this improvement.

I will cleanup the diffs before attaching them here. Idea is sometimes overeager to reformat things.

bq. In MongoVersionGCSupport.getDeletedOnceCount(): ReadPreference.nearest().secondaryPreferred(). You cannot have both nearest and secondaryPreferred. The class will always give you a secondaryPreferred ReadPreference.

ack. will change.

bq. Minor: some unused imports in VersionGCSupport

ack.
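The immutable options object agreed on above could look roughly like the following. It is a sketch only: `GcOptionsSketch`, its two fields and their default values are hypothetical and chosen for illustration; they are not the options class that ended up in Oak.

```java
/** Hypothetical immutable options holder; names and defaults are illustrative, not Oak's API. */
final class GcOptionsSketch {
    final int collectLimit;      // max number of candidates collected per iteration
    final long maxDurationMs;    // 0 means "no limit" in this sketch

    GcOptionsSketch() {
        this(100_000, 0);        // arbitrary placeholder defaults
    }

    private GcOptionsSketch(int collectLimit, long maxDurationMs) {
        this.collectLimit = collectLimit;
        this.maxDurationMs = maxDurationMs;
    }

    /** Returns a modified copy; the receiver is never changed. */
    GcOptionsSketch withCollectLimit(int limit) {
        return new GcOptionsSketch(limit, maxDurationMs);
    }

    GcOptionsSketch withMaxDurationMs(long durationMs) {
        return new GcOptionsSketch(collectLimit, durationMs);
    }

    public static void main(String[] args) {
        GcOptionsSketch defaults = new GcOptionsSketch();
        GcOptionsSketch tuned = defaults.withCollectLimit(5_000).withMaxDurationMs(60_000);
        // A running GC that captured 'defaults' cannot be affected by building 'tuned'.
        System.out.println(defaults.collectLimit + " -> " + tuned.collectLimit);
    }
}
```

Because every `with...` call returns a new instance, a GC run that received one options object keeps exactly those settings until it finishes, which addresses the concern about setters influencing a running collection.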
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928304#comment-15928304 ]

Marcel Reutegger commented on OAK-4780:
---------------------------------------
This looks very promising. I'd like to include those changes step by step. That is, first the VersionGC part in oak-core and in a second step the new run mode for oak-run. I would even prefer if the second part goes into a separate issue.

Regarding your github branch: it contains a 'patches' directory with two diffs. What are those changes?

Some more comments:

- VersionGarbageCollector.reset() can be simplified with just the remove() call. It will be a noop if the document doesn't exist.
- Can you please add tests for TimeInterval?
- Did you consider moving the new set methods on VersionGarbageCollector to a new class (e.g. VersionGCOptions) and pass it as an argument to gc()? I think with the current patch it is possible to influence a running GC by calling one of those set methods.
- What is the TODO about in VersionGCStats.addRun()?
- Usage of LimitExceededException from javax.naming is a bit funky ;) but I guess you didn't want to invent yet another exception class
- VersionGarbageCollector.delayOnModification() should use Clock.waitUntil(). This allows writing efficient tests with a virtual clock.
- Only minor: the diff for VersionGarbageCollector also contains a couple of indentation changes for anonymous inner classes, which are unrelated to this improvement.
- In MongoVersionGCSupport.getDeletedOnceCount(): {{ReadPreference.nearest().secondaryPreferred()}}. You cannot have both nearest and secondaryPreferred. The class will always give you a secondaryPreferred ReadPreference.
- Minor: some unused imports in VersionGCSupport
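The point about Clock.waitUntil() and virtual clocks can be illustrated with a small sketch. Everything here is hypothetical: `VirtualClockSketch` and `delay()` are only loosely modelled on the idea of a clock whose `waitUntil()` advances time instead of sleeping, and they are not Oak's Clock class.

```java
/** Hypothetical virtual clock for tests; not Oak's Clock implementation. */
final class VirtualClockSketch {
    private long nowMs;

    VirtualClockSketch(long startMs) {
        this.nowMs = startMs;
    }

    long getTime() {
        return nowMs;
    }

    /** Advances virtual time to the target instead of sleeping; never moves backwards. */
    void waitUntil(long timestampMs) {
        if (timestampMs > nowMs) {
            nowMs = timestampMs;
        }
    }

    /** Stand-in for a delay step in the collector: waits via the clock, not Thread.sleep(). */
    static void delay(VirtualClockSketch clock, long delayMs) {
        clock.waitUntil(clock.getTime() + delayMs);
    }

    public static void main(String[] args) {
        VirtualClockSketch clock = new VirtualClockSketch(0);
        delay(clock, 60_000); // "waits" one minute of virtual time and returns immediately
        System.out.println("virtual time is now " + clock.getTime() + " ms");
    }
}
```

Code that waits through the clock abstraction can be tested with minutes or hours of simulated delay without the test itself taking that long.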
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904841#comment-15904841 ]

Stefan Eissing commented on OAK-4780:
-------------------------------------
Result from a run of the current version against a large customer database:

h4. Run Timings
{code}
Started: 2017-03-09 18:39:22.787
Ended: 2017-03-10 06:56:01.721
Duration: 12.28 hours
Stats: VersionGCStats{ignoredGCDueToCheckPoint=true, canceled=true, deletedDocGCCount=2011642 (of which leaf: 687880), updateResurrectedGCCount=25229600, splitDocGCCount=92, intermediateSplitDocGCCount=0, iterationCount=60, timeActive=12,28 h, timeToCollectDeletedDocs=0,000 ns, timeToCheckDeletedDocs=0,000 ns, timeToSortDocIds=0,000 ns, timeTakenToUpdateResurrectedDocs=0,000 ns, timeTakenToDeleteDeletedDocs=0,000 ns, timeTakenToCollectAndDeleteSplitDocs=0,000 ns}
{code}
_(stopwatches are not accumulative over runs atm, so they show 0 for the total)_

The RGC run was initiated by:
{code}
java -jar target/oak-run-1.8-SNAPSHOT.jar revisions mongodb://localhost/ collect
{code}
It performed 60 iterations over a little more than 12 hours (the database has ~250 million nodes of which 11% had {{_deletedOnce}} set), starting with:
{code}
18:39:22.789 INFO start 1. run (avg duration 0.0 sec)
18:39:22.792 DEBUG No lastOldestTimestamp found, querying for the oldest deletedOnce candidate
18:39:37.861 DEBUG lastOldestTimestamp found: 2016-10-20 17:47:49.999
18:39:50.491 DEBUG deletedOnce candidates: 27241631 found, 95000 preferred, scope now [2016-10-20 17:47:49.999, 2016-10-21 05:31:15.835]
...
18:39:54.469 DEBUG successful run using 0.0% of limit, raising recommended interval to 63308 seconds
18:39:54.471 INFO Revision garbage collection finished in 3,973 s. VersionGCStats{ignoredGCDueToCheckPoint=false, canceled=false, deletedDocGCCount=0 (of which leaf: 0), updateResurrectedGCCount=9, splitDocGCCount=0, intermediateSplitDocGCCount=0, iterationCount=0, timeActive=31,68 s, timeToCollectDeletedDocs=3,935 s, timeToCheckDeletedDocs=2,953 ms, timeToSortDocIds=0,000 ns, timeTakenToUpdateResurrectedDocs=19,46 ms, timeTakenToDeleteDeletedDocs=0,000 ns, timeTakenToCollectAndDeleteSplitDocs=13,27 ms}
{code}
until the last one:
{code}
06:56:01.717 INFO start 60. run (avg duration 749.118 sec)
06:56:01.717 DEBUG previous runs recommend a 68783 sec duration, scope now [2017-01-29 21:09:00.734, 2017-01-30 16:15:24.416]
06:56:01.717 WARN Ignoring RGC run because a valid checkpoint [revision: "r159ebd85640-0-4", clusterId: 4, time: "2017-01-29 21:09:00.736"] exists inside minimal scope [2017-01-29 21:09:00.734, 2017-01-29 21:10:00.734].
{code}

h4. Time Interval Handling
The runs enforced the default collection limit (10), so that no disk file sorting was necessary. Because the delete candidates were not evenly distributed, it aborted 15 runs and halved the time interval used. (Those runs did still delete leaf nodes and reset _deletedOnce for visible documents encountered, so they are not "lost".)

The time interval was again raised by a factor of 1.5 whenever the collected nodes (non-leaf) stayed below 66% of the limit. This happened initially on every run, as they encountered only nodes where _deletedOnce needed to be reset.

Intervals grew from ~9 hours to 13 days, then needed to shrink to clean up the lump of really deleted non-leaf docs, using 13.25 minutes as the shortest scope for a run. In detail:
{code}
18:39:54.469 used 0.0% of limit, rec. interval of 63308 seconds
18:39:54.485 used 0.0% of limit, rec. interval of 94963 seconds
18:39:54.497 used 0.0% of limit, rec. interval of 142444 seconds
18:39:54.509 used 0.0% of limit, rec. interval of 213667 seconds
18:44:01.693 used 0.0% of limit, rec. interval of 320500 seconds
18:44:01.924 used 0.0% of limit, rec. interval of 480750 seconds
18:52:36.998 used 0.0% of limit, rec. interval of 721126 seconds
19:11:03.765 used 0.0% of limit, rec. interval of 1081689 seconds
00:27:51.562 used 0.0% of limit, rec. interval of 1622534 seconds
01:08:25.855 used 0.0% of limit, rec. interval of 2433801 seconds
03:41:52.568 used 0.0% of limit, rec. interval of 3650701 seconds
06:37:03.949 limit exceeded, reducing interval to 762539 seconds
06:41:30.360 used 0.0% of limit, rec. interval of 1143809 seconds
06:42:27.120 limit exceeded, reducing interval to 381269 seconds
06:42:48.380 used 0.0% of limit, rec. interval of 571904 seconds
06:42:51.860 limit exceeded, reducing interval to 190634 seconds
06:43:06.386 limit exceeded, reducing interval to 95317 seconds
06:43:21.085 used 0.0% of limit, rec. interval of 142976 seconds
06:43:23.816 limit exceeded, reducing interval to 71488 seconds
06:43:40.364 used 7.8% of limit, rec. interval of 107232 seconds
06:43:43.166 limit exceeded, reducing interval to 53616 seconds
06:43:59.157 limit exceeded, reducing interval to 13404 seconds
06:43:55.608 limit
{code}
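The interval adaptation described under "Time Interval Handling" can be sketched as follows. This is a simplified model under assumptions, not the code from the patch: `IntervalTunerSketch` and its method names are hypothetical, it works in whole seconds, and it ignores details such as the minimal scope and the cap on how far a scope may extend.

```java
/** Hypothetical model of the interval adaptation; not the VersionGarbageCollector code. */
final class IntervalTunerSketch {
    static final double GROW_FACTOR = 1.5;        // "raised by a factor of 1.5"
    static final double GROW_BELOW_USAGE = 0.66;  // "stayed below 66% of the limit"

    /** Recommended interval after a run that stayed within the collect limit. */
    static long afterSuccess(long intervalSec, long collected, long limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive"); // avoids division by zero
        }
        double usage = (double) collected / limit;
        return usage < GROW_BELOW_USAGE ? Math.round(intervalSec * GROW_FACTOR) : intervalSec;
    }

    /** Interval for the retry after a run was aborted because the limit was exceeded. */
    static long afterLimitExceeded(long intervalSec) {
        return intervalSec / 2;
    }

    public static void main(String[] args) {
        long interval = 762539;
        interval = afterSuccess(interval, 0, 100);   // grows: nothing was collected
        System.out.println("after success: " + interval + " seconds");
        interval = afterLimitExceeded(762539);       // shrinks: run was aborted
        System.out.println("after abort:   " + interval + " seconds");
    }
}
```

Several steps in the log above are consistent with this model to within rounding, for example 762539 growing to 1143809 and 762539 shrinking to 381269. Steps such as 3650701 dropping to 762539 are not a plain halving, so the real implementation evidently takes more into account than this sketch does.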
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903456#comment-15903456 ]

Julian Reschke commented on OAK-4780:
-------------------------------------
I'm tracking Stefan's changes in https://github.com/reschke/jackrabbit-oak/tree/revision-garbage-collector, where I include the RDB-specific changes taken from OAK-5855.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901533#comment-15901533 ]
Stefan Eissing commented on OAK-4780:

Updated https://github.com/apache/jackrabbit-oak/compare/trunk...icing:revision-garbage-collector?expand=1 with some of the promised changes. Most relevant, this patch now adds the command 'revisions' to the oak-run suite.

{{java -jar target/oak-run-1.8-SNAPSHOT.jar revisions mongodb://host/dbname}}

Without further arguments, this just prints information about the last run, the recommended parameters and the number of delete candidates found. This is the command {{info}}, which is the default. {{collect}} runs the real revision garbage collection. There are some options, for example {{--once}} to run only a single iteration. The other command currently supported is {{reset}}, which clears all persisted meta information from the RGC.

On a sample customer base with ~250 million nodes and ~25 million delete candidates overall, a query for candidates which does not select any node runs for about 10 minutes on my developer machine on this database. My initial algorithm to find the oldest time did finish, but took over an hour. I made a mongo-specific query implementation which takes about 3 minutes to find the same result. Since this is normally only run once, this seems fine.

It now runs here with unlimited iterations. I will report back tomorrow how it went.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899710#comment-15899710 ]
Stefan Eissing commented on OAK-4780:

* I will merge current trunk when you have added your OAK-3070 changes.
* Revision.getCurrentTimestamp(): will change that to store.getClock().getTime().
* {{maxIterations}} is only good for testing; {{maxDuration}} is good when people want to run it in a maintenance window. I know the goal could be to have it run all the time, but {{maxDuration}} is for the fallback.
* {{batchDelay}} as factor: interesting. Will do.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899438#comment-15899438 ]
Marcel Reutegger commented on OAK-4780:

bq. Shall it repeat itself when it has not caught up to "now"

I'd say, yes. If needed, the GC can be canceled already.

bq. What is the best value for "precisionMs", the minimal time interval for queries?

I don't think a one minute resolution is needed. Maybe it's easier if we define how many iterations are done to find the 'oldest' _deletedOnce? But then a time is more specific than a rather abstract number of iterations.

Other comments on your patch:
- We should first resolve OAK-3070 and remove that part from your patch.
- VersionGCSupport.getOldestDeletedOnceTimestamp(long) uses System.currentTimeMillis(). It might be useful to use the Clock abstraction instead, which allows usage of a virtual clock for tests.
- Similar for VersionGarbageCollector.gc(long, TimeUnit): Revision.getCurrentTimestamp() does give you the current time of a Clock, but I think it would be better to use the clock from the DocumentNodeStore passed in the constructor.
- {{maxIterations}} and {{maxDuration}}: are those really necessary? I think it would be easier to use if those are implementation details and all you need to do is trigger gc() with a maxRevisionAge. The GC would stop iterations when it reaches currentTime - maxRevisionAge or when it is canceled.
- {{batchDelay}}: I like the feature, but would prefer a more adaptive approach. That is, have a value that defines the delay multiplier which is applied to the time it took for some operation. Let's say it took 500 ms to remove a batch of documents and the delay multiplier is 0.5, then the VGC would wait 250 ms until it proceeds to the next batch.
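The delay multiplier suggested for {{batchDelay}} can be sketched in Java as follows. This is only an illustration under stated assumptions: the class AdaptiveBatchDelay and its methods are hypothetical names, not part of Oak or of the patch, and System.nanoTime() stands in for the Clock abstraction recommended in the comment.

```java
// Sketch only: class and method names are hypothetical, not Oak API.
final class AdaptiveBatchDelay {
    private AdaptiveBatchDelay() {}

    /** Delay to wait after a batch that took batchMillis, given the delay multiplier. */
    static long delayMillis(long batchMillis, double delayFactor) {
        if (batchMillis <= 0 || delayFactor <= 0) {
            return 0; // no delay configured, or nothing measured
        }
        return Math.round(batchMillis * delayFactor);
    }

    /** Runs one batch, measures its duration, then sleeps proportionally to it. */
    static long runThrottled(Runnable batch, double delayFactor) throws InterruptedException {
        long start = System.nanoTime(); // assumption: a real implementation would use Oak's Clock
        batch.run();
        long tookMillis = (System.nanoTime() - start) / 1_000_000;
        long delay = delayMillis(tookMillis, delayFactor);
        if (delay > 0) {
            Thread.sleep(delay);
        }
        return delay;
    }
}
```

With the numbers from the comment, a 500 ms batch and a multiplier of 0.5 give a 250 ms pause, so a slower (busier) database automatically gets longer pauses between delete batches.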
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894387#comment-15894387 ]
Stefan Eissing commented on OAK-4780:

Updated my github clone with the following:
* configure {{maxIterations}} that a gc run is allowed to make (default 0 == no limit)
* configure {{maxDuration}} that a gc run might take (default 0 == no limit)
* configure {{batchDelay}} that gc shall sleep between modification batches (default 0 == no delay)

Also added a test case for cleanup in iterations.

The idea how to use these configuration parameters is:
* use {{maxIterations}} only in test setups where one wants to check the immediate results
* use {{maxDuration}} when the gc runs in a daily (weekly?) maintenance window, e.g. during the night, and shall stop iterating when working hours resume
* use {{batchDelay}} when gc shall run during busy times or all the time, e.g. on 24/7 systems. A small delay should prevent the gc from taking over the write locks (on db/table/index), depending on the database used.
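How {{maxIterations}} and {{maxDuration}} might bound a run can be sketched in Java as below. This is a minimal illustration, not the code from the patch: the class GcRunLimits, its fields and the injected clock are hypothetical, and the sleep for {{batchDelay}} is left out. A value of 0 means "no limit", as in the defaults listed above.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Sketch only: names are hypothetical; 0 means "no limit".
final class GcRunLimits {
    private final int maxIterations;
    private final long maxDurationMillis;

    GcRunLimits(int maxIterations, long maxDurationMillis) {
        this.maxIterations = maxIterations;
        this.maxDurationMillis = maxDurationMillis;
    }

    /**
     * Repeats the iteration until it reports that no work is left or a
     * configured limit is reached. Returns the number of iterations run.
     */
    int run(BooleanSupplier iteration, LongSupplier clockMillis) {
        long start = clockMillis.getAsLong();
        int done = 0;
        while (true) {
            if (maxIterations > 0 && done >= maxIterations) {
                break; // iteration limit reached
            }
            if (maxDurationMillis > 0
                    && clockMillis.getAsLong() - start >= maxDurationMillis) {
                break; // maintenance window used up
            }
            boolean moreWork = iteration.getAsBoolean();
            done++;
            if (!moreWork) {
                break; // caught up
            }
        }
        return done;
    }
}
```

Passing the clock as a parameter keeps the loop testable with a virtual clock; the duration limit is only checked between iterations, so a single long iteration can still overrun the window.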
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890290#comment-15890290 ]
Stefan Eissing commented on OAK-4780:

Here https://github.com/apache/jackrabbit-oak/compare/trunk...icing:revision-garbage-collector?expand=1 is a version 0.1 of an incremental, adapting VersionGarbageCollector strategy.

Key points:
- it tracks the lastOldestTimestamp, just like OAK-3070
- it queries only against the newly calculated timestamp (just like trunk now)
- it uses a precisionMs for the minimal time interval to use, configurable
- it enforces a maximum number of collected documents (to be sorted), configurable

The algorithm tries to find the best time interval, calculated from the last successful run or the oldest revision to be found, so that garbage collection may run without hitting the limit on the number of documents to be sorted. If the collection limit is reached, the run is stopped and the time interval for the next run is reduced by 50%. So, this shrinks very fast if it encounters a time with many positive delete candidates. It grows a bit slower when the number of collected docs stayed significantly below the limit.

It has no preferred maintenance interval. It can be called repeatedly.

Questions:
1. Shall it repeat itself when it has not caught up to "now" (minus MaxRevisionAge)?
2. What is the best value for "precisionMs", the minimal time interval for queries?
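The shrink-and-grow behaviour described for the time interval can be sketched in Java roughly as follows. This is an illustration under assumptions, not the code from the patch: the class AdaptiveInterval is hypothetical, "significantly below the limit" is assumed to mean less than half of it, and the 50% growth step is likewise an assumption. Only the halving on reaching the limit and the precisionMs lower bound are taken from the comment.

```java
// Sketch only: names, the growth threshold and the growth step are assumptions.
final class AdaptiveInterval {
    private final long precisionMs;  // minimal time interval for queries
    private final long collectLimit; // maximum number of collected documents
    private long intervalMs;

    AdaptiveInterval(long initialIntervalMs, long precisionMs, long collectLimit) {
        this.precisionMs = precisionMs;
        this.collectLimit = collectLimit;
        this.intervalMs = Math.max(precisionMs, initialIntervalMs);
    }

    long intervalMs() {
        return intervalMs;
    }

    /**
     * Adjusts the interval after a run that collected the given number of
     * documents. Returns true if the run hit the limit and must be retried.
     */
    boolean update(long collected) {
        if (collected >= collectLimit) {
            // limit reached: reduce the interval by 50%, but not below precisionMs
            intervalMs = Math.max(precisionMs, intervalMs / 2);
            return true;
        }
        if (collected * 2 < collectLimit) {
            // well below the limit (assumed: under half): grow by 50% (assumed step)
            intervalMs = intervalMs + intervalMs / 2;
        }
        return false;
    }
}
```

Because shrinking divides by 2 while growing multiplies by 1.5, the interval backs off quickly on a dense period of delete candidates and recovers more slowly afterwards.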
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875914#comment-15875914 ]
Julian Reschke commented on OAK-4780:

bq. Julian Reschke any estimation how long that "eventually" will take on a never cleaned up 140TB cluster? And what would the GC do when this strategy does not lead to success during a maintenance interval?

Too many variables in that question.

bq. Measurements from large clusters showed that the collect phase of a GC can take as long as 4 hours - only processing changes from the last day. How would this work? Let's say there are 10 million node candidates daily. How would you configure the limit to operate in this environment?

The proposal is about cases where the VGC hasn't been run regularly. If it *does* run regularly but still can't keep up, we have a different problem, right?

bq. How would the same setting work in a cluster never cleaned up (worst case, I know)?

There seem to be two choices: restrict ourselves to a maintenance window, which may mean that we'll never recover. Or allow to run beyond the maintenance window. (FWIW, we currently do not have a defined window, just an interval and the hope that one run has finished before the next is supposed to start)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875897#comment-15875897 ]
Stefan Eissing commented on OAK-4780:

[~reschke] any estimate how long that "eventually" will take on a never cleaned up 140TB cluster? And what would the GC do when this strategy does not lead to success during a maintenance interval?

Measurements from large clusters showed that the collect phase of a GC can take as long as 4 hours - only processing changes from the last day. How would this work? Let's say there are 10 million node candidates daily. How would you configure the limit to operate in this environment?

How would the same setting work in a cluster never cleaned up (worst case, I know)?
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875870#comment-15875870 ]
Julian Reschke commented on OAK-4780:

Here's an approach that might be simpler but in the end achieves the same goal:
- set a limit for the collection phase, both for elapsed time and # of documents
- when the limit is reached, sort the collected IDs by modified date, and compute a new upper limit so that half of the documents become out of range; throw these entries away as well
- continue the collection with the smaller time window (this just needs an internal API that allows to specify the _id to start with)
- compute a new limit for elapsed time (half of the original?)

Eventually, we should have a set of documents that we *can* garbage collect.

Finally, if the maintenance window is still open, just rerun the GC again.
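The "half of the documents become out of range" step can be sketched in Java as below. This is only one possible reading of the proposal: the class CollectionWindow and its methods are hypothetical names, candidates are represented by their modified timestamps alone, and the new upper limit is taken to be the median timestamp (exclusive).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: names are hypothetical; candidates are reduced to their modified timestamps.
final class CollectionWindow {
    private CollectionWindow() {}

    /** New exclusive upper limit on the modified time: the median of the collected timestamps. */
    static long halvedUpperBound(List<Long> modified) {
        if (modified.isEmpty()) {
            throw new IllegalArgumentException("no candidates collected");
        }
        List<Long> sorted = new ArrayList<>(modified);
        Collections.sort(sorted);
        return sorted.get(sorted.size() / 2);
    }

    /** Keeps only the candidates modified before the new upper limit; the rest are thrown away. */
    static List<Long> keepBelow(List<Long> modified, long upperBound) {
        List<Long> kept = new ArrayList<>();
        for (long m : modified) {
            if (m < upperBound) {
                kept.add(m);
            }
        }
        return kept;
    }
}
```

One caveat of this reading: if many candidates share the same modified value, the cut can remove more than half of them (in the extreme case all of them), so a real implementation would need a fallback for that situation.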
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874699#comment-15874699 ]
Stefan Eissing commented on OAK-4780:

Assuming there is the following in place:
1. Time/Revision of last successful run (as patch proposed in OAK-3070)
2. Reset of _deletedOnce flags for nodes that have become visible/valid again (no ticket)
3. Remove leaf nodes early (implemented via OAK-5571)

An incremental strategy needs to adapt to various situations:

Case A: the RGC starts and finds the Timestamp(TS)/Revision from the last successful run
* A.1: the TS is from last night, which should be the 99% case
* A.2: the TS is from several days/weeks/months before. Possible causes:
  - a) the cluster was down for that time period
  - b) the cluster was down every time the RGC was supposed to run (e.g. other system maintenance colliding with the daily maintenance window)
  - c) the cluster is in a state where the RGC never finishes successfully
  - d) a backup of the database was restored

Case B: the RGC starts and does not find any data from a previous run. That means
* B.1: it's a new installation and it runs the first time
* B.2: it's an updated installation where an earlier version of the RGC
  - a) has run successfully before
  - b) never did run (successfully)

The question is when to apply which strategy to make the RGC reach the stable case A.1 again.

Assume a common customer is one who has his system running 24/7 (more or less) and has configured a periodic "maintenance window" (every day/night or every weekend) where tasks such as RGC are supposed to run. During this time interval, load on the system by RGC is acceptable. At other times it is not.

The degenerate case is where the system produces so much to clean up that the maintenance time interval is insufficient to perform the necessary tasks. The installation needs either a larger maintenance window or more CPU/database power or another RGC strategy than the one discussed here. --> TODO

Let's focus on how to do the task under the assumption that it is possible to do A.1 in the time given, plus a safety margin of XX% for fluctuations. And that XX% needs to be large enough to allow the system to recover from the other scenarios.

Example: the maintenance window has a safety margin of 100%, meaning that its duration is twice as large as the average work that needs to be done by the RGC after a maintenance period (let's assume daily). A 100% margin means that the RGC can do 2 days of work during one maintenance run. Looking at the scenarios above, this means in
* A.1: RGC is finished in half the time on average
* A.2.a: RGC will catch up all work, next run will find the A.1 case
* A.2.b: RGC is able to catch up one day per run. If it did not run for N days, it will take N-1 days to reach A.1
* A.2.c: panic mode, red alert
* A.2.d: will reach A.1 on the next run
* B.1: will reach A.1 on the next run
* B.2.a: could reach A.1, question is how
* B.2.b: will need several maintenance runs to catch up to A.1, depending on the amount of nodes to clean up, but how exactly?

And then there is the case where the 100% margin on maintenance time is only valid on *average* and there can be more cleanup work after a busy day. It could be more work than what is possible to do in one maintenance run.

Folding A and B cases: If the TS information is missing, the RGC should select the oldest (_modified) node with _deletedOnce and use that as TS (maybe an hour before that). TODO: find out the cost of such a query on our large scale test clusters

With A/B folded, the RGC starts with a last TS and a MaxRevisionAge setting that gives a time duration TD. In the normal case A.1, TD < 24 hours (or whatever the maintenance interval is). In the abnormal cases, this can be several times as long.

Assuming the maintenance time is configured correctly, the chunks that can be safely tackled by RGC are 1 maintenance interval. For ease of description, let's assume this to be a day. So, we have

{code}
MP := Maintenance Period (e.g. 24 hours)
MD := Allowed Maintenance Duration (e.g. 4 hours)
MStart := time RGC started
MEnd := MStart + MD
LMax := max duration of run (0 at start)

while Now() < MEnd - LMax:
    TS := as persisted or oldest _deletedOnce node
    TD := MStart - MaxRevisionAge - TS
    TM := Min( TD, MP )
    collectAndInspectNodes({ _deletedOnce: 1, TS < _modified < ( TS + TM ) })
    save TS := TS + TM
    LMax := Max( LMax, duration of this run)
{code}

This tries to clean up nodes in the time interval of 1 maintenance period, repeatedly, until the time for maintenance runs out (by taking the max time of each iteration into account).

Example: the daily maintenance has 2 hours, the last run was 3 days ago. In the first iteration, the RGC would collect nodes del
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871632#comment-15871632 ]
Julian Reschke commented on OAK-4780:

To cover the case of a VGC that has never run, or one that hasn't run for a long time, we'll need a way to constrain the scope of the scan to a smaller time window.

One method would be, as previously proposed, to time-box the collection phase, to track the lastmod range upon reaching the limit, to give up and try with a smaller time window.

An alternative might be to extend the {{VersionGCSupport}} even more and to allow just returning the *count* of candidate nodes. With that information, we might be able to shrink the time window of the candidate scan *before* actually trying to collect the nodes.
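The count-first alternative can be sketched in Java as follows. This is a rough illustration under assumptions: no such counting method is confirmed to exist in {{VersionGCSupport}}, so the count query is passed in as a function, and the class CountFirstWindow, its parameters and the halving step are hypothetical.

```java
import java.util.function.LongBinaryOperator;

// Sketch only: the counting function stands in for a hypothetical "count candidates" query.
final class CountFirstWindow {
    private CountFirstWindow() {}

    /**
     * Halves the window [fromMs, untilMs) until the counter reports at most
     * maxCandidates, or the window is no larger than minWindowMs.
     * Returns the (exclusive) upper bound to use for the actual collection.
     */
    static long shrinkUntil(long fromMs, long untilMs, long maxCandidates,
                            long minWindowMs, LongBinaryOperator countCandidates) {
        long upper = untilMs;
        while (upper - fromMs > minWindowMs
                && countCandidates.applyAsLong(fromMs, upper) > maxCandidates) {
            upper = fromMs + (upper - fromMs) / 2; // try half the window next
        }
        return upper;
    }
}
```

Each halving costs one count query, so the number of extra queries grows only logarithmically with the length of the backlog; whether a count query is cheap enough on a large repository is exactly the open question in this thread.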
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871594#comment-15871594 ]
Julian Reschke commented on OAK-4780:

[~stefan.eissing] - OAK-5571 improves the situation, but IMHO doesn't completely reach the goal. Optimally, the version GC will "eventually" be able to complete garbage collection, no matter for how long it did not run before.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871558#comment-15871558 ]
Stefan Eissing commented on OAK-4780:

[~reschke] With OAK-5571 committed in trunk, this one can be closed, right?
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849819#comment-15849819 ]
Stefan Eissing commented on OAK-4780:

+1 for OAK-5571

In regards to resilience, the VGC needs improvements when it
# encounters and needs to cope with a huge amount of delete candidates
# needs to work "during the day" where other vital tasks also need capacity

One such improvement was already discussed, namely tagging the repository with the timestamp of the last successful run and only collecting delete candidates in that time interval.

Another idea is to adjust the collected time interval so that the amount of collected nodes can be kept in check. This can be done either:
# statically, by setting a max time interval to clean. Example: max is 1.5 days, last clean was 5 days ago, so collect only nodes modified between -5 days and -3.5 days.
# dynamically, by storing the proposed time interval for collection in the repository. Collect nodes in that interval since the last cleanup. On reaching a threshold, e.g. 100.000, abort collecting, shrink the time interval and try again. When it runs through, process the next interval. Compare the amount collected each time and grow the time interval if there is room.

Another idea would be to sleep() in between delete batches, in order to avoid spamming the database.
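The static variant (a fixed maximum time interval per run) amounts to a small date calculation, sketched here in Java. The class StaticWindow and its parameters are hypothetical names; capping the window at "now minus maxRevisionAge" is an assumption added so that a run never reaches into revisions that are still too young to collect.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch only: names are hypothetical; the cap at now - maxRevisionAge is an assumption.
final class StaticWindow {
    private StaticWindow() {}

    /**
     * End of the next collection window: lastRun + maxInterval, but never
     * later than now - maxRevisionAge. The window starts at lastRun.
     */
    static Instant windowEnd(Instant lastRun, Duration maxInterval,
                             Instant now, Duration maxRevisionAge) {
        Instant newestCollectable = now.minus(maxRevisionAge);
        Instant end = lastRun.plus(maxInterval);
        return end.isAfter(newestCollectable) ? newestCollectable : end;
    }
}
```

With the example from the comment (maximum 1.5 days, last clean 5 days ago), the window runs from 5 days ago to 3.5 days ago; later runs move the window forward until it reaches the cap.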
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849614#comment-15849614 ]
Julian Reschke commented on OAK-4780:

As the proposed change really is not about making things run incrementally, I have opened a separate ticket for *this* specific change: OAK-5571.
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828487#comment-15828487 ] Julian Reschke commented on OAK-4780: - OK, so here's a question: why can't we delete leaf nodes that have previous documents eagerly (as long as we make sure we remove the previous documents later)? > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff, leafnodes-v2.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828482#comment-15828482 ] Julian Reschke commented on OAK-4780: - Yep, mine. As long as it's not committed... > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff, leafnodes-v2.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828405#comment-15828405 ] Stefan Eissing commented on OAK-4780: - [~mreutegg], thanks for the review. I clearly lay the blame of the System.err.println() on Julian's patch. Will look tomorrow into the other things. > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff, leafnodes-v2.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828372#comment-15828372 ] Marcel Reutegger commented on OAK-4780: --- [~stefan.eissing], thanks for the patch. A couple of comments. - There's a System.err.println() in GCJob.gc() - I would prefer Guava's collection factory methods instead of the diamond operator. That makes it easier to backport changes to branches that are still on Java 6. I'm not suggesting we should backport this change to 1.2. This is just a general note. - The check whether a document references previous documents can be simplified to: {{doc.getPreviousRanges().isEmpty()}}. - Please add tests for the new functionality > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff, leafnodes-v2.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828227#comment-15828227 ] Stefan Eissing commented on OAK-4780: - Ah, yes. I worked in the change so that split leaf nodes do not get deleted right away. > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff, leafnodes-v2.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827799#comment-15827799 ] Marcel Reutegger commented on OAK-4780: --- The VGC will abort when an operation fails with an exception, yes. Orphaned previous documents will get collected in a next run when the VGC collects old previous documents. See {{VersionGCSupport.deleteSplitDocuments()}}. > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827723#comment-15827723 ] Julian Reschke commented on OAK-4780: - [~mreutegg] - another question - am I right that if any of the delete operations fails, the whole VGC will abort? What would happen with "orphaned" prev docs then? > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827716#comment-15827716 ] Stefan Eissing commented on OAK-4780: - Since Julian is currently busy with other things, I will give it a shot, take his patch and continue along those lines. I plan to keep the whole thing single-threaded, as several people said that parallel deletes against DBs like mongo do not benefit performance. > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827717#comment-15827717 ] Julian Reschke commented on OAK-4780: - Yes, work in progress - something else came up and thus I posted what I currently have. The tricky part is to actually take advantage of the leaf node characteristics without making the code change too intrusive. > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827707#comment-15827707 ] Marcel Reutegger commented on OAK-4780: --- I like the idea of deleting leaf documents eagerly. AFAICS, the patch still deletes leaf documents in the last phase of the garbage collection. Is this just work in progress? The garbage collector could remove a batch of leaf documents as soon as it reaches a threshold and it could collect those paths in an in-memory set. To simplify things, it is probably best to only consider leaf documents without links to previous documents. The garbage collector must only remove previous documents when the associated main document was successfully deleted (i.e. no concurrent modification of the document). > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > Attachments: leafnodes.diff > > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
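The batching idea in the comment above (collect leaf paths in an in-memory set, delete them once a threshold is reached, and leave everything else to the regular last phase) might look roughly like the following. This is only a sketch under stated assumptions: `EagerLeafBatcher`, `Candidate`, and the `deleter` callback are hypothetical names invented for illustration and are not part of Oak's `VersionGarbageCollector`; a real implementation would also have to handle concurrent modification of the documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical helper, not Oak API: deletes simple leaf candidates in batches. */
final class EagerLeafBatcher {

    /** Hypothetical stand-in for a candidate document (not Oak's NodeDocument). */
    static final class Candidate {
        final String path;
        final boolean hasChildren;
        final boolean hasPreviousDocs;

        Candidate(String path, boolean hasChildren, boolean hasPreviousDocs) {
            this.path = path;
            this.hasChildren = hasChildren;
            this.hasPreviousDocs = hasPreviousDocs;
        }
    }

    private final int threshold;
    private final Consumer<List<String>> deleter;
    // In-memory set of leaf paths waiting for the next batch delete.
    private final Set<String> leafBatch = new LinkedHashSet<>();
    // Everything else is left for the regular (ordered) deletion phase.
    private final List<String> deferred = new ArrayList<>();

    EagerLeafBatcher(int threshold, Consumer<List<String>> deleter) {
        this.threshold = threshold;
        this.deleter = deleter;
    }

    /** Called once per candidate during the collection phase. */
    void offer(Candidate c) {
        // Only leaf documents without links to previous documents are eager.
        if (!c.hasChildren && !c.hasPreviousDocs) {
            leafBatch.add(c.path);
            if (leafBatch.size() >= threshold) {
                flush();
            }
        } else {
            deferred.add(c.path);
        }
    }

    /** Deletes whatever is currently batched; call once more at the end. */
    void flush() {
        if (leafBatch.isEmpty()) {
            return;
        }
        deleter.accept(new ArrayList<>(leafBatch));
        leafBatch.clear();
    }

    List<String> deferred() {
        return deferred;
    }
}
```

Because each batch is deleted as soon as it fills up, an interrupted run would still have removed the leaf documents seen so far, which is the property the issue is after.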
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821886#comment-15821886 ] Julian Reschke commented on OAK-4780: - Maybe we can shortcut certain operations for nodes where _children != true? In those cases, deletion order really doesn't matter, right? (We just counted nodes on a large instance, and approximately 1/3 of the nodes with _deletedOnce == true were leaf nodes) > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4780) VersionGarbageCollector should be able to run incrementally
[ https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818348#comment-15818348 ] Julian Reschke commented on OAK-4780: - ...or by adding logic that pushes the lastmod boundary if the collection phase finds too many deletion candidates... > VersionGarbageCollector should be able to run incrementally > --- > > Key: OAK-4780 > URL: https://issues.apache.org/jira/browse/OAK-4780 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Julian Reschke > > Right now, the documentmk's version garbage collection runs in several phases. > It first collects the paths of candidate nodes, and only once this has been > successfully finished, starts actually deleting nodes. > This can be a problem when the regularly scheduled garbage collection is > interrupted during the path collection phase, maybe due to other maintenance > tasks. On the next run, the number of paths to be collected will be even > bigger, thus making it even more likely to fail. > We should think about a change in the logic that would allow the GC to run in > chunks; maybe by partitioning the path space by top level directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
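One possible reading of "pushing the lastmod boundary" in the comment above is to shrink the time window a single run covers until the number of candidates is manageable. The sketch below assumes exactly that interpretation; `LastModBoundary`, `choose`, and the `countOlderThan` callback are hypothetical names, not Oak API, and the halving strategy is just one simple choice.

```java
import java.util.function.LongUnaryOperator;

/** Hypothetical helper, not Oak API: picks a lastmod boundary for one GC run. */
final class LastModBoundary {

    private LastModBoundary() {
    }

    /**
     * Returns a boundary in (oldest, latest] such that the number of candidates
     * with lastmod below it is at most maxCandidates, if halving the window can
     * achieve that.
     *
     * @param oldest         lastmod of the oldest candidate (assumed known)
     * @param latest         the boundary the GC would normally use
     * @param maxCandidates  how many candidates one run should handle at most
     * @param countOlderThan assumed callback: number of candidates below a boundary
     */
    static long choose(long oldest, long latest, long maxCandidates,
                       LongUnaryOperator countOlderThan) {
        long upper = latest;
        // Halve the window until the candidate count fits or it cannot shrink further.
        while (upper - oldest > 1 && countOlderThan.applyAsLong(upper) > maxCandidates) {
            upper = oldest + (upper - oldest) / 2;
        }
        return upper;
    }
}
```

Candidates above the returned boundary would simply be picked up by a later run, so each run stays bounded in size.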