[ https://issues.apache.org/jira/browse/OAK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358413#comment-14358413 ]
Marcel Reutegger commented on OAK-2557:
---------------------------------------

I have a couple of suggestions:

In VersionGarbageCollector:
{noformat}
if (log.isDebugEnabled() && docIdsToDelete.size < 1000) {
{noformat}
Rather call {{docIdsToDelete.getSize()}}? Maybe even promote NodeDocIdCollector to a top-level class to avoid breaking encapsulation?

NodeDocIdCollector.sort() uses a hard-coded Comparator when sorting in memory instead of the instance passed in the constructor.

NodeDocIdCollector.flushToFile() uses {{PrintWriter.println()}}. This method does not throw an IOException when the write fails. I think it would be better to use BufferedWriter directly.

VersionGCState.close() deletes the directory before resources are closed. I think this will fail on Windows-based machines.

> VersionGC uses way too much memory if there is a large pile of garbage
> ----------------------------------------------------------------------
>
>                 Key: OAK-2557
>                 URL: https://issues.apache.org/jira/browse/OAK-2557
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, mongomk
>   Affects Versions: 1.0.11
>            Reporter: Stefan Egli
>            Assignee: Chetan Mehrotra
>            Priority: Blocker
>             Fix For: 1.0.13, 1.2
>
>         Attachments: OAK-2557-2.patch, OAK-2557.patch
>
>
> It has been noticed that on a system where revision-gc
> (VersionGarbageCollector of mongomk) did not run for a few days (due to not
> interfering with some tests/large bulk operations) that there was such a
> large pile of garbage accumulating, that the following code
> {code}
> VersionGarbageCollector.collectDeletedDocuments
> {code}
> in the for loop, creates such a large list of NodeDocuments to delete
> (docIdsToDelete) that it uses up too much memory, causing the JVM's GC to
> constantly spin in Full-GCs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
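
To illustrate the {{PrintWriter.println()}} point from the comment above, here is a minimal, self-contained sketch. It is not taken from the patch: the class name WriteFailureDemo and the FailingWriter helper are hypothetical and only simulate a write that fails. The sketch shows that PrintWriter records the failure internally (visible only through checkError()), while BufferedWriter lets the IOException reach the caller.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

class WriteFailureDemo {

    /** Hypothetical writer that fails on every write, standing in for e.g. a full disk. */
    static class FailingWriter extends Writer {
        @Override
        public void write(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void flush() {
            // nothing buffered here
        }

        @Override
        public void close() {
            // nothing to release
        }
    }

    /** Returns true if PrintWriter hid the failure and only reports it via checkError(). */
    static boolean printWriterSwallowsFailure() {
        PrintWriter pw = new PrintWriter(new FailingWriter());
        pw.println("doc-id-1");   // the IOException is caught inside PrintWriter
        return pw.checkError();   // flushes, then reports the recorded error state
    }

    /** Returns true if BufferedWriter propagated the failure as an IOException. */
    static boolean bufferedWriterThrows() {
        BufferedWriter bw = new BufferedWriter(new FailingWriter());
        try {
            bw.write("doc-id-1");
            bw.newLine();
            bw.flush();           // pushes the buffer to the failing writer
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("PrintWriter swallowed failure: " + printWriterSwallowsFailure());
        System.out.println("BufferedWriter threw IOException: " + bufferedWriterThrows());
    }
}
```

With PrintWriter, a caller that never checks checkError() would carry on as if every id had been written, which is presumably why the comment prefers BufferedWriter for flushToFile().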