rishabhdaim commented on code in PR #1328:
URL: https://github.com/apache/jackrabbit-oak/pull/1328#discussion_r1511357714

##########
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##########
@@ -32,6 +37,7 @@ import java.util.concurrent.TimeUnit;
 import java.util.concurrent.TimeoutException;
 import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.stream.Collectors;

Review Comment:
   This import is not used in the code. Please remove it.

##########
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##########
@@ -74,15 +89,18 @@ public class RevisionsCommand implements Command {
     private static final Logger LOG = LoggerFactory.getLogger(RevisionsCommand.class);

+    private static final int REVISION_CAP = getInteger("oak.revision.cap", 250);
+
     private static final String USAGE = Joiner.on(System.lineSeparator()).join(
             "revisions {<jdbc-uri> | <mongodb-uri>} <sub-command> [options]",
             "where sub-command is one of",
             " info give information about the revisions state without performing",
             " any modifications",
-            " collect perform garbage collection",
-            " reset clear all persisted metadata",
-            " sweep clean up uncommitted changes",
-            " detailedGC perform detailed garbage collection i.e. remove unmerged branch commits, old revisions, deleted properties etc"
+            " collect perform garbage collection",
+            " reset clear all persisted metadata",
+            " sweep clean up uncommitted changes",
+            " detailedGC perform detailed garbage collection i.e. remove unmerged branch commits, old revisions, deleted properties etc",
+            " pathCleanup clean up old/unused revisions and unmerged branch commits on a specific path"

Review Comment:
   We are removing all of the detailed garbage here, not only old revisions and unmerged branch commits. I would change this description to reflect that the command performs detailedGC on the given path.

##########
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##########
@@ -460,6 +479,55 @@ private void sweep(RevisionsOptions options, Closer closer)
         SweepHelper.sweep(store, new RevisionContextWrapper(ns, clusterId), seeker);
     }

+    private void pathCleanup(RevisionsOptions options, Closer closer) throws IOException {
+        String path = options.getPath();
+        if (path == null || path.isEmpty()) {
+            System.err.println("path option is required for " + RevisionsOptions.CMD_CLEANUP + " command");
+            return;
+        }
+
+        DocumentNodeStoreBuilder<?> builder = createDocumentMKBuilder(options, closer);
+        if (builder == null) {
+            System.err.println("revisions mode only available for DocumentNodeStore");
+            return;
+        }
+
+        DocumentStore documentStore = builder.getDocumentStore();
+        builder.setReadOnlyMode();
+        useMemoryBlobStore(builder);
+        DocumentNodeStore documentNodeStore = builder.build();
+        String id = org.apache.jackrabbit.oak.plugins.document.util.Utils.getIdFromPath(path);
+        NodeDocument workingDocument = documentStore.find(NODES, id);
+        //NodeDocumentRevisionCleaner revisionCleaner = new NodeDocumentRevisionCleaner(documentNodeStore, workingDocument);
+
+        VersionGarbageCollector gc = bootstrapVGC(options, closer, true);
+        // Set a LoggingGC monitor that will output the detailedGC operations
+        gc.setGCMonitor(new LoggingGCMonitor(LOG));
+        // Run detailedGC on the given document
+        UpdateOp op = gc.collectGarbageOnDocument(documentNodeStore, workingDocument);

Review Comment:
   Returning `UpdateOp` from detailedGC and handling the deletion here is not a good idea, because we might need to duplicate the code written in `DetailedGC.removeGarbage()`, or we might miss something. In this case, we are missing orphan nodes, because they are handled differently in `detailedGC.removeGarbage()`. I would delegate both the collection and the removal of garbage to DetailedGC itself.

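   To illustrate the delegation suggested above, here is a minimal, self-contained sketch. It does not use the real Oak API: `DelegationSketch`, `Doc`, `GcSketch` and `collectAndRemove` are hypothetical names, and the way orphans are treated (a whole-document delete instead of an update) is only an assumption made to show a second removal path. The point is that the caller asks the collector to do both steps and never handles an intermediate update operation, so no removal path can be skipped by the command code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these types are real Oak classes.
public class DelegationSketch {

    /** Toy document: an id plus the kinds of garbage it carries. */
    static final class Doc {
        final String id;
        final boolean orphaned;
        final int staleRevisions;

        Doc(String id, boolean orphaned, int staleRevisions) {
            this.id = id;
            this.orphaned = orphaned;
            this.staleRevisions = staleRevisions;
        }
    }

    /** Toy collector that owns both collection and removal. */
    static final class GcSketch {
        private final List<String> pending = new ArrayList<>();

        // Single entry point for callers: collect, then remove, in one place.
        List<String> collectAndRemove(Doc doc) {
            collect(doc);
            return removeGarbage();
        }

        private void collect(Doc doc) {
            if (doc.orphaned) {
                // Assumption for this sketch: orphans are deleted as a whole,
                // which a caller working from an update op alone would miss.
                pending.add("delete " + doc.id);
            } else if (doc.staleRevisions > 0) {
                pending.add("update " + doc.id + " dropping " + doc.staleRevisions + " revisions");
            }
        }

        // Applies everything collected so far and returns what was applied.
        private List<String> removeGarbage() {
            List<String> applied = new ArrayList<>(pending);
            pending.clear();
            return applied;
        }
    }

    public static void main(String[] args) {
        GcSketch gc = new GcSketch();
        System.out.println(gc.collectAndRemove(new Doc("2:/content/a", false, 3)));
        System.out.println(gc.collectAndRemove(new Doc("2:/content/b", true, 0)));
    }
}
```

   With this shape, `pathCleanup` would only look up the document and call a single method on the collector, and any future change to how garbage is removed stays inside the collector.
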
##########
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##########
@@ -23,6 +23,11 @@ import java.io.IOException;
 import java.util.List;
 import java.util.Locale;
+import java.util.Map;
+import java.util.Scanner;
+import java.util.SortedMap;
+import java.util.TreeMap;
+import java.util.TreeSet;

Review Comment:
   Please remove these unused imports.