Hi Andreas,

Without looking at any more detail, I noticed that the flag name in your invocation appears to be misspelled: "--trheads=16" should probably be "--threads=16", unless it is misspelled in the implementation.
Please confirm that this is NOT the issue. Thanks!

Regards
Julian

On Fri, May 24, 2024 at 8:21 PM Andreas Schaefer <schaef...@me.com.invalid> wrote:
>
> Hi
>
> I already posted a question on Jackrabbit Users but did not get a response so
> far. That said we changed the approach and ran into new issues with the
> TarBall Compaction.
>
> Our Segment Store on an AEM 6.5.6 (oak.core v. 1.22.4) is about 700GB and was
> not compacted for many years.
>
> We tested that we can run the compaction with 1.62.0 without any side-effects
> and so we started it this way with JDK 11:
>
> java \
> -Dtar.memoryMapped=true \
> -Doak.compaction.eagerFlush=true \
> -Dlogback.configurationFile=logback-compaction.xml \
> -jar oak-run-1.62.0.jar \
> --compactor=parallel \
> --trheads=16 \
> <path to segment store>
>
> This started pretty well until about 15% compacted and then came to a crawl
> where only one process is actually running.
>
> Thread Dump:
>
> "pool-2-thread-2" #23 prio=5 os_prio=0 cpu=1837679.95ms elapsed=61999.36s tid=0x00007ecf0d913800 nid=0x1835e6 waiting on condition [0x00007ecec0afd000]
>    java.lang.Thread.State: WAITING (parking)
>     at jdk.internal.misc.Unsafe.park(java.base@11.0.22/Native Method)
>     - parking to wait for <0x0000000451000178> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.park(java.base@11.0.22/LockSupport.java:194)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.22/AbstractQueuedSynchronizer.java:2081)
>     at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.22/LinkedBlockingQueue.java:433)
>     at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.22/ThreadPoolExecutor.java:1054)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1114)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run(java.base@11.0.22/Thread.java:834)
>
> "pool-2-thread-3" #24 prio=5 os_prio=0 cpu=34785318.09ms elapsed=61999.29s tid=0x00007ecf0d914800 nid=0x1835e8 runnable [0x00007ece7bffd000]
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.ThreadLocal.get(java.base@11.0.22/ThreadLocal.java:163)
>     at java.lang.StringCoding.decodeUTF8(java.base@11.0.22/StringCoding.java:723)
>     at java.lang.StringCoding.decode(java.base@11.0.22/StringCoding.java:257)
>     at java.lang.String.<init>(java.base@11.0.22/String.java:507)
>     at java.lang.String.<init>(java.base@11.0.22/String.java:561)
>     at org.apache.jackrabbit.oak.segment.data.SegmentDataV12.getSignature(SegmentDataV12.java:88)
>     at org.apache.jackrabbit.oak.segment.Segment.<init>(Segment.java:201)
>     at org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:300)
>     at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:512)
>     at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$95/0x00000008001ff840.call(Unknown Source)
>     at org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>     at org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache$$Lambda$96/0x00000008001ffc40.call(Unknown Source)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4938)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3576)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2318)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2191)
>     - locked <0x00000006f028d408> (a org.apache.jackrabbit.guava.common.cache.LocalCache$StrongAccessEntry)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.get(LocalCache.java:2081)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache.get(LocalCache.java:4019)
>     at org.apache.jackrabbit.guava.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4933)
>     at org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.getSegment(SegmentCache.java:160)
>     at org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:512)
>     at org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:153)
>     - locked <0x00000006f028d218> (a org.apache.jackrabbit.oak.segment.SegmentId)
>     at org.apache.jackrabbit.oak.segment.CachingSegmentReader$1.apply(CachingSegmentReader.java:105)
>     at org.apache.jackrabbit.oak.segment.CachingSegmentReader$1.apply(CachingSegmentReader.java:101)
>     at org.apache.jackrabbit.oak.segment.ReaderCache.get(ReaderCache.java:117)
>     at org.apache.jackrabbit.oak.segment.CachingSegmentReader.readString(CachingSegmentReader.java:101)
>     at org.apache.jackrabbit.oak.segment.MapRecord.getEntries(MapRecord.java:400)
>     at org.apache.jackrabbit.oak.segment.MapRecord$2.iterator(MapRecord.java:384)
>     at org.apache.jackrabbit.guava.common.collect.FluentIterable$2$$Lambda$105/0x0000000800229440.apply(Unknown Source)
>     at org.apache.jackrabbit.guava.common.collect.Iterators$6.transform(Iterators.java:829)
>     at org.apache.jackrabbit.guava.common.collect.TransformedIterator.next(TransformedIterator.java:52)
>     at org.apache.jackrabbit.guava.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1405)
>     at org.apache.jackrabbit.oak.plugins.memory.EmptyNodeState.compareAgainstEmptyState(EmptyNodeState.java:159)
>     at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:504)
>     at org.apache.jackrabbit.oak.segment.ClassicCompactor$CompactDiff.diff(ClassicCompactor.java:166)
>     at org.apache.jackrabbit.oak.segment.ClassicCompactor.compact(ClassicCompactor.java:113)
>     at org.apache.jackrabbit.oak.segment.ClassicCompactor.compact(ClassicCompactor.java:101)
>     at org.apache.jackrabbit.oak.segment.ParallelCompactor$CompactionTree.lambda$compactAsync$1(ParallelCompactor.java:208)
>     at org.apache.jackrabbit.oak.segment.ParallelCompactor$CompactionTree$$Lambda$106/0x000000080022f840.call(Unknown Source)
>     at java.util.concurrent.FutureTask.run(java.base@11.0.22/FutureTask.java:264)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1128)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run(java.base@11.0.22/Thread.java:834)
>
> "pool-2-thread-4" #25 prio=5 os_prio=0 cpu=1831855.41ms elapsed=61999.26s tid=0x00007ecf0d916000 nid=0x1835e9 waiting on condition [0x00007ece7befe000]
>    java.lang.Thread.State: WAITING (parking)
>     at jdk.internal.misc.Unsafe.park(java.base@11.0.22/Native Method)
>     - parking to wait for <0x0000000451000178> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.park(java.base@11.0.22/LockSupport.java:194)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.22/AbstractQueuedSynchronizer.java:2081)
>     at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.22/LinkedBlockingQueue.java:433)
>     at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.22/ThreadPoolExecutor.java:1054)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1114)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run(java.base@11.0.22/Thread.java:834)
>
> All threads besides 'pool-2-thread-3' are blocked / waiting like 2 and 4.
>
> The log output is also just reporting thread-3 compaction:
>
> 17:55:16.733 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - compacted 343500000 nodes, 2634495864 properties, 30105613 binaries in 65065178 ms. 17% complete.
> 17:58:18.665 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - compacted 343650000 nodes, 2634645864 properties, 30105613 binaries in 65247081 ms. 17% complete.
> 18:01:19.141 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - compacted 343800000 nodes, 2634795864 properties, 30105613 binaries in 65427586 ms. 17% complete.
> 18:04:16.164 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - compacted 343950000 nodes, 2634945864 properties, 30105613 binaries in 65604609 ms. 17% complete.
>
> Any idea why all other threads are blocked?
>
> Any suggestions on how to speed it up?
>
> Could we use TAIL to incrementally compact the segment store?
>
> Kind Regards - Andreas Schaefer
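
To put a number on the slowdown, the progress lines quoted above can be compared against the run's overall average. The following is a minimal Python sketch, not part of oak-run: the function names and the regular expression are assumptions made for illustration, and the sample lines are copied from the log excerpt with the mail wrapping removed.

```python
import re

# Progress lines copied from the quoted log excerpt (mail wrapping removed).
LOG_LINES = [
    "17:55:16.733 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - "
    "compacted 343500000 nodes, 2634495864 properties, 30105613 binaries in 65065178 ms. 17% complete.",
    "17:58:18.665 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - "
    "compacted 343650000 nodes, 2634645864 properties, 30105613 binaries in 65247081 ms. 17% complete.",
    "18:01:19.141 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - "
    "compacted 343800000 nodes, 2634795864 properties, 30105613 binaries in 65427586 ms. 17% complete.",
    "18:04:16.164 [pool-2-thread-3] INFO o.a.j.oak.segment.file.FileStore - "
    "compacted 343950000 nodes, 2634945864 properties, 30105613 binaries in 65604609 ms. 17% complete.",
]

# Hypothetical pattern, written to match the message format seen in the excerpt.
PROGRESS = re.compile(r"compacted (\d+) nodes, \d+ properties, \d+ binaries in (\d+) ms")


def parse_progress(line):
    """Return (nodes_compacted, elapsed_ms) for a progress line, or None."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def samples(lines):
    """Collect the (nodes, elapsed_ms) pairs from all lines that match."""
    return [s for s in (parse_progress(line) for line in lines) if s is not None]


def interval_rates(lines):
    """Nodes per second between each pair of consecutive progress lines."""
    points = samples(lines)
    rates = []
    for (n0, t0), (n1, t1) in zip(points, points[1:]):
        seconds = (t1 - t0) / 1000.0
        if seconds > 0:
            rates.append((n1 - n0) / seconds)
    return rates


def average_rate(lines):
    """Nodes per second averaged over the whole run, up to the first sample."""
    nodes, elapsed_ms = samples(lines)[0]
    return nodes / (elapsed_ms / 1000.0)


if __name__ == "__main__":
    for rate in interval_rates(LOG_LINES):
        print(f"recent interval: {rate:.1f} nodes/s")
    print(f"run average:     {average_rate(LOG_LINES):.1f} nodes/s")
```

On the four lines above, this gives roughly 825 to 847 nodes per second for the recent intervals, against about 5,280 nodes per second averaged over the run so far. In other words, the single remaining thread is working at about one sixth of the earlier average rate.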