[ https://issues.apache.org/jira/browse/OAK-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Mueller resolved OAK-9184. --------------------------------- Resolution: Fixed > Very slow, potential endless loop in LucenePropertyIndex.loadDocs() > ------------------------------------------------------------------- > > Key: OAK-9184 > URL: https://issues.apache.org/jira/browse/OAK-9184 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene > Affects Versions: 1.22.4, 1.32.0, 1.8.23 > Reporter: Thomas Mueller > Assignee: Thomas Mueller > Priority: Major > Fix For: 1.34.0, 1.22.5, 1.8.24 > > > We found one case of a very slow, possibly endless loop in this method. > {noformat} > java.lang.Thread.State: RUNNABLE at > org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85) > at org.apache.lucene.search.Scorer.score(Scorer.java:65) at > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621) at > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491) at > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448) at > org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243) at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353) > {noformat} > Background: Many threads are waiting to lock here: > {noformat} > - waiting to lock <0x00007fbd804a9448> (a > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.acquireIndexNode(IndexTracker.java:202) > public IndexNode acquireIndexNode(String path) { > IndexNodeManager index = indices.get(path); > IndexNode indexNode = index != null ? index.acquire() : null; > if (indexNode != null) { > return indexNode; > } else { > return findIndexNode(path); <<= synchronized method > } > } > {noformat} > And the thread that is holding that (synchronized) lock on IndexTracker is > also waiting to get a write lock here: > {noformat} > "oak-lucene-992" #21559 daemon prio=1 os_prio=0 tid=0x00007ed0280ad000 > nid=0x1be8 waiting on condition [0x00007fb96c352000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00007fbedad77210> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexNodeManager.close(IndexNodeManager.java:174) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.close(IndexTracker.java:104) > - locked <0x00007fbd804a9448> (a > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:114) > - locked <0x00007fbd804a9448> (a > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker) > void close() throws IOException { > lock.writeLock().lock(); <<== waiting here > try { > checkState(!closed); > closed = true; > } finally { > lock.writeLock().unlock(); > } > {noformat} > It looks like a read lock is not released. The places where a read lock is > acquired: > {noformat} > IndexNode acquire() { > lock.readLock().lock(); > (released if the method returns null or throws an exception) > private void release() { > lock.readLock().unlock(); > } > {noformat} > acquire() is called in acquireIndexNode: > {noformat} > public IndexNode acquireIndexNode(String path) { > IndexNodeManager index = indices.get(path); > IndexNode indexNode = index != null ? index.acquire() : null; <<== > here > if (indexNode != null) { > return indexNode; > } else { > return findIndexNode(path); > } > } > private synchronized IndexNode findIndexNode(String path) { > // Retry the lookup from acquireIndexNode now that we're > // synchronized. The acquire() call is guaranteed to succeed > // since the close() method is also synchronized. > IndexNodeManager index = indices.get(path); > if (index != null) { > IndexNode indexNode = index.acquire(); <<== here > return checkNotNull(indexNode); > } > {noformat} > This method is called in multiple places (getPlanDescription, getPlans, > loadDocs, getSize, getOldFullTextIndexPath, dumpIndexContent, getFieldInfo, > getFieldTermPrefixInfo, getIndexedPaths, getIndexStats) but always released > in "finally". > There is one thread that seems to hold this read lock: > {noformat} > "10.13.0.73 [1597750655085] GET > /mnt/overlay/granite/ui/content/shell/omnisearch/searchresults.html HTTP/1.1" > #10953 prio=5 os_prio=0 tid=0x00007fb97000f000 nid=0x8b59 runnable > [0x00007fb96c44c000] > java.lang.Thread.State: RUNNABLE > at > org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85) > at org.apache.lucene.search.Scorer.score(Scorer.java:65) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448) > at > org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:339) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor$1.hasNext(LucenePropertyIndex.java:1718) > at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor.hasNext(LucenePropertyIndex.java:1751) > at > org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:432) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:824) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:851) > at com.google.common.collect.Iterators$5.hasNext(Iterators.java:542) > at > org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.fetchNext(FilterIterators.java:141) > at > org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.hasNext(FilterIterators.java:155) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:98) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:123) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:78) > at > org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702) > at > org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.size(PrefetchIterator.java:135) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$3.getSize(QueryResultImpl.java:143) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl.getRows(QueryResultImpl.java:140) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)