Thomas Mueller created OAK-12193:
------------------------------------

             Summary: Large numbers of deleted Lucene documents cause 
incremental indexing to consume too much memory
                 Key: OAK-12193
                 URL: https://issues.apache.org/jira/browse/OAK-12193
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: indexing, lucene
            Reporter: Thomas Mueller


When an incremental indexing cycle in Oak contains a large number of 
deletions, it can consume a lot of memory.

The issue is caused by accumulated Lucene document deletions. Each deletion is 
sent to the Lucene index writer of every index that covers the path of the 
deleted node, regardless of whether that index matches the node's type. Since 
there can be many such indexes, the cumulative memory use can easily overwhelm 
the JVM.
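
The fan-out described above can be illustrated with a small, self-contained 
model. This is only a sketch and not Oak code: IndexDef, DeletedNode, 
bufferedDeletes and the sample data are hypothetical names made up for this 
illustration. It counts how many delete operations end up buffered when 
deletions are routed by path alone, compared with routing by path and node 
type. The node-type filter is shown purely for comparison; the issue does not 
state that this is the intended fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified model of the deletion fan-out. This is NOT Oak code:
// every class and method name here is hypothetical.
final class DeleteFanOutSketch {

    // Stand-in for an index definition: the path it covers and the node types it indexes.
    static final class IndexDef {
        final String includedPath;
        final Set<String> nodeTypes;

        IndexDef(String includedPath, Set<String> nodeTypes) {
            this.includedPath = includedPath;
            this.nodeTypes = nodeTypes;
        }

        // True if the given path is the included path or lies below it.
        boolean coversPath(String path) {
            return includedPath.equals("/")
                    || path.equals(includedPath)
                    || path.startsWith(includedPath + "/");
        }
    }

    // Stand-in for a deleted node: its path and its node type.
    static final class DeletedNode {
        final String path;
        final String nodeType;

        DeletedNode(String path, String nodeType) {
            this.path = path;
            this.nodeType = nodeType;
        }
    }

    // Counts the delete operations buffered across all index writers.
    // With filterByNodeType == false, only the path decides (the behaviour
    // described in the issue); with true, the node type must match as well.
    static long bufferedDeletes(List<IndexDef> indexes, List<DeletedNode> deleted,
                                boolean filterByNodeType) {
        long count = 0;
        for (DeletedNode node : deleted) {
            for (IndexDef index : indexes) {
                if (!index.coversPath(node.path)) {
                    continue;
                }
                if (filterByNodeType && !index.nodeTypes.contains(node.nodeType)) {
                    continue;
                }
                count++;
            }
        }
        return count;
    }

    // Hypothetical setup: three indexes cover /content, only one of them indexes nt:file.
    static List<IndexDef> sampleIndexes() {
        return List.of(
                new IndexDef("/content", Set.of("nt:unstructured")),
                new IndexDef("/content", Set.of("nt:unstructured")),
                new IndexDef("/content", Set.of("nt:file")),
                new IndexDef("/other", Set.of("nt:file")));
    }

    // Hypothetical workload: n deleted nt:file nodes below /content.
    static List<DeletedNode> sampleDeletes(int n) {
        List<DeletedNode> deleted = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            deleted.add(new DeletedNode("/content/node" + i, "nt:file"));
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<IndexDef> indexes = sampleIndexes();
        List<DeletedNode> deleted = sampleDeletes(1000);
        // prints 3000: every deletion reaches all three indexes covering /content
        System.out.println("path match only:     " + bufferedDeletes(indexes, deleted, false));
        // prints 1000: only the nt:file index receives the deletions
        System.out.println("path and type match: " + bufferedDeletes(indexes, deleted, true));
    }
}
```

In this toy setup the number of buffered deletions grows with the number of 
indexes covering the path, which mirrors the multiplication of memory use 
described above.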

{noformat}
Example thread:

org.apache.jackrabbit.oak.plugins.index.lucene.writer.MultiplexingIndexWriter
org.apache.jackrabbit.oak.plugins.index.search.CompositePropertyUpdateCallback
org.apache.jackrabbit.oak.plugins.index.search.spi.binary.FulltextBinaryTextExtractor
org.apache.jackrabbit.oak.plugins.index.lucene.directory.DefaultDirectoryFactory
org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriterFactory
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext

Thread name: async-index-update-async

memory bytes    percentage      class @ instance
2,516,736,256   69.87%  java.lang.Thread @ 0x6a52565d0 async-index-update-async Thread
879,048,256     24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext @ 0x6be46e860
879,047,904     24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.writer.MultiplexingIndexWriter @ 0x6be46e8e0
879,047,864     24.41%  java.util.HashMap @ 0x6be46e908
879,047,816     24.41%  java.util.HashMap$Node[16] @ 0x6be46e938
879,047,736     24.41%  java.util.HashMap$Node @ 0x6be46e988
879,047,704     24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter @ 0x6be46e9a8
879,047,648     24.41%  org.apache.lucene.index.IndexWriter @ 0x6be46e9e0
870,569,576     24.17%  org.apache.lucene.index.ThreadAffinityDocumentsWriterThreadPool @ 0x6be477680
870,568,880     24.17%  org.apache.lucene.index.DocumentsWriterPerThreadPool$ThreadState @ 0x6be477888
870,568,816     24.17%  org.apache.lucene.index.DocumentsWriterPerThread @ 0x6be4778c8
863,858,200     23.98%  org.apache.lucene.index.DocumentsWriterDeleteQueue$DeleteSlice @ 0x6be4f4020
4,381,272       0.12%   org.apache.lucene.index.BufferedUpdates @ 0x6be4b3978
2,325,704       0.06%   org.apache.lucene.index.DocFieldProcessor @ 0x6be477a50
 {noformat}





--
This message was sent by Atlassian Jira
(v8.20.10#820010)
