Thomas Mueller created OAK-12193:
------------------------------------
Summary: Large numbers of deleted Lucene documents cause
incremental indexing to consume too much memory
Key: OAK-12193
URL: https://issues.apache.org/jira/browse/OAK-12193
Project: Jackrabbit Oak
Issue Type: Improvement
Components: indexing, lucene
Reporter: Thomas Mueller

When Oak runs an incremental indexing cycle that contains a large number of
deletions, it can consume a lot of memory.
The cause is the accumulated Lucene document deletions: each deletion is sent
to the Lucene index writer of every index that covers the path of the deleted
node, regardless of whether that index matches the node type of the deleted
node. Because there can be many indexes, the cumulative memory use can easily
overwhelm the JVM. In the heap dump below, the DeleteSlice of a single index
writer retains 863,858,200 bytes (23.98%).
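To illustrate the fan-out described above, here is a minimal, self-contained sketch. It is not Oak code and not the change proposed in this issue: IndexDef, DeletedNode and countBufferedDeletes are hypothetical names, and node type matching is simplified to an exact match (no inheritance or mixins). It only counts how many delete operations would be buffered across all writers with and without a node type check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeleteFanOutSketch {

    /** Hypothetical, simplified stand-in for an index definition (not Oak's own class). */
    static final class IndexDef {
        final String name;
        final String includedPath;
        final Set<String> nodeTypes;

        IndexDef(String name, String includedPath, String... nodeTypes) {
            this.name = name;
            this.includedPath = includedPath;
            this.nodeTypes = new HashSet<>(Arrays.asList(nodeTypes));
        }

        /** True if the given path is the included path or lies below it. */
        boolean coversPath(String path) {
            return includedPath.equals("/")
                    || path.equals(includedPath)
                    || path.startsWith(includedPath + "/");
        }

        /** Assumption: exact type match only; real node type checks also consider inheritance. */
        boolean indexesType(String nodeType) {
            return nodeTypes.contains(nodeType);
        }
    }

    /** Hypothetical deleted node: its path and its primary node type. */
    static final class DeletedNode {
        final String path;
        final String nodeType;

        DeletedNode(String path, String nodeType) {
            this.path = path;
            this.nodeType = nodeType;
        }
    }

    /**
     * Counts the delete operations buffered across all index writers.
     * With filterByNodeType == false, every index covering the path receives the
     * delete (the behaviour described in this issue). With true, indexes that do
     * not index the node's type are skipped.
     */
    static long countBufferedDeletes(List<IndexDef> indexes, List<DeletedNode> deleted,
                                     boolean filterByNodeType) {
        long buffered = 0;
        for (DeletedNode node : deleted) {
            for (IndexDef index : indexes) {
                if (!index.coversPath(node.path)) {
                    continue;
                }
                if (filterByNodeType && !index.indexesType(node.nodeType)) {
                    continue;
                }
                buffered++;
            }
        }
        return buffered;
    }

    public static void main(String[] args) {
        List<IndexDef> indexes = Arrays.asList(
                new IndexDef("idxFiles", "/content", "nt:file"),
                new IndexDef("idxFolders", "/content", "nt:folder"),
                new IndexDef("idxAll", "/", "nt:unstructured"));
        List<DeletedNode> deleted = Arrays.asList(
                new DeletedNode("/content/a", "nt:unstructured"),
                new DeletedNode("/content/b", "nt:file"),
                new DeletedNode("/tmp/c", "nt:unstructured"));

        long unfiltered = countBufferedDeletes(indexes, deleted, false);
        long filtered = countBufferedDeletes(indexes, deleted, true);
        // prints "unfiltered=7 filtered=3"
        System.out.println("unfiltered=" + unfiltered + " filtered=" + filtered);
    }
}
```

With three deleted nodes and three indexes the difference is small, but the unfiltered count grows with (number of deletions) x (number of covering indexes), which matches the memory growth described in this issue.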
{noformat}
Example thread:
org.apache.jackrabbit.oak.plugins.index.lucene.writer.MultiplexingIndexWriter
org.apache.jackrabbit.oak.plugins.index.search.CompositePropertyUpdateCallback
org.apache.jackrabbit.oak.plugins.index.search.spi.binary.FulltextBinaryTextExtractor
org.apache.jackrabbit.oak.plugins.index.lucene.directory.DefaultDirectoryFactory
org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriterFactory
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext
Thread name: async-index-update-async
 memory bytes  percentage  class @ instance
2,516,736,256  69.87%  java.lang.Thread @ 0x6a52565d0 async-index-update-async Thread
  879,048,256  24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext @ 0x6be46e860
  879,047,904  24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.writer.MultiplexingIndexWriter @ 0x6be46e8e0
  879,047,864  24.41%  java.util.HashMap @ 0x6be46e908
  879,047,816  24.41%  java.util.HashMap$Node[16] @ 0x6be46e938
  879,047,736  24.41%  java.util.HashMap$Node @ 0x6be46e988
  879,047,704  24.41%  org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter @ 0x6be46e9a8
  879,047,648  24.41%  org.apache.lucene.index.IndexWriter @ 0x6be46e9e0
  870,569,576  24.17%  org.apache.lucene.index.ThreadAffinityDocumentsWriterThreadPool @ 0x6be477680
  870,568,880  24.17%  org.apache.lucene.index.DocumentsWriterPerThreadPool$ThreadState @ 0x6be477888
  870,568,816  24.17%  org.apache.lucene.index.DocumentsWriterPerThread @ 0x6be4778c8
  863,858,200  23.98%  org.apache.lucene.index.DocumentsWriterDeleteQueue$DeleteSlice @ 0x6be4f4020
    4,381,272   0.12%  org.apache.lucene.index.BufferedUpdates @ 0x6be4b3978
    2,325,704   0.06%  org.apache.lucene.index.DocFieldProcessor @ 0x6be477a50
{noformat}