[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248119#comment-17248119 ]
Michael Froh edited comment on LUCENE-9621 at 12/11/20, 6:55 PM:
-----------------------------------------------------------------

I added a {{printStackTrace}} to {{onTragicEvent}} and got the following:
{code:java}
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:125)
    at org.apache.lucene.index.FieldInfos$Builder.finish(FieldInfos.java:645)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:291)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:660)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3899)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853)
    at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:499)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
{code}
This is the leak that I called out and fixed in https://issues.apache.org/jira/browse/LUCENE-9617. If we add documents and call {{deleteAll}} on the same {{IndexWriter}} repeatedly, it leaks field numbers and tries allocating a huge array in {{FieldInfos}} to accommodate the largest known field number.
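The leak described above can be illustrated with a toy model in plain Java. This is only a sketch with no Lucene dependency: {{ToyFieldNumbers}}, {{clearLeaky}}, {{clearAndReset}} and {{simulate}} are hypothetical names invented for illustration, and the assumption that the registry's counter survives a clear is a reading of the comment, not a copy of Lucene's code. The actual fix is in LUCENE-9617.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a field-number leak; not Lucene's actual classes. */
final class FieldNumberLeakSketch {

  /** Hypothetical stand-in for a writer-wide field name to number registry. */
  static final class ToyFieldNumbers {
    private final Map<String, Integer> numberByName = new HashMap<>();
    private int lowestUnassigned = 0;

    /** Returns the number already assigned to the name, or assigns the next free one. */
    int addOrGet(String name) {
      Integer existing = numberByName.get(name);
      if (existing != null) {
        return existing;
      }
      int number = lowestUnassigned++;
      numberByName.put(name, number);
      return number;
    }

    /** Assumed leaky behaviour: the names are forgotten but the counter is kept. */
    void clearLeaky() {
      numberByName.clear();
    }

    /** Assumed fixed behaviour: the counter is reset together with the names. */
    void clearAndReset() {
      numberByName.clear();
      lowestUnassigned = 0;
    }
  }

  /** An array indexed by field number needs one slot per number up to the largest. */
  static int byNumberArrayLength(int largestFieldNumber) {
    return largestFieldNumber + 1;
  }

  /** Adds the same single field and clears the registry, repeated {@code cycles} times. */
  static int simulate(int cycles, boolean leaky) {
    ToyFieldNumbers numbers = new ToyFieldNumbers();
    int lastAssigned = -1;
    for (int i = 0; i < cycles; i++) {
      lastAssigned = numbers.addOrGet("body");
      if (leaky) {
        numbers.clearLeaky();
      } else {
        numbers.clearAndReset();
      }
    }
    return byNumberArrayLength(lastAssigned);
  }

  public static void main(String[] args) {
    System.out.println("leaky clear, 1000 cycles: array length " + simulate(1000, true));
    System.out.println("reset clear, 1000 cycles: array length " + simulate(1000, false));
  }
}
```

In this model only one field is ever live, yet with the leaky clear the array length grows with the number of add/clear cycles, while with the resetting clear it stays at 1.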
> pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
> --------------------------------------------------------------
>
>                 Key: LUCENE-9621
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9621
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 8.6.3
>            Reporter: Michael Froh
>            Priority: Major
>
> While implementing a test to trigger an OutOfMemoryError on flush() in
> https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was
> followed by an assertion failure on rollback with the following stacktrace:
> {code:java}
> java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc
>     at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0)
>     at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398)
>     at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196)
>     at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186)
>     at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932)
>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874)
>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853)
>     at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496)
> {code}
> We should probably look into how exactly we behave with this kind of tragedy
> on flush().