[ https://issues.apache.org/jira/browse/LUCENE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nik Everett updated LUCENE-6046:
--------------------------------
    Attachment: LUCENE-6046.patch

Next version with fixes based on Mike's feedback.

> RegExp.toAutomaton high memory use
> ----------------------------------
>
>                 Key: LUCENE-6046
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6046
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 4.10.1
>            Reporter: Lee Hinman
>            Assignee: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-6046.patch, LUCENE-6046.patch, LUCENE-6046.patch
>
>
> When creating an automaton from an org.apache.lucene.util.automaton.RegExp,
> it's possible for the automaton to use so much memory it exceeds the maximum
> array size for java.
> The following caused an OutOfMemoryError with a 32gb heap:
> {noformat}
> new RegExp("\\[\\[(Datei|File|Bild|Image):[^]]*alt=[^]|}]{50,200}").toAutomaton();
> {noformat}
> When increased to a 60gb heap, the following exception is thrown:
> {noformat}
>   1> java.lang.IllegalArgumentException: requested array size 2147483624 exceeds maximum array in java (2147483623)
>   1>     __randomizedtesting.SeedInfo.seed([7BE81EF678615C32:95C8057A4ABA5B52]:0)
>   1>     org.apache.lucene.util.ArrayUtil.oversize(ArrayUtil.java:168)
>   1>     org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:295)
>   1>     org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:639)
>   1>     org.apache.lucene.util.automaton.Operations.determinize(Operations.java:741)
>   1>     org.apache.lucene.util.automaton.MinimizationOperations.minimizeHopcroft(MinimizationOperations.java:62)
>   1>     org.apache.lucene.util.automaton.MinimizationOperations.minimize(MinimizationOperations.java:51)
>   1>     org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:477)
>   1>     org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:426)
> {noformat}
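
For readers wondering why the trace ends up in Operations.determinize: the sketch below is a rough, Lucene-free illustration of how subset construction can blow up when a bounded repetition follows an unanchored prefix. It is not the code from the patch and does not use the regex from the report. The toy pattern (a|b)*a(a|b){n}, the class name DeterminizeBlowup and the method countDfaStates are all stand-ins chosen for this example. The NFA has n+2 states, but the determinized automaton has to remember which of the last n+1 symbols were 'a', so the reachable state count doubles with every increment of n.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
class DeterminizeBlowup {

    // Counts the DFA states reached by subset construction for the toy
    // pattern (a|b)*a(a|b){n}. NFA states are numbered 0..n+1 and a set of
    // NFA states is stored as the bits of a long, so n must stay below 62.
    static int countDfaStates(int n) {
        Set<Long> seen = new HashSet<>();
        ArrayDeque<Long> work = new ArrayDeque<>();

        long start = 1L;                               // only NFA state 0 is active
        long chainMask = ((1L << (n + 1)) - 1) & ~1L;  // NFA states 1..n

        seen.add(start);
        work.add(start);

        while (!work.isEmpty()) {
            long current = work.poll();

            // States 1..n each advance by one on either symbol; the final
            // state n+1 has no outgoing transitions and drops out.
            long advanced = (current & chainMask) << 1;

            // State 0 loops on both symbols and is therefore always active.
            long onB = advanced | 1L;
            // On 'a', state 0 additionally starts a new counter at state 1.
            long onA = advanced | 1L | 2L;

            if (seen.add(onA)) {
                work.add(onA);
            }
            if (seen.add(onB)) {
                work.add(onB);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 4, 8, 16}) {
            System.out.println("n=" + n + " nfaStates=" + (n + 2)
                    + " dfaStates=" + countDfaStates(n));
        }
    }
}
```

The toy case grows as 2^(n+1). The regex in the report has a larger alphabet and a {50,200} range, so its exact growth will differ, but the same mechanism (many overlapping counters alive at once) is a plausible reading of why addTransition ends up asking for an array beyond the maximum size.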