[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091714#comment-17091714 ]
Mike Drob commented on SOLR-14428: ---------------------------------- bq. Maybe there is an elegant way to store a stripped down FuzzyQuery in the cache? I think this runs into problems with auto warming based on the contents of the cache when we open a new searcher. Possibly need a way to rebuild it afterwards, which feels like we're going backwards from LUCENE-9068. Related, [~romseygeek] - should [BASE_RAM_BYTES|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/FuzzyQuery.java#L58] use {{FuzzyQuery.class}} instead of {{AutomatonQuery.class}}? > FuzzyQuery has severe memory usage in 8.5 > ----------------------------------------- > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Affects Versions: 8.5, 8.5.1 > Reporter: Colvin Cowie > Assignee: Andrzej Bialecki > Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed: 648855 > {noformat} > ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org