[ https://issues.apache.org/jira/browse/LUCENE-8557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Smiley updated LUCENE-8557:
---------------------------------
    Attachment: LUCENE-8557.patch

> LeafReader.getFieldInfos should always return the same instance
> ---------------------------------------------------------------
>
>                 Key: LUCENE-8557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8557
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: LUCENE-8557.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Most LeafReader implementations cache a FieldInfos instance and return it
> from LeafReader.getFieldInfos(). A few currently do not, which can cause
> performance problems.
> The most notable example is the lack of caching in Solr's
> SlowCompositeReaderWrapper, which caused unexpected performance slowdowns
> when using Solr's JSON Facets compared to the legacy facets.
> This proposed change is mostly relevant to Solr but touches a few Lucene
> classes. Specifically:
> *1.* Adds a check to TestUtil.checkReader to verify that
> LeafReader.getFieldInfos() returns the same instance:
>
> {code:java}
> // FieldInfos should be cached at the reader and always return the same instance
> if (reader.getFieldInfos() != reader.getFieldInfos()) {
>   throw new RuntimeException("getFieldInfos() returned different instances for class: "+reader.getClass());
> }
> {code}
> I'm not entirely sure this is wanted or needed, but adding it uncovered most
> of the other LeafReader implementations that were not caching FieldInfos.
> I'm happy to remove this part of the patch though.
>
> *2.* Adds a FieldInfos.EMPTY that can be used in a handful of places:
>
> {code:java}
> public final static FieldInfos EMPTY = new FieldInfos(new FieldInfo[0]);
> {code}
> There are several places in the Lucene/Solr tests that were creating empty
> instances of FieldInfos, which caused the check in #1 to fail. This
> fixes those failures and cleans up the code a bit.
> *3.* Fixes a few LeafReader implementations that were not caching FieldInfos.
> Specifically:
> * *MemoryIndex.MemoryIndexReader* - The constructor was already looping over
> the fields, so it seemed natural to just create the FieldInfos at that time.
> * *SlowCompositeReaderWrapper* - This was the one causing me trouble. I've
> moved the caching of FieldInfos from SolrIndexSearcher to
> SlowCompositeReaderWrapper.
> * *CollapsingQParserPlugin.ReaderWrapper* - getFieldInfos() is immediately
> called twice after this is constructed.
> * *ExpandComponent.ReaderWrapper* - getFieldInfos() is immediately called
> twice after this is constructed.
>
> *4.* Minor Solr tweak to avoid calling SolrIndexSearcher.getSlowAtomicReader
> in FacetFieldProcessorByHashDV. This change is now optional since
> SlowCompositeReaderWrapper caches FieldInfos.
>
> As suggested by [~dsmiley], this takes the place of SOLR-12878 since it
> touches some Lucene code.
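
For readers without the patch at hand, the caching pattern and the identity check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: StubFieldInfos, StubLeafReader, UncachedReader, and CachingReader are hypothetical stand-ins invented for this example, not Lucene or Solr classes, and the code is not taken from LUCENE-8557.patch.

```java
import java.util.Arrays;
import java.util.List;

public class FieldInfosCachingSketch {

    /** Hypothetical stand-in for Lucene's FieldInfos; not the real class. */
    static final class StubFieldInfos {
        final List<String> fieldNames;

        StubFieldInfos(List<String> fieldNames) {
            this.fieldNames = fieldNames;
        }
    }

    /** Hypothetical stand-in for the one LeafReader method discussed in the issue. */
    interface StubLeafReader {
        StubFieldInfos getFieldInfos();
    }

    /** The problem case: a new object is built on every call. */
    static final class UncachedReader implements StubLeafReader {
        private final List<String> fields;

        UncachedReader(List<String> fields) {
            this.fields = fields;
        }

        @Override
        public StubFieldInfos getFieldInfos() {
            return new StubFieldInfos(fields);
        }
    }

    /** The fixed case: the object is built once in the constructor and reused. */
    static final class CachingReader implements StubLeafReader {
        private final StubFieldInfos fieldInfos;

        CachingReader(List<String> fields) {
            this.fieldInfos = new StubFieldInfos(fields);
        }

        @Override
        public StubFieldInfos getFieldInfos() {
            return fieldInfos;
        }
    }

    /** Reference-identity comparison, in the spirit of the check in item 1. */
    static boolean returnsSameInstance(StubLeafReader reader) {
        return reader.getFieldInfos() == reader.getFieldInfos();
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "title");
        System.out.println(returnsSameInstance(new UncachedReader(fields))); // prints false
        System.out.println(returnsSameInstance(new CachingReader(fields)));  // prints true
    }
}
```

The comparison uses == rather than equals() on purpose: the concern in the issue is that callers receive the very same object on repeated calls, not merely an equivalent one.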