Tim Underwood created LUCENE-8557:
-------------------------------------

             Summary: LeafReader.getFieldInfos should always return the same 
instance
                 Key: LUCENE-8557
                 URL: https://issues.apache.org/jira/browse/LUCENE-8557
             Project: Lucene - Core
          Issue Type: Bug
    Affects Versions: 7.5
            Reporter: Tim Underwood


Most implementations of the LeafReader cache an instance of FieldInfos which is 
returned in the LeafReader.getFieldInfos() method.  There are a few places that 
currently do not and this can cause performance problems.

The most notable example is the lack of caching in Solr's 
SlowCompositeReaderWrapper which caused unexpected performance slowdowns when 
trying to use Solr's JSON Facets compared to the legacy facets.

This proposed change is mostly relevant to Solr but touches a few Lucene 
classes.  Specifically:

*1.* Adds a check to TestUtil.checkReader to verify that 
LeafReader.getFieldInfos() returns the same instance:

 
{code:java}
// FieldInfos should be cached at the reader and always return the same instance
 if (reader.getFieldInfos() != reader.getFieldInfos()) {
 throw new RuntimeException("getFieldInfos() returned different instances for 
class: "+reader.getClass());
 }
{code}
I'm not entirely sure this is wanted or needed but adding it uncovered most of 
the other LeafReader implementations that were not caching FieldInfos.  I'm 
happy to remove this part of the patch though.

 

*2.* Adds a FieldInfos.EMPTY that can be used in a handful of places

 
{code:java}
public final static FieldInfos EMPTY = new FieldInfos(new FieldInfo[0]);
{code}
There are several places in the Lucene/Solr tests that were creating empty 
instances of FieldInfos which were causing the check in #1 to fail.  This fixes 
those failures and cleans up the code a bit.

*3.* Fixes a few LeafReader implementations that were not caching FieldInfos

Specifically:
 * *MemoryIndex.MemoryIndexReader* - The constructor was already looping over 
the fields so it seemed natural to just create the FieldInfos at that time
 * *SlowCompositeReaderWrapper* - This was the one causing me trouble.  I've 
moved the caching of FieldInfos from SolrIndexSearcher to 
SlowCompositeReaderWrapper.
 * *CollapsingQParserPlugin.ReaderWrapper* - getFieldInfos() is immediately 
called twice after this is constructed
 * *ExpandComponent.ReaderWrapper* - getFieldInfos() is immediately called 
twice after this is constructed

 

*4.* Minor Solr tweak to avoid calling SolrIndexSearcher.getSlowAtomicReader in 
FacetFieldProcessorByHashDV.  This change is now optional since 
SlowCompositeReaderWrapper caches FieldInfos.

 

As suggested by [~dsmiley] this takes the place of SOLR-12878 since it touches 
some Lucene code.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to