[ https://issues.apache.org/jira/browse/SOLR-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673290#comment-16673290 ]
Tim Underwood commented on SOLR-12878:
--------------------------------------

Sure. I've updated the pull request with what I'm currently playing with: [https://github.com/apache/lucene-solr/pull/473]

There are currently 3 commits in there:

1 - The original FacetFieldProcessorByHashDV.java change to avoid calling getSlowAtomicReader
2 - The change requested by [~dsmiley] to move the caching of FieldInfos from SolrIndexSearcher to SlowCompositeReaderWrapper
3 - Adding a check in TestUtil.checkReader to verify that LeafReader.getFieldInfos() returns a cached copy, along with the changes required to make that pass. Specifically, there are several places that construct an empty FieldInfos instance, so I just created a static FieldInfos.EMPTY instance that can be referenced. Also, MemoryIndexReader needed to be modified to cache a copy of its FieldInfos. The constructor was already looping over the fields, so I just added it there (rather than creating it lazily).

What are your thoughts on #3? Is it a good idea to require LeafReader instances to cache their FieldInfos? Something like this seems to be a common pattern across the codebase (both Lucene and Solr):

{code:java}
reader.getFieldInfos().fieldInfo(field)
{code}

So it might be desirable to make sure FieldInfos isn't always being recomputed?

I'm still verifying that all LeafReader.getFieldInfos() implementations perform the caching and that all tests pass (I'm seeing a few failures, but they seem unrelated).

> FacetFieldProcessorByHashDV is reconstructing FieldInfos on every instantiation
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-12878
>                 URL: https://issues.apache.org/jira/browse/SOLR-12878
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Facet Module
>   Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>              Labels: performance
>             Fix For: 7.6, master (8.0)
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The FacetFieldProcessorByHashDV constructor is currently calling:
> {noformat}
> FieldInfo fieldInfo = fcontext.searcher.getSlowAtomicReader().getFieldInfos().fieldInfo(sf.getName());
> {noformat}
> Which is reconstructing FieldInfos each time. Simply switching it to:
> {noformat}
> FieldInfo fieldInfo = fcontext.searcher.getFieldInfos().fieldInfo(sf.getName());
> {noformat}
> causes it to use the cached version of FieldInfos in the SolrIndexSearcher.
> On my index the FacetFieldProcessorByHashDV is 2-3 times slower than the legacy facets without this fix.
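
To make the difference between recomputing and caching concrete, here is a minimal, self-contained sketch. It does not use the real Lucene or Solr classes: FieldInfosSketch, RecomputingReader, and CachingReader are hypothetical stand-ins invented for illustration, and the build counter exists only to make the cost visible. The caching variant builds its field metadata once in the constructor, roughly in the spirit of what the comment describes for MemoryIndexReader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a FieldInfos-like object (not the Lucene class).
// buildCount records how many times an instance has been constructed.
final class FieldInfosSketch {
    static int buildCount = 0;

    private final Map<String, Integer> numberByName = new LinkedHashMap<>();

    FieldInfosSketch(String... fieldNames) {
        buildCount++;
        for (String name : fieldNames) {
            // Assign field numbers in order of first appearance.
            numberByName.putIfAbsent(name, numberByName.size());
        }
    }

    // Returns the field number, or -1 if the field is unknown.
    int fieldInfo(String name) {
        return numberByName.getOrDefault(name, -1);
    }
}

// Hypothetical reader that rebuilds its field metadata on every call.
final class RecomputingReader {
    private final String[] fieldNames;

    RecomputingReader(String... fieldNames) {
        this.fieldNames = fieldNames;
    }

    FieldInfosSketch getFieldInfos() {
        return new FieldInfosSketch(fieldNames); // new instance each time
    }
}

// Hypothetical reader that builds its field metadata once, eagerly,
// and hands back the same instance on every call.
final class CachingReader {
    private final FieldInfosSketch fieldInfos;

    CachingReader(String... fieldNames) {
        this.fieldInfos = new FieldInfosSketch(fieldNames);
    }

    FieldInfosSketch getFieldInfos() {
        return fieldInfos; // cached instance
    }
}
```

With these stand-ins, calling getFieldInfos().fieldInfo(name) in a loop constructs one FieldInfosSketch per iteration on RecomputingReader but only one in total on CachingReader. An identity check such as reader.getFieldInfos() == reader.getFieldInfos() is one simple way a test could detect the non-caching case; whether TestUtil.checkReader does exactly that is not stated here.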