[jira] [Commented] (SOLR-10732) potential optimizations in callers of SolrIndexSearcher.numDocs when docset is empty

Michael Gibney (Jira) Tue, 08 Dec 2020 11:35:35 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246106#comment-17246106
 ]


Michael Gibney commented on SOLR-10732:
---------------------------------------

I'm curious, [~munendrasn] -- were you able to measure a performance benefit 
with these changes? As far as I can tell, the optimizations sit in places where 
they only affect edge cases, and the query-building they prevent (if I'm 
reading right) is generally pretty lightweight (e.g., {{TermQuery}} ...).

It seems like it makes most sense to optimize this kind of thing either at the 
leaf level (i.e., in {{SolrIndexSearcher.numDocs(...)}} -- already done in 
SOLR-10727) or maybe also higher up in the program logic, to prune as much 
execution as possible (and when it's clearer how/why we got to the point of having 
an empty domain). The changes here seem to be building in mid-level "shot in 
the dark" safeguards, where it's relatively unclear what's going on.
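
To illustrate the leaf-level option, here is a minimal sketch using a toy stand-in rather than Solr's real classes. {{LeafGuardSketch}}, its {{numDocs}} signature, and the {{expensiveCalls}} counter are all hypothetical, invented for this example; the actual guard added by SOLR-10727 may look different. The point is only that a single check at the leaf covers every caller:

```java
import java.util.Set;
import java.util.function.IntPredicate;

/** Toy stand-in for a searcher; NOT Solr's SolrIndexSearcher. */
final class LeafGuardSketch {
    /** Counts how often the (hypothetical) expensive counting path actually ran. */
    static int expensiveCalls = 0;

    /** Hypothetical analogue of numDocs(Query, DocSet): count the docs in the domain that match. */
    static int numDocs(IntPredicate query, Set<Integer> domain) {
        if (domain.isEmpty()) {
            return 0; // leaf-level guard: an empty domain cannot match anything
        }
        expensiveCalls++;
        int count = 0;
        for (int doc : domain) {
            if (query.test(doc)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Empty domain: answered by the guard, the expensive path is never entered.
        System.out.println(numDocs(d -> true, Set.of()));
        // Non-empty domain: falls through to the normal counting path.
        System.out.println(numDocs(d -> d % 2 == 0, Set.of(1, 2, 3, 4)));
        System.out.println("expensive calls: " + expensiveCalls);
    }
}
```

With the guard in one place, callers need no empty-domain checks of their own unless they also want to skip their own setup work.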

By way of contrast (wrt complexity/benefit tradeoff), at the leaf level it 
looks like {{SolrIndexSearcher.getDocSet(Query, DocSet)}} could be optimized in 
a way analogous to what SOLR-10727 does for {{SolrIndexSearcher.numDocs(Query, 
DocSet)}}, avoiding filterCache pollution ...
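
A rough sketch of what that could look like, again with a toy model rather than Solr code. {{DocSetCacheSketch}}, the string cache key, and the {{HashMap}} standing in for the filterCache are assumptions made for illustration; the sketch also assumes the per-query docset is cached before it is intersected with the filter:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;

/** Toy model of a searcher with a per-query docset cache; NOT Solr's SolrIndexSearcher. */
final class DocSetCacheSketch {
    private final Set<Integer> allDocs;
    /** Hypothetical stand-in for the filterCache, keyed by a query string. */
    private final Map<String, Set<Integer>> filterCache = new HashMap<>();

    DocSetCacheSketch(Set<Integer> allDocs) {
        this.allDocs = allDocs;
    }

    /** Hypothetical analogue of getDocSet(Query, DocSet): docs matching the query, restricted to the filter. */
    Set<Integer> getDocSet(String queryKey, IntPredicate query, Set<Integer> filter) {
        if (filter.isEmpty()) {
            // Nothing can match, so return early and leave the cache untouched.
            return Set.of();
        }
        // Normal path: compute (or reuse) the query's full docset, which adds a cache entry.
        Set<Integer> matches = filterCache.computeIfAbsent(queryKey, k -> {
            Set<Integer> computed = new TreeSet<>();
            for (int doc : allDocs) {
                if (query.test(doc)) {
                    computed.add(doc);
                }
            }
            return computed;
        });
        // Intersect a copy so the cached set is never mutated.
        Set<Integer> result = new TreeSet<>(matches);
        result.retainAll(filter);
        return result;
    }

    int cacheSize() {
        return filterCache.size();
    }
}
```

In this model, a call with an empty filter leaves {{cacheSize()}} unchanged, whereas without the early return it would insert an entry for a query whose result is thrown away.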

> potential optimizations in callers of SolrIndexSearcher.numDocs when docset 
> is empty
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-10732
>                 URL: https://issues.apache.org/jira/browse/SOLR-10732
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-10732.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> spin off of SOLR-10727...
> {quote}
> ...why not (also) optimize it slightly higher up and completely avoid the 
> construction of the Query objects? (and in some cases: additional overhead)
> for example: the first usage of {{SolrIndexSearcher.numDocs(Query,DocSet)}} I 
> found was {{RangeFacetProcessor.rangeCount(DocSet subset,...)}} ... if the 
> first line of that method was {{if (0 == subset.size()) return 0}} then we'd 
> not only optimize away the SolrIndexSearcher hit, but also fetching the 
> SchemaField & building the range query (not to mention the much more 
> expensive {{getGroupedFacetQueryCount}} in the grouping case)
> At a glance, most other callers of 
> {{SolrIndexSearcher.numDocs(Query,DocSet)}} could be trivially optimized this 
> way as well -- at a minimum to eliminate Query parsing/construction.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

