Re: ScorerSupplier and cost() avoidance

Adrien Grand Thu, 07 Jun 2018 13:53:22 -0700

Got it. I thought you were talking about the class method, not the static
function.


Le jeu. 7 juin 2018 à 22:34, David Smiley <david.w.smi...@gmail.com> a
écrit :

> I'll have to look further with the team on what these queries look like in
> terms of relative cheapness and throughput.
>
> RE MinShouldMatchSumScorer.cost() (static method called by
> Boolean2ScorerSupplier.computeCost) --  it's pretty easy to see that this
> is called with minShouldMatch==0 (or 1).  Set a conditional breakpoint in
> your IDE and run TestBooleanMinShouldMatch.testRandomQueries
>
> On Thu, Jun 7, 2018 at 3:43 PM Adrien Grand <jpou...@gmail.com> wrote:
>
>> I suspect this could only show up as a bottleneck if they run very cheap
>> queries (low cost) at a very high throughput? Is it the case? I've seen a
>> couple workloads like that in the past and profilers suggested that things
>> that usually do not matter were bottleneck like creating scorers or
>> deciding whether a query should be cached. But trying to fix it didn't
>> really help as there are lots of things that we need to do to decide how to
>> run a query that run in O(num_segments * num_clauses)
>>
>> I'm confused why MinShouldMatchSumScorer would be used when
>> minShouldMatch is 0 or 1. DisjunctionSumScorer should be used instead for
>> such values of minShouldMatch?
>>
>> Le jeu. 7 juin 2018 à 19:38, Michael McCandless <
>> luc...@mikemccandless.com> a écrit :
>>
>>> Doesn't BQ rewrite itself if it has only one clause?
>>>
>>> Or maybe if there were more than one clause, and then all but one of
>>> them had null scorers (on SHOULD clauses) you could wind up in that state?
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Thu, Jun 7, 2018 at 1:21 PM, David Smiley <david.w.smi...@gmail.com>
>>> wrote:
>>>
>>>> I'm working with some folks who did some profiling and noticed that
>>>> ScorerSupplier.cost() can be expensive (as the javadocs say). cost() says
>>>> only to call it if necessary. Unfortunately, a BooleanQuery is going to
>>>> call cost() (via BooleanWeight.scorer() even if ultimately no Query in the
>>>> tree cares what the cost is.  I'm not sure if that's a perf bug or not;
>>>> it's hard to tell.
>>>>
>>>> The expensive part of cost() for Boolean2ScorerSupplier is over in
>>>> MinShouldMatchSumScorer.cost which creates a PriorityQueue every time, even
>>>> if trivially numScorers == 1.  That's a weird case... why do we even need a
>>>> Boolean2ScorerSupplier around one clause; couldn't that clause be returned
>>>> from the outer weight, BooleanWeight.scorerSupplier() close to the end as
>>>> an optimization?  I could file an issue.
>>>>
>>>> ~ David
>>>> --
>>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>>> http://www.solrenterprisesearchserver.com
>>>>
>>>
>>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>

Re: ScorerSupplier and cost() avoidance

Reply via email to