Re: Boolean Scorer

Adrien Grand Mon, 14 Jun 2021 02:35:18 -0700

Hello Arihant,

The Scorer for disjunctions uses a heap data structure that needs to be
reordered upon every hit. While reordering heaps is efficient as it runs in
logarithmic time, the fact that it needs to run on every document might add
non-negligible overhead. BooleanScorer tries to work around this overhead
by scoring large windows of documents in a more TAAT (term-at-a-time)
fashion so that Lucene only needs to reorder the heap every 2048 doc IDs
(the hardcoded window size).
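
To make the difference concrete, here is a rough sketch of the two strategies.
It is not Lucene's actual code: the class, the method names and the use of
plain sorted int[] arrays as postings lists are all made up for illustration,
and the windowed variant drops the heap entirely for brevity. Only the 2048
window size comes from the explanation above. It assumes every doc ID is
smaller than maxDoc.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Illustrative only: hypothetical names, not Lucene's BooleanScorer.
final class WindowedDisjunctionSketch {
    static final int WINDOW_SIZE = 2048; // window size mentioned above

    /** Hypothetical cursor over one clause's sorted doc IDs. */
    static final class Postings {
        final int[] docs;
        final float weight; // score contributed by each match of this clause
        int pos = 0;
        Postings(int[] docs, float weight) { this.docs = docs; this.weight = weight; }
        int doc() { return pos < docs.length ? docs[pos] : Integer.MAX_VALUE; }
    }

    /** Doc-at-a-time: the heap is reordered for every posting consumed. */
    static float[] scoreWithHeap(int[][] clauses, float[] weights, int maxDoc) {
        float[] scores = new float[maxDoc];
        PriorityQueue<Postings> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int i = 0; i < clauses.length; i++) {
            if (clauses[i].length > 0) heap.add(new Postings(clauses[i], weights[i]));
        }
        while (!heap.isEmpty()) {
            Postings top = heap.poll();                   // O(log n) per posting
            scores[top.doc()] += top.weight;
            top.pos++;
            if (top.pos < top.docs.length) heap.add(top); // O(log n) again
        }
        return scores;
    }

    /** Windowed: each clause is drained in turn into a window-sized accumulator. */
    static float[] scoreWithWindows(int[][] clauses, float[] weights, int maxDoc) {
        float[] scores = new float[maxDoc];
        float[] bucket = new float[WINDOW_SIZE];
        int[] pos = new int[clauses.length];              // per-clause read position
        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);
            Arrays.fill(bucket, 0f);
            for (int i = 0; i < clauses.length; i++) {    // term-at-a-time inside the window
                while (pos[i] < clauses[i].length && clauses[i][pos[i]] < end) {
                    bucket[clauses[i][pos[i]] - base] += weights[i];
                    pos[i]++;
                }
            }
            System.arraycopy(bucket, 0, scores, base, end - base);
        }
        return scores;
    }
}
```

Both methods should produce the same scores for the same input. The first
pays a heap reordering per posting, while the second only does array
additions inside each window and switches between clauses once per window.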

This paper gives a bit more context:
http://www.savar.se/media/1181/space_optimizations_for_total_ranking.pdf,
see section 4 in particular.

On Sat, Jun 12, 2021 at 5:47 PM Arihant Samar <[email protected]> wrote:

> Hi,
>
> I am new here. I would like to know exactly what optimisation is
> carried out in the “BooleanScorer.java” code that led to a separate class
> for resolving Boolean queries over documents in bulk. I could not find any
> material in the documentation on this either, hence I decided to ask here.
>
>
> Thanking you in advance,
>
> Arihant.

-- 
Adrien
