Great optimization!

I'm dubious that it would be a good contribution to Lucene itself,
however, because what you propose fits cleanly above Lucene.  Even at
the ES/Solr layer (which I know you don't use, but hypothetically
speaking), I'm dubious as well.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Dec 14, 2020 at 2:37 PM Michael Froh <[email protected]> wrote:

> My team at work has a neat feature that we've built on top of Lucene that
> has provided a substantial (20%+) increase in maximum qps and some
> reduction in query latency.
>
> Basically, we run a training process that looks at historical queries to
> find frequently co-occurring combinations of required clauses, say "+A +B
> +C +D". Then at indexing time, if a document satisfies one of these known
> combinations, we add a new term to the doc, like "opto:ABCD". At query
> time, we can then replace the required clauses with a single TermQuery for
> the "optimized" term.
>
> It adds a little bit of extra work at indexing time and requires the
> offline training step, but we've found that it yields a significant boost
> at query time.
>
> We're interested in open-sourcing this feature. Is it something worth
> adding to Lucene? Since it doesn't require any core changes, maybe as a
> module?
>
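
For readers following the thread, the scheme described in the quoted message can be sketched in plain Java, without any Lucene dependency. This is only an illustration under stated assumptions, not the original team's implementation: required clauses are modeled as plain term strings, a document "satisfies" a combination when it contains every term in it, and the class and method names (ComboTerms, extraTermsFor, rewriteRequired) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: models the "opto:" term idea
// with plain strings standing in for required query clauses.
final class ComboTerms {
    // Combinations found by the offline training step (assumed given).
    private final List<Set<String>> combos = new ArrayList<>();

    ComboTerms(List<Set<String>> trainedCombos) {
        for (Set<String> combo : trainedCombos) {
            combos.add(new TreeSet<>(combo)); // sorted copy for stable naming
        }
    }

    // Synthetic term name: "opto:" plus the sorted member terms, as in the
    // "opto:ABCD" example. Concatenating without a separator is only safe
    // for toy terms like these; a real scheme would need unambiguous names.
    static String optoTerm(Set<String> combo) {
        return "opto:" + String.join("", new TreeSet<>(combo));
    }

    // Index time: extra terms to add to a document holding docTerms.
    Set<String> extraTermsFor(Set<String> docTerms) {
        Set<String> extra = new LinkedHashSet<>();
        for (Set<String> combo : combos) {
            if (docTerms.containsAll(combo)) {
                extra.add(optoTerm(combo));
            }
        }
        return extra;
    }

    // Query time: swap the first fully matched combination among the
    // required clauses for its single synthetic term.
    Set<String> rewriteRequired(Set<String> required) {
        for (Set<String> combo : combos) {
            if (required.containsAll(combo)) {
                Set<String> rewritten = new LinkedHashSet<>(required);
                rewritten.removeAll(combo);
                rewritten.add(optoTerm(combo));
                return rewritten;
            }
        }
        return required; // no known combination applies
    }
}
```

In an actual Lucene setup, the rewritten clause would presumably become a TermQuery added to a BooleanQuery as a required clause; that wiring is left out here to keep the sketch self-contained.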
