+1. I would suggest that this be an independent project hosted on GitHub (similar projects have seen success that way in the past).
On Tue, 15 Dec 2020, 09:37 David Smiley, <[email protected]> wrote:

> Great optimization!
>
> I'm dubious on it being a good contribution to Lucene itself however,
> because what you propose fits cleanly above Lucene. Even at a ES/Solr
> layer (which I know you don't use, but hypothetically speaking), I'm
> dubious there as well.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Dec 14, 2020 at 2:37 PM Michael Froh <[email protected]> wrote:
>
>> My team at work has a neat feature that we've built on top of Lucene that
>> has provided a substantial (20%+) increase in maximum qps and some
>> reduction in query latency.
>>
>> Basically, we run a training process that looks at historical queries to
>> find frequently co-occurring combinations of required clauses, say "+A +B
>> +C +D". Then at indexing time, if a document satisfies one of these known
>> combinations, we add a new term to the doc, like "opto:ABCD". At query
>> time, we can then replace the required clauses with a single TermQuery for
>> the "optimized" term.
>>
>> It adds a little bit of extra work at indexing time and requires the
>> offline training step, but we've found that it yields a significant boost
>> at query time.
>>
>> We're interested in open-sourcing this feature. Is it something worth
>> adding to Lucene? Since it doesn't require any core changes, maybe as a
>> module?
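
For anyone skimming the thread, here is a rough, library-independent sketch of the three steps Michael describes (training, indexing, query rewriting). It is not his team's implementation and it does not use the Lucene API. Documents and queries are modelled as plain sets of terms, a "required clause" is assumed to be a single term, and every function name (train, opto_term, augment_document, rewrite_query, matches) is made up for illustration.

```python
from collections import Counter
from itertools import combinations


def train(query_log, min_count=2, min_size=2):
    """Offline step: find sets of required terms that co-occur in many queries.

    query_log is an iterable of queries, each a list of required terms.
    Counting every subset is exponential in query length, which is
    acceptable for a toy sketch but not for a real query log.
    """
    counts = Counter()
    for required in query_log:
        terms = sorted(set(required))
        for size in range(min_size, len(terms) + 1):
            for combo in combinations(terms, size):
                counts[frozenset(combo)] += 1
    return {combo for combo, n in counts.items() if n >= min_count}


def opto_term(combo):
    """Name of the synthetic term, mirroring the "opto:ABCD" example.

    Plain concatenation is ambiguous for multi-character terms; a real
    system would need an unambiguous encoding (assumption).
    """
    return "opto:" + "".join(sorted(combo))


def augment_document(doc_terms, combos):
    """Indexing step: add one synthetic term per known combination the doc satisfies."""
    doc = set(doc_terms)
    return doc | {opto_term(c) for c in combos if c <= doc}


def rewrite_query(required, combos):
    """Query step: replace the largest known combination with its synthetic term."""
    required = set(required)
    candidates = [c for c in combos if c <= required]
    if not candidates:
        return required
    best = max(candidates, key=lambda c: (len(c), opto_term(c)))
    return (required - best) | {opto_term(best)}


def matches(doc_terms, required):
    """A document matches when it contains every required term."""
    return set(required) <= set(doc_terms)


query_log = [["A", "B", "C", "D"], ["A", "B", "C", "D"], ["A", "E"]]
combos = train(query_log)
doc = augment_document({"A", "B", "C", "D", "X"}, combos)
rewritten = rewrite_query(["A", "B", "C", "D", "X"], combos)
```

In this toy model the rewritten query checks two terms instead of five and matches exactly the same augmented documents as the original. That is the property the real feature relies on, with the extra term added at indexing time being the cost.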
