In that case, I would be interested to know if this can be merged into Luwak.

On Tue, 15 Dec 2020, 21:50 Adrien Grand, <jpou...@gmail.com> wrote:

> I like this idea. I can think of several users who have a priori knowledge
> of frequently used filters and would appreciate having Lucene take care of
> transparently optimizing the execution of such filters instead of having to
> do it manually.
>
> I'm not sure a separate project is the best option, it makes it more
> challenging to keep up-to-date with releases, more challenging for users to
> find it, etc. I'd rather add this feature to the Lucene repository, as a
> new module or as part of an existing module?
>
>
> On Tue, Dec 15, 2020 at 4:41 PM Michael Sokolov <msoko...@gmail.com>
> wrote:
>
>> I feel like there could be some considerable overlap with features
>> provided by Luwak, which was contributed to Lucene fairly recently,
>> and I think does the query inversion work required for this; maybe
>> more of it already exists here? I don't know if that module handles
>> the query rewriting, or the term indexing you're talking about though.
>>
>> On Mon, Dec 14, 2020 at 11:25 PM Atri Sharma <a...@apache.org> wrote:
>> >
>> > +1
>> >
>> > I would suggest that this be an independent project hosted on Github
>> > (there have been similar projects in the past that have seen success
>> > that way)
>> >
>> > On Tue, 15 Dec 2020, 09:37 David Smiley, <dsmi...@apache.org> wrote:
>> >>
>> >> Great optimization!
>> >>
>> >> I'm dubious on it being a good contribution to Lucene itself however,
>> >> because what you propose fits cleanly above Lucene. Even at a ES/Solr
>> >> layer (which I know you don't use, but hypothetically speaking), I'm
>> >> dubious there as well.
>> >>
>> >> ~ David Smiley
>> >> Apache Lucene/Solr Search Developer
>> >> http://www.linkedin.com/in/davidwsmiley
>> >>
>> >>
>> >> On Mon, Dec 14, 2020 at 2:37 PM Michael Froh <msf...@gmail.com> wrote:
>> >>>
>> >>> My team at work has a neat feature that we've built on top of Lucene
>> >>> that has provided a substantial (20%+) increase in maximum qps and
>> >>> some reduction in query latency.
>> >>>
>> >>> Basically, we run a training process that looks at historical queries
>> >>> to find frequently co-occurring combinations of required clauses, say
>> >>> "+A +B +C +D". Then at indexing time, if a document satisfies one of
>> >>> these known combinations, we add a new term to the doc, like
>> >>> "opto:ABCD". At query time, we can then replace the required clauses
>> >>> with a single TermQuery for the "optimized" term.
>> >>>
>> >>> It adds a little bit of extra work at indexing time and requires the
>> >>> offline training step, but we've found that it yields a significant
>> >>> boost at query time.
>> >>>
>> >>> We're interested in open-sourcing this feature. Is it something worth
>> >>> adding to Lucene? Since it doesn't require any core changes, maybe as
>> >>> a module?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> --
> Adrien
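
For readers skimming the thread, the three steps in Michael Froh's quoted message (offline training, index-time tagging, query-time rewrite) can be sketched roughly as below. This is a library-independent illustration, not the team's code and not Lucene API: it assumes every required clause is a plain term, so "a document satisfies a combination" reduces to a subset check. The names `train`, `opto_term`, `index_terms`, and `rewrite`, and the `min_count`/`min_size` thresholds, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def train(historical_queries, min_count=2, min_size=2):
    """Offline step: find combinations of required clauses that co-occur
    in at least `min_count` historical queries.

    Each query is an iterable of required clause names. Enumerating every
    subset is exponential in query length; it is only meant for a sketch.
    """
    counts = Counter()
    for required in historical_queries:
        clauses = sorted(set(required))
        for size in range(min_size, len(clauses) + 1):
            for combo in combinations(clauses, size):
                counts[frozenset(combo)] += 1
    return {combo for combo, n in counts.items() if n >= min_count}


def opto_term(combo):
    """Build the synthetic term for a combination, e.g. 'opto:ABCD'."""
    return "opto:" + "".join(sorted(combo))


def index_terms(doc_terms, combos):
    """Index-time step: add one synthetic term per known combination
    that the document satisfies (assumed here to mean: contains all terms)."""
    terms = set(doc_terms)
    extra = {opto_term(c) for c in combos if c <= terms}
    return terms | extra


def rewrite(required, combos):
    """Query-time step: replace the largest matching combination of
    required clauses with its single synthetic term."""
    required = set(required)
    matching = [c for c in combos if c <= required]
    if not matching:
        return required
    # Prefer the largest combination; break ties deterministically.
    best = max(matching, key=lambda c: (len(c), sorted(c)))
    return (required - best) | {opto_term(best)}


history = [["A", "B", "C", "D"], ["A", "B", "C", "D", "E"], ["A", "F"]]
combos = train(history, min_count=2, min_size=4)
print(sorted(index_terms({"A", "B", "C", "D", "X"}, combos)))
print(sorted(rewrite(["A", "B", "C", "D", "E"], combos)))
```

In this toy history only the combination A, B, C, D appears often enough to be kept, so a query requiring A through E is rewritten to require the synthetic term plus E. A real implementation would have to evaluate arbitrary clauses against each document at indexing time, which is where the overlap with Luwak's query matching discussed above would come in.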