On Saturday 30 August 2008 18:22:50, Matt Ronge wrote:
> On Aug 30, 2008, at 6:13 AM, Paul Elschot wrote:
> > On Saturday 30 August 2008 03:34:01, Matt Ronge wrote:
> >> Hi all,
> >>
> >> I am working on implementing a new Query, Weight and Scorer that
> >> is expensive to run. I'd like to limit the number of documents I
> >> run this query on by first building a candidate set of documents
> >> with a boolean query. Once I have that candidate set, I was hoping
> >> I could build a filter off of it, and issue that along with my
> >> expensive query. However, after reading the code I see that
> >> filtering is done during the search, and not beforehand.
> >
> > Correct. I suppose you mean the filtering code in IndexSearcher?
>
> Yes, that's exactly what I mean.
>
> >> So my initial boolean query
> >> won't help in limiting the number of documents scored by my
> >> expensive query.
> >
> > The trick of filtering is the use of skipTo() on both the filter
> > and the scorer to skip superfluous work as much as possible.
> > So when you make your scorer implement skipTo() efficiently,
> > filtering it should reduce the amount of scoring done.
> >
> > Implementing skipTo() efficiently is normally done by using
> > TermScorer.skipTo() on the leaves of a scorer structure. So,
> > in case you implement your own TermScorer, take a serious
> > look at TermScorer.skipTo().
> >
> > Normally, score value computations are not the bottleneck,
> > but accessing the index is, and this is where skipTo() does
> > the real work. At the moment avoiding score value computations
> > is a nice extra.
>
> I was not aware of this. Where can I find the code that uses the
> filter to determine what values to feed to skipTo (I'm trying to get
> a better understanding of the Lucene source)?

It's the same code in IndexSearcher. ConjunctionScorer.skipTo()
does much the same thing for any number of scorers.

>
> >> Or should I just implement something myself in a custom scorer?
> >
> > In case you have a better way than skipTo(), or something
> > to improve on this issue to allow a Filter as a clause to
> > BooleanQuery: https://issues.apache.org/jira/browse/LUCENE-1345
> > let us know.
>
> Thanks, if the skipTo approach doesn't work, I'll take a look at
> this.

For the moment, Andrzej's suggestion to use FilteredQuery as a
clause could well be good enough. Btw, FilteredQuery also contains
a filtering scorer under the hood; you could take a look there, too.

Regards,
Paul Elschot
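
The skipTo() leapfrogging between a filter and a scorer, as described
in this thread, can be sketched in plain Java. This is only a minimal
illustration under stated assumptions, not the actual IndexSearcher or
ConjunctionScorer code: the class SortedDocs and the method intersect()
are hypothetical stand-ins for a filter's doc-id iterator and a scorer,
and the linear scan inside skipTo() stands in for whatever skip data a
real index would use.

```java
import java.util.ArrayList;
import java.util.List;

class LeapfrogSketch {

    /** Hypothetical doc-id iterator over a sorted array; not a Lucene class. */
    static class SortedDocs {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs; // strictly increasing doc ids
        private int pos = -1;     // index of the current doc, -1 before the first call

        SortedDocs(int... docs) {
            this.docs = docs;
        }

        /**
         * Moves past the current doc to the first doc id >= target and
         * returns it, or NO_MORE_DOCS when the iterator is exhausted.
         * A real index would jump using skip data; this scan is linear.
         */
        int skipTo(int target) {
            pos++;
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    /** Returns the doc ids present in both iterators, leapfrogging with skipTo(). */
    static List<Integer> intersect(SortedDocs filter, SortedDocs scorer) {
        List<Integer> hits = new ArrayList<>();
        int f = filter.skipTo(0);
        int s = scorer.skipTo(0);
        while (f != SortedDocs.NO_MORE_DOCS && s != SortedDocs.NO_MORE_DOCS) {
            if (f == s) {
                // Only here would an expensive score be computed.
                hits.add(f);
                f = filter.skipTo(f + 1);
                s = scorer.skipTo(s + 1);
            } else if (f < s) {
                f = filter.skipTo(s); // filter catches up to the scorer
            } else {
                s = scorer.skipTo(f); // scorer skips docs the filter rejects
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedDocs filter = new SortedDocs(3, 50, 900);
        SortedDocs scorer = new SortedDocs(1, 3, 7, 50, 51, 52, 900, 901);
        System.out.println(intersect(filter, scorer)); // prints [3, 50, 900]
    }
}
```

In this sketch the scorer is never asked about documents 7, 51 or 52
individually; it is told to skip straight to the filter's next
candidate, which is where the saving comes from when skipTo() is cheap.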