On Fri, Mar 25, 2011 at 12:02 PM, Yonik Seeley
<[email protected]> wrote:

> Currently, supplying a filter to IndexSearcher.search() assumes that
> it's cheaper to run than the main query.

Wait, where do we assume that?

Today, the filter & scorer "leap frog" each other.  Ie we alternate
between them, calling .advance() on each, until they finally land on
the same docID.

...at which point, we collect that doc, and then call .nextDoc() on
the filter and .advance() on the scorer.
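
To make the leap-frogging concrete, here is a rough, self-contained
sketch.  It is not the actual IndexSearcher code: SortedDocs is a
hypothetical stand-in for a doc-ID iterator backed by a sorted int
array, and intersect() is a made-up helper name.  Only the
nextDoc()/advance() pattern mirrors what is described above.

```java
import java.util.ArrayList;
import java.util.List;

public class LeapFrogSketch {
    // Sentinel meaning "iterator exhausted" (an assumption of this sketch).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical doc-ID iterator over a sorted int array; not a Lucene class. */
    static final class SortedDocs {
        private final int[] docs;
        private int pos = -1;

        SortedDocs(int[] docs) {
            this.docs = docs;
        }

        /** Moves to the next doc, or returns NO_MORE_DOCS when exhausted. */
        int nextDoc() {
            pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        /** Moves forward to the first doc >= target (linear scan for simplicity). */
        int advance(int target) {
            int doc;
            do {
                doc = nextDoc();
            } while (doc < target);
            return doc;
        }
    }

    /** Returns the doc IDs present in both sorted inputs. */
    static List<Integer> intersect(int[] filterDocs, int[] scorerDocs) {
        SortedDocs filter = new SortedDocs(filterDocs);
        SortedDocs scorer = new SortedDocs(scorerDocs);
        List<Integer> hits = new ArrayList<>();

        int filterDoc = filter.nextDoc();
        int scorerDoc = scorer.advance(filterDoc);
        while (filterDoc != NO_MORE_DOCS && scorerDoc != NO_MORE_DOCS) {
            if (filterDoc == scorerDoc) {
                // Both landed on the same docID: collect it, then move on.
                hits.add(scorerDoc);
                filterDoc = filter.nextDoc();
                scorerDoc = scorer.advance(filterDoc);
            } else if (scorerDoc > filterDoc) {
                // Scorer is ahead: the filter leaps up to it.
                filterDoc = filter.advance(scorerDoc);
            } else {
                // Filter is ahead: the scorer leaps up to it.
                scorerDoc = scorer.advance(filterDoc);
            }
        }
        return hits;
    }
}
```

Note that neither side is treated as the cheaper one here; whichever
is behind simply gets advanced.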

We still have to do LUCENE-1536, which would certainly assume testing
the filter (random access) is cheaper than query scoring.

> We should add the following method to IndexSearcher:
>  public void search(Query query, Filter beforeFilter, Filter
> afterFilter, Collector results)
>
> beforeFilter would be skipped first when possible, afterFilter would
> be skipped last when possible, and the scorer would be in the middle.

Hmm... but this foists the responsibility of optimizing the
order-of-execution onto the caller?

Also, why stop at 2 filters?  Ie I may have 3 filters plus a query to
AND, and I want to control their order.

If we did this, it seems like we should put it into BooleanQuery?  Or
maybe a new, expert OrderedAndQuery or something.

But, better, this'd be an impl detail inside Lucene.  Ie we can somehow
ask each Filter/Query being ANDed what its "cost" is for nextDoc and
advance, and also measure how "restrictive" each is (BQ has a
heuristic for this today), and optimize query execution accordingly.
Obviously that's a long way off, so maybe an expert API is a
good stopgap for today.
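
As a very rough illustration of what such internal ordering could
look like (purely a sketch; nothing like this exists in Lucene as far
as this mail is concerned): the Clause class, its estimatedMatches and
advanceCost fields, and executionOrder() are all hypothetical names.
The idea is simply to lead with the most restrictive clause and break
ties by the cheaper advance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ClauseOrderSketch {
    /** Hypothetical description of one ANDed Filter/Query; not a Lucene class. */
    static final class Clause {
        final String name;
        final long estimatedMatches; // fewer matches = more restrictive
        final double advanceCost;    // assumed relative cost of one advance() call

        Clause(String name, long estimatedMatches, double advanceCost) {
            this.name = name;
            this.estimatedMatches = estimatedMatches;
            this.advanceCost = advanceCost;
        }
    }

    /** Orders clauses most-restrictive first, then by cheaper advance. */
    static List<String> executionOrder(List<Clause> clauses) {
        List<Clause> sorted = new ArrayList<>(clauses);
        sorted.sort(Comparator
                .comparingLong((Clause c) -> c.estimatedMatches)
                .thenComparingDouble(c -> c.advanceCost));

        List<String> names = new ArrayList<>();
        for (Clause c : sorted) {
            names.add(c.name);
        }
        return names;
    }
}
```

Where those estimates would come from is the hard part, and the
sketch deliberately leaves it open.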

What's the use case behind this...?

Mike

http://blog.mikemccandless.com
