[ 
https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated LUCENE-5460:
-------------------------------------

    Attachment: TestSlowQuery.java

See the attached TestSlowQuery.java.

SampleSlowQuery verifies each candidate document by checking a stored field in 
SlowQueryScorer.confirm(int).

The key point is to prohibit advance(), simply because it is inherently inefficient for such a scorer:
{code}
// In SlowQueryScorer:
@Override
public int advance(int target) {
  throw new UnsupportedOperationException(this + " doesn't support advancing");
}
{code}
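
To make the idea concrete, here is a minimal, self-contained sketch of a scorer-like iterator that only supports nextDoc() and confirms each candidate with an expensive check. It does not use Lucene: DocIterator, ArrayDocIterator and SlowDocIterator are hypothetical stand-ins (DocIterator only mirrors the shape of DocIdSetIterator), and the IntPredicate plays the role of SlowQueryScorer.confirm(int). It is not the code from the attachment.

```java
import java.util.function.IntPredicate;

/** Hypothetical, simplified stand-in for Lucene's DocIdSetIterator. */
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int docID();
    int nextDoc();
    int advance(int target);
}

/** Iterates a sorted array of doc ids; plays the role of the core query's scorer. */
final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayDocIterator(int... docs) { this.docs = docs; }

    public int docID() {
        if (idx < 0) return -1;
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    public int nextDoc() {
        if (idx < docs.length) idx++;
        return docID();
    }

    public int advance(int target) {
        int doc;
        do { doc = nextDoc(); } while (doc < target);
        return doc;
    }
}

/** Wraps a core iterator and accepts only docs that pass an expensive check. */
final class SlowDocIterator implements DocIterator {
    private final DocIterator core;
    private final IntPredicate confirm; // assumed stand-in for confirm(int)

    SlowDocIterator(DocIterator core, IntPredicate confirm) {
        this.core = core;
        this.confirm = confirm;
    }

    public int docID() { return core.docID(); }

    public int nextDoc() {
        // Step the core iterator until a candidate passes the expensive check.
        int doc = core.nextDoc();
        while (doc != NO_MORE_DOCS && !confirm.test(doc)) {
            doc = core.nextDoc();
        }
        return doc;
    }

    public int advance(int target) {
        // Deliberately unsupported, as in the snippet above.
        throw new UnsupportedOperationException(this + " doesn't support advancing");
    }
}
```

With this shape, any caller that tries to leap-frog over the slow iterator fails fast, which is why filtering has to be handled separately.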

So far, nothing special. The tricky part is handling filtering. I propose 
making FilteredQuery.rewrite() aware of such 'slow' queries; see 
SlowQuery.rewriteFilteredQuery(IndexReader, FilteredQuery):

FilteredQuery(SlowQuery(coreQuery)) => SlowQuery(FilteredQuery(coreQuery))
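
The rewrite rule above can be sketched as a small tree transformation. The classes below (QueryNode, TermNode, FilteredNode, SlowNode, SlowRewrite) are hypothetical and exist only for illustration; they are not Lucene's Query classes, the filter is reduced to a plain name, and this is not the code in the attachment.

```java
/** Hypothetical, minimal query tree; these are not Lucene's classes. */
interface QueryNode {}

final class TermNode implements QueryNode {
    final String term;
    TermNode(String term) { this.term = term; }
    @Override public String toString() { return term; }
}

final class FilteredNode implements QueryNode {
    final QueryNode query;
    final String filter; // the filter is just a name in this sketch
    FilteredNode(QueryNode query, String filter) {
        this.query = query;
        this.filter = filter;
    }
    @Override public String toString() { return "Filtered(" + query + ", " + filter + ")"; }
}

final class SlowNode implements QueryNode {
    final QueryNode core;
    SlowNode(QueryNode core) { this.core = core; }
    @Override public String toString() { return "Slow(" + core + ")"; }
}

final class SlowRewrite {
    /** Filtered(Slow(core)) becomes Slow(Filtered(core)); anything else is returned unchanged. */
    static QueryNode rewrite(QueryNode q) {
        if (q instanceof FilteredNode) {
            FilteredNode fq = (FilteredNode) q;
            if (fq.query instanceof SlowNode) {
                SlowNode slow = (SlowNode) fq.query;
                // Push the filter below the slow wrapper so that the cheap
                // filtered iteration drives and the slow check runs last.
                return new SlowNode(new FilteredNode(slow.core, fq.filter));
            }
        }
        return q;
    }
}
```

After the swap, the slow confirmation only sees documents that already passed both the core query and the filter, so nothing ever needs to call advance() on the slow part.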

I suppose we could introduce this sort of 'slow' query in Lucene and make 
FilteredQuery.rewrite() aware of them, as well as BooleanQuery.rewrite() (I can 
provide a prototype if you'd like to take a look). 




> Allow driving a query by sparse filters
> ---------------------------------------
>
>                 Key: LUCENE-5460
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5460
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Shai Erera
>         Attachments: TestSlowQuery.java
>
>
> Today if a filter is very sparse we execute the query in sort of a leap-frog 
> manner between the query and filter. If the query is very expensive to 
> compute, and/or also matches only a few docs, calling scorer.advance(doc) just 
> to discover that the doc it landed on isn't accepted by the filter is a waste 
> of time. Since Filter is always the "final ruler", I wonder if we had something 
> like {{boolean DISI.advanceExact(doc)}} we could use it instead, in some 
> cases.
> There are many combinations in which I think we'd want to use/not-use this 
> API, and they depend on: Filter's complexity, Filter.cost(), Scorer.cost(), 
> query complexity (span-near, many clauses) etc.
> I'm opening this issue so we can discuss. DISI.advanceExact(doc) is just a 
> preliminary proposal, to get an API we could experiment with. The default 
> implementation should be fairly easy and straightforward, and we could 
> override it where we can offer a more optimized implementation.
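
For reference, one possible reading of the "default implementation" mentioned in the description is sketched below. This is an assumption, not an agreed API: SketchIterator and SortedArrayIterator are hypothetical classes that only mirror the shape of DocIdSetIterator, and the default simply delegates to advance() and reports whether it landed exactly on the target.

```java
/** Hypothetical, simplified iterator; only mirrors the shape of DocIdSetIterator. */
abstract class SketchIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    abstract int docID();
    abstract int advance(int target);

    /**
     * Assumed default: move to target or beyond if we are still before it,
     * then report whether the iterator sits exactly on target.
     */
    boolean advanceExact(int target) {
        if (docID() < target) {
            advance(target);
        }
        return docID() == target;
    }
}

/** Iterator over a sorted array of doc ids, used to exercise the default. */
final class SortedArrayIterator extends SketchIterator {
    private final int[] docs;
    private int idx = -1;

    SortedArrayIterator(int... docs) { this.docs = docs; }

    int docID() {
        if (idx < 0) return -1;
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    int advance(int target) {
        if (idx < docs.length) idx++;
        while (idx < docs.length && docs[idx] < target) idx++;
        return docID();
    }
}
```

Note that this default saves no work by itself, and after a false result the iterator is already positioned past the target. Any gain would have to come from implementations that override advanceExact with a cheaper membership check.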



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
