[ 
https://issues.apache.org/jira/browse/LUCENE-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704711#action_12704711
 ] 

Michael McCandless commented on LUCENE-1252:
--------------------------------------------

Here's a simple example that might drive this issue forward:

   +"h1n1 flu" +"united states"

Ideally, to score this query, you'd want to first AND all 4 terms
together, and only for docs matching that, consult the positions of
each pair of terms.

But we fail to do this today.

It's like somehow "Weight.scorer()" needs to be able to return a
"cheap" and an "expensive" scorer (which must be AND'd).  I think
PhraseQuery would somehow return cheap/expensive scorers that under
the hood share the same SegmentTermDocs/Positions iterators, such that
after cheap.next() has run, the expensive scorer only needs to "check the
current doc".  So in fact maybe the expensive scorer should not be a
Scorer but some other simple "passes or doesn't" API.

Or maybe it returns, say, a TwoStageScorer, which adds a
"reallyPasses()" (needs a better name) method to the otherwise normal
Scorer (DISI) API.
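
To make the TwoStageScorer idea a bit more concrete, here is a rough,
self-contained sketch. It is not Lucene code: the TwoStageScorer
interface, its advance()/reallyPasses() methods, ToyPhraseScorer, and
the in-memory postings maps are all hypothetical names invented for
illustration. The conjunction first leapfrogs on the cheap "all terms
present" iterators and only consults positions once both phrases agree
on a doc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class TwoStageSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical API: cheap doc iteration plus an expensive per-doc check. */
    interface TwoStageScorer {
        int advance(int target);   // cheap: first doc >= target containing all terms
        boolean reallyPasses();    // expensive: verify positions on the current doc
    }

    /** Toy phrase scorer; each term maps docId -> sorted positions. */
    static final class ToyPhraseScorer implements TwoStageScorer {
        private final List<SortedMap<Integer, int[]>> terms;
        private int doc = -1;
        int positionChecks = 0;    // counts how often positions were consulted

        ToyPhraseScorer(List<SortedMap<Integer, int[]>> terms) { this.terms = terms; }

        @Override public int advance(int target) {
            for (int d : terms.get(0).tailMap(target).keySet()) {
                boolean all = true;
                for (SortedMap<Integer, int[]> t : terms) {
                    if (!t.containsKey(d)) { all = false; break; }
                }
                if (all) { doc = d; return doc; }
            }
            doc = NO_MORE_DOCS;
            return doc;
        }

        /** Only valid while positioned on a real doc returned by advance(). */
        @Override public boolean reallyPasses() {
            positionChecks++;
            for (int start : terms.get(0).get(doc)) {
                boolean match = true;
                for (int i = 1; i < terms.size(); i++) {
                    if (Arrays.binarySearch(terms.get(i).get(doc), start + i) < 0) {
                        match = false;
                        break;
                    }
                }
                if (match) return true;
            }
            return false;
        }
    }

    /** AND of two scorers: agree on a doc cheaply, then run the expensive checks. */
    static List<Integer> search(TwoStageScorer a, TwoStageScorer b) {
        List<Integer> hits = new ArrayList<>();
        int da = a.advance(0), db = b.advance(0);
        while (da != NO_MORE_DOCS && db != NO_MORE_DOCS) {
            if (da < db) da = a.advance(db);
            else if (db < da) db = b.advance(da);
            else {
                if (a.reallyPasses() && b.reallyPasses()) hits.add(da);
                da = a.advance(da + 1);
            }
        }
        return hits;
    }

    /** Builds one term's postings; each argument is {docId, pos, pos, ...}. */
    static SortedMap<Integer, int[]> term(int[]... docAndPositions) {
        SortedMap<Integer, int[]> m = new TreeMap<>();
        for (int[] e : docAndPositions) m.put(e[0], Arrays.copyOfRange(e, 1, e.length));
        return m;
    }

    static List<Integer> demo() {
        // doc 2 lacks "united"; doc 3 has all four terms but "h1n1 flu" is not adjacent.
        ToyPhraseScorer h1n1Flu = new ToyPhraseScorer(Arrays.asList(
            term(new int[]{1, 0}, new int[]{2, 0}, new int[]{3, 0}, new int[]{4, 2}),
            term(new int[]{1, 1}, new int[]{2, 1}, new int[]{3, 4}, new int[]{4, 3})));
        ToyPhraseScorer unitedStates = new ToyPhraseScorer(Arrays.asList(
            term(new int[]{1, 5}, new int[]{3, 1}, new int[]{4, 0}),
            term(new int[]{1, 6}, new int[]{2, 3}, new int[]{3, 2}, new int[]{4, 1})));
        return search(h1n1Flu, unitedStates);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 4]
    }
}
```

In this toy data, doc 2 is rejected without ever touching positions,
because the cheap stage already shows that "united" is missing.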

Or something else....

> Avoid using positions when not all required terms are present
> -------------------------------------------------------------
>
>                 Key: LUCENE-1252
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1252
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: Search
>            Reporter: Paul Elschot
>            Priority: Minor
>
> In the Scorers of queries with (lots of) Phrases and/or (nested) Spans, 
> currently next() and skipTo() will use position information even when other 
> parts of the query cannot match because some required terms are not present.
> This could be avoided by adding some methods to Scorer that relax the 
> postcondition of next() and skipTo() to something like "all required terms 
> are present, but no position info was checked yet", and implementing these 
> methods for Scorers that do conjunctions: BooleanScorer, PhraseScorer, and 
> SpanScorer/NearSpans.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

