[jira] [Commented] (LUCENE-7055) Better execution path for costly queries

Jim Ferenczi (JIRA) Mon, 26 Dec 2016 02:46:24 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15778113#comment-15778113
 ]


Jim Ferenczi commented on LUCENE-7055:
--------------------------------------

{quote}
I think this problem was solved with the two-phase iteration API: if you put a 
DocValuesNumbersQuery in a conjunction, ConjunctionScorer will make sure to use 
the two-phase iteration API on the DocValuesNumbersQuery, so it will never make 
it search for the next matching doc.
{quote}

Thanks for the explanation. I had not noticed that RandomAccessWeight was 
meant to do that.

{quote}
I am fine either way. I started with your idea but later switched to a boolean 
since I thought it would be easier to test and would open this API to a couple 
more use-cases in addition to conjunctions, in particular facets on filters 
(since filters are consumed in a random-access fashion in that case) and 
disjunctions (MUST_NOT clauses).
{quote}

I agree. I was not sure about using DocValuesNumbersQuery when its cost is 
high and the conjunction with another clause is sparse, but as you mentioned, 
the two-phase iteration API should handle this case efficiently. So +1 to 
keeping the boolean if it simplifies the logic.
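
To make the conjunction case concrete, here is a minimal sketch of the idea 
behind two-phase iteration. It does not use Lucene's classes: TwoPhaseSketch, 
intersect and costlyChecks are hypothetical names, and the "value is doc % 10" 
rule is an invented stand-in for a per-document doc values lookup. The sparse 
clause leads the iteration, and the costly clause is only asked to confirm 
each candidate, so it never has to search for its own next matching doc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class TwoPhaseSketch {

    /** Counts how often the costly per-document check ran (illustration only). */
    static int costlyChecks = 0;

    /**
     * Intersects a sparse, sorted list of candidate doc IDs (the cheap lead
     * clause) with a costly clause that can only answer "does doc N match?".
     * The costly clause is consulted once per candidate and never iterated.
     */
    static List<Integer> intersect(int[] leadDocs, IntPredicate costlyMatches) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : leadDocs) {
            costlyChecks++;
            if (costlyMatches.test(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical selective clause: four candidates in a large index.
        int[] leadDocs = {7, 42, 4096, 999_999};

        // Hypothetical stand-in for a doc values check: pretend each doc's
        // value is doc % 10 and the query accepts the values 2 and 6.
        IntPredicate costly = doc -> doc % 10 == 2 || doc % 10 == 6;

        List<Integer> hits = intersect(leadDocs, costly);
        System.out.println(hits + " after " + costlyChecks + " costly checks");
    }
}
```

In this sketch the number of costly checks equals the number of candidates 
from the lead clause, regardless of how many documents the index holds. That 
is why a high-cost doc values clause should stay cheap when the other side of 
the conjunction is sparse.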


> Better execution path for costly queries
> ----------------------------------------
>
>                 Key: LUCENE-7055
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7055
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>         Attachments: LUCENE-7055.patch
>
>
> In Lucene 5.0, we improved the execution path for queries that run costly 
> operations on a per-document basis, like phrase queries or doc values 
> queries. But we have another class of costly queries, that return fine 
> iterators, but these iterators are very expensive to build. This is typically 
> the case for queries that leverage DocIdSetBuilder, like TermsQuery, 
> multi-term queries or the new point queries. Intersecting such queries with a 
> selective query is very inefficient since these queries build a doc id set of 
> matching documents for the entire index.
> Is there something we could do to improve the execution path for these 
> queries?
> One idea that comes to mind is that most of these queries could also run on 
> doc values, so maybe we could come up with something that would help decide 
> how to run a query based on other parts of the query? (Just thinking out 
> loud, other ideas are very welcome)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
