[jira] [Commented] (LUCENE-7641) Speed up point ranges that match most documents

Adrien Grand (JIRA) Wed, 18 Jan 2017 00:40:15 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827630#comment-15827630
 ]


Adrien Grand commented on LUCENE-7641:
--------------------------------------

I guess I wanted to stay on the safe side since point count estimates tend to 
be too high. Maybe we should make the formula more accurate instead of 
checking the inverse cost. For instance, maybe we should count 
{{maxPointsInLeafNode/2}} when the relation is {{CELL_CROSSES_QUERY}} on a leaf 
cell. Maybe we should also make {{BKDReader}} record the maximum number of 
points that have actually been put in a leaf rather than the configuration 
parameter that was passed to {{BKDWriter}}, since the latter can be up to 2x 
the actual number of points in leaf nodes in the N-dims case?
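
To make the first suggestion concrete, here is a rough sketch of the counting 
rule, not the actual {{BKDReader}} code. It assumes a flat list of 1D leaf cells 
instead of a real BKD tree; the {{Leaf}} class, the {{compare}} and {{estimate}} 
methods and the sample numbers are all made up for illustration, and the 
{{Relation}} enum only borrows the constant names of {{PointValues.Relation}}.

```java
import java.util.Arrays;
import java.util.List;

final class LeafEstimateSketch {
    // Same constant names as Lucene's PointValues.Relation, redeclared here
    // so that the sketch has no dependencies.
    enum Relation { CELL_INSIDE_QUERY, CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY }

    // Hypothetical 1D leaf cell with inclusive bounds.
    static final class Leaf {
        final long min;
        final long max;
        Leaf(long min, long max) { this.min = min; this.max = max; }
    }

    // Relation between a leaf cell and the inclusive range [queryMin, queryMax].
    static Relation compare(Leaf leaf, long queryMin, long queryMax) {
        if (leaf.max < queryMin || leaf.min > queryMax) {
            return Relation.CELL_OUTSIDE_QUERY;
        }
        if (leaf.min >= queryMin && leaf.max <= queryMax) {
            return Relation.CELL_INSIDE_QUERY;
        }
        return Relation.CELL_CROSSES_QUERY;
    }

    // Full leaves count as maxPointsInLeaf, crossing leaves as half of that
    // (rounded up), and leaves outside the range count as zero.
    static long estimate(List<Leaf> leaves, long queryMin, long queryMax, int maxPointsInLeaf) {
        long total = 0;
        for (Leaf leaf : leaves) {
            switch (compare(leaf, queryMin, queryMax)) {
                case CELL_INSIDE_QUERY:
                    total += maxPointsInLeaf;
                    break;
                case CELL_CROSSES_QUERY:
                    total += (maxPointsInLeaf + 1) / 2;
                    break;
                default:
                    break;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Leaf> leaves = Arrays.asList(
            new Leaf(0, 9), new Leaf(10, 19), new Leaf(20, 29), new Leaf(30, 39));
        // Two crossing leaves (256 each) plus one full leaf (512): prints 1024
        System.out.println(estimate(leaves, 5, 25, 512));
    }
}
```

The second suggestion would amount to passing the largest leaf size actually 
observed as {{maxPointsInLeaf}} rather than the configured maximum.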

> Speed up point ranges that match most documents
> -----------------------------------------------
>
>                 Key: LUCENE-7641
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7641
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7461.patch
>
>
> If a point range matches most documents and every document has exactly one 
> value, then we could make things faster by computing the set of documents 
> that do NOT match the range instead.
> Until recently there was no way to tell whether a range query matches most 
> documents, but we can now use the new {{PointValues.estimatePointCount}} API 
> to do that: we could just check whether the cost of the inverse visitor is 
> lower than the cost of the regular range visitor.
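
The idea in the description can be sketched roughly as follows. This is only an 
illustration under simplifying assumptions, not the attached patch: documents 
are modelled as a {{long[]}} with exactly one value per doc, the two costs are 
passed in as plain numbers standing in for the estimates of the regular and 
inverse visitors, and the names {{useInverse}} and {{match}} are hypothetical.

```java
import java.util.BitSet;

final class InverseRangeSketch {
    // Prefer the inverse strategy only when every document has exactly one
    // value (otherwise "not outside the range" does not imply "matches") and
    // the estimated cost of the inverse visitor is lower.
    static boolean useInverse(long cost, long inverseCost, boolean everyDocHasOneValue) {
        return everyDocHasOneValue && inverseCost < cost;
    }

    // values[doc] is the single value of document doc; [min, max] is inclusive.
    // The plain loops stand in for a tree traversal that would only visit the
    // cells relevant to the chosen strategy.
    static BitSet match(long[] values, long min, long max, boolean inverse) {
        BitSet result = new BitSet(values.length);
        if (inverse) {
            // Start from all documents and clear the ones outside the range.
            result.set(0, values.length);
            for (int doc = 0; doc < values.length; doc++) {
                if (values[doc] < min || values[doc] > max) {
                    result.clear(doc);
                }
            }
        } else {
            // Regular strategy: set the documents inside the range.
            for (int doc = 0; doc < values.length; doc++) {
                if (values[doc] >= min && values[doc] <= max) {
                    result.set(doc);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] values = {1, 5, 7, 9, 3};
        // Made-up estimates: 3 points inside the range, 2 outside.
        boolean inverse = useInverse(3, 2, true);
        // Both strategies produce the same set: prints {1, 2, 4}
        System.out.println(match(values, 2, 8, inverse));
    }
}
```

Both strategies return the same set of documents; the inverse one only pays off 
when the part of the index outside the range is the smaller one.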


