[ 
https://issues.apache.org/jira/browse/LUCENE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866868#comment-16866868
 ] 

Ignacio Vera commented on LUCENE-8867:
--------------------------------------

{quote}
Right, this is what I had in mind when I said this is only a problem if you 
have data dimensions. Because if you don't, then you could call 
IntersectVisitor.compare(A, A) as a way to know whether value A matches, and we 
wouldn't need any new API?
{quote}

True, that would not work when you have data dimensions. In addition, 
IntersectVisitor.compare(A, A) is intended to compare the query with a range, 
which is normally more expensive than a comparison with a single point, so it 
would defeat the purpose of the optimisation.
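
To make the cost argument concrete, here is a minimal sketch for a 
one-dimensional range query. CellRelation, RangeVisitorSketch, 
matchesViaCompare and matchesDirect are hypothetical stand-ins written for 
this illustration; they only mirror the shape of the Lucene types and are not 
the actual implementation.

```java
import java.util.Arrays;

// Hypothetical stand-in mirroring the shape of PointValues.Relation.
enum CellRelation { CELL_INSIDE_QUERY, CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY }

// Hypothetical 1D range visitor; lower and upper are inclusive bounds of equal length.
final class RangeVisitorSketch {
    private final byte[] lower;
    private final byte[] upper;

    RangeVisitorSketch(byte[] lower, byte[] upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Range-vs-range check: up to four unsigned byte-array comparisons.
    CellRelation compare(byte[] minPackedValue, byte[] maxPackedValue) {
        if (Arrays.compareUnsigned(maxPackedValue, lower) < 0
                || Arrays.compareUnsigned(minPackedValue, upper) > 0) {
            return CellRelation.CELL_OUTSIDE_QUERY;
        }
        if (Arrays.compareUnsigned(minPackedValue, lower) >= 0
                && Arrays.compareUnsigned(maxPackedValue, upper) <= 0) {
            return CellRelation.CELL_INSIDE_QUERY;
        }
        return CellRelation.CELL_CROSSES_QUERY;
    }

    // The compare(A, A) trick: treat the point as a degenerate cell [A, A].
    boolean matchesViaCompare(byte[] value) {
        return compare(value, value) == CellRelation.CELL_INSIDE_QUERY;
    }

    // Direct point check: at most two comparisons.
    boolean matchesDirect(byte[] value) {
        return Arrays.compareUnsigned(value, lower) >= 0
                && Arrays.compareUnsigned(value, upper) <= 0;
    }
}
```

Both methods give the same answer here, but the compare(A, A) route goes 
through the range logic and can do twice the comparisons for a matching point. 
It also only looks at the dimensions that compare receives, which is why it 
breaks down once data dimensions are involved.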

I propose to split this change in two, so we can work on the storage 
optimisation first and then think about the right API and make 
IntersectVisitor more efficient in these cases.

> Optimise BKD tree for low cardinality leaves
> --------------------------------------------
>
>                 Key: LUCENE-8867
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8867
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if a leaf of the BKD tree contains only a few distinct values, the 
> leaf is treated the same way as if all values were different. In many cases it 
> can be much more efficient to store the distinct values with their cardinality.
> In addition, in this case the method IntersectVisitor#visit(docId, byte[]) is 
> called n times with the same byte array but a different docID. This issue 
> proposes to add a new method to the interface that accepts an array of docs, 
> so it can be overridden by implementors to gain search performance.
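
As an illustration of the proposal in the description, here is a rough sketch 
of what such a bulk method could look like. BulkVisitorSketch, 
CollectingVisitor and the visit(int[], int, byte[]) signature are assumptions 
made up for this example, not an agreed API: a default implementation falls 
back to the per-document call, and an implementor overrides it to test the 
shared value only once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; only visit(int, byte[]) mirrors the existing method.
interface BulkVisitorSketch {
    void visit(int docID, byte[] packedValue);

    // Assumed bulk variant: the first `count` docs all share packedValue.
    // The default keeps existing implementors working unchanged.
    default void visit(int[] docIDs, int count, byte[] packedValue) {
        for (int i = 0; i < count; i++) {
            visit(docIDs[i], packedValue);
        }
    }
}

// Hypothetical implementor matching on the first byte of the value.
final class CollectingVisitor implements BulkVisitorSketch {
    final List<Integer> hits = new ArrayList<>();
    int valueChecks = 0; // counts how often the value is tested
    private final byte target;

    CollectingVisitor(byte target) {
        this.target = target;
    }

    private boolean matches(byte[] packedValue) {
        valueChecks++;
        return packedValue[0] == target;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
        if (matches(packedValue)) {
            hits.add(docID);
        }
    }

    // Override: test the shared value once, then accept or reject all docs.
    @Override
    public void visit(int[] docIDs, int count, byte[] packedValue) {
        if (!matches(packedValue)) {
            return;
        }
        for (int i = 0; i < count; i++) {
            hits.add(docIDs[i]);
        }
    }
}
```

With the override, a leaf entry holding n documents for one distinct value 
costs a single value check instead of n.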



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

