[ https://issues.apache.org/jira/browse/LUCENE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866830#comment-16866830 ]
Adrien Grand commented on LUCENE-8867:
--------------------------------------
Sorry, reading my comment again I realize it wasn't clear. I see two distinct
changes in the pull request. One adds a new storage strategy for the case where
a leaf has only a handful of unique values; I'm +1 on it. The second takes
advantage of this special case to avoid computing a relation with the same
byte[] over and over again; that solution is a bit more controversial in my
opinion.
bq. another option would be to change more radically the interface and add a
matches(byte[]) method that returns a boolean and then use the visit(docID)
method.
Right, this is what I had in mind when I said this is only a problem if you
have data dimensions. Because if you don't, then you could call
IntersectVisitor.compare(A, A) as a way to know whether value A matches, and we
wouldn't need any new API?
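
To make the compare(A, A) idea concrete, here is a minimal sketch. It does not
use Lucene itself: Relation and IntersectVisitor below are cut-down stand-ins
that mirror the names in PointValues, while RangeVisitor and CompareAsMatch are
hypothetical helpers invented for illustration. It also assumes index
dimensions only and a visitor that never answers CELL_CROSSES_QUERY for a cell
whose min and max are equal.

```java
// Minimal stand-ins mirroring Lucene's PointValues.Relation and
// PointValues.IntersectVisitor#compare; only what this sketch needs.
enum Relation { CELL_INSIDE_QUERY, CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY }

interface IntersectVisitor {
  Relation compare(byte[] minPackedValue, byte[] maxPackedValue);
}

// Hypothetical 1D range visitor: accepts values in [lower, upper],
// compared as unsigned bytes.
final class RangeVisitor implements IntersectVisitor {
  private final byte[] lower;
  private final byte[] upper;

  RangeVisitor(byte[] lower, byte[] upper) {
    this.lower = lower;
    this.upper = upper;
  }

  @Override
  public Relation compare(byte[] min, byte[] max) {
    if (java.util.Arrays.compareUnsigned(max, lower) < 0
        || java.util.Arrays.compareUnsigned(min, upper) > 0) {
      return Relation.CELL_OUTSIDE_QUERY;
    }
    if (java.util.Arrays.compareUnsigned(min, lower) >= 0
        && java.util.Arrays.compareUnsigned(max, upper) <= 0) {
      return Relation.CELL_INSIDE_QUERY;
    }
    return Relation.CELL_CROSSES_QUERY;
  }
}

final class CompareAsMatch {
  // A cell with min == max == value contains exactly that value, so a
  // well-behaved visitor reports it as fully inside or fully outside.
  // Assumption: CELL_CROSSES_QUERY is never returned for such a cell;
  // this sketch treats it as "no match".
  static boolean matches(IntersectVisitor visitor, byte[] value) {
    return visitor.compare(value, value) == Relation.CELL_INSIDE_QUERY;
  }
}
```

With this, a leaf that stores each distinct value once could call
matches(visitor, A) a single time and then use visit(docID) for every document
that carries A.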
> Optimise BKD tree for low cardinality leaves
> --------------------------------------------
>
> Key: LUCENE-8867
> URL: https://issues.apache.org/jira/browse/LUCENE-8867
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently if a leaf on the BKD tree contains only a few values, then the leaf
> is treated the same way as if all values were different. In many cases it can
> be much more efficient to store the distinct values with their cardinality.
> In addition, in this case the method IntersectVisitor#visit(docId, byte[]) is
> called n times with the same byte array but a different docID. This issue
> proposes to add a new method to the interface that accepts an array of docs
> so it can be overridden by implementors to gain search performance.
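
The method proposed in the description above could look roughly like the
sketch below. Everything here is hypothetical: LeafVisitor stands in for the
visit part of IntersectVisitor, the signature visit(int[], int, byte[]) is an
assumption and not an API from the pull request, and the single-byte range
check is only there to count how often the value is tested.

```java
// Hypothetical stand-in for the visit part of an intersect visitor.
interface LeafVisitor {
  void visit(int docID, byte[] packedValue);

  // Hypothetical bulk method: the first `count` entries of docIDs all
  // share packedValue. The default keeps existing implementations
  // working by falling back to one call per document.
  default void visit(int[] docIDs, int count, byte[] packedValue) {
    for (int i = 0; i < count; i++) {
      visit(docIDs[i], packedValue);
    }
  }
}

// Uses the default bulk method, so the value is checked once per doc.
class PerDocRangeVisitor implements LeafVisitor {
  private final int lower;
  private final int upper;
  int valueChecks = 0;
  final java.util.List<Integer> hits = new java.util.ArrayList<>();

  PerDocRangeVisitor(int lower, int upper) {
    this.lower = lower;
    this.upper = upper;
  }

  // Single-byte unsigned range check, counted for illustration.
  boolean matches(byte[] packedValue) {
    valueChecks++;
    int v = packedValue[0] & 0xFF;
    return v >= lower && v <= upper;
  }

  @Override
  public void visit(int docID, byte[] packedValue) {
    if (matches(packedValue)) {
      hits.add(docID);
    }
  }
}

// Overrides the bulk method: one value check for the whole run of docs.
final class BulkRangeVisitor extends PerDocRangeVisitor {
  BulkRangeVisitor(int lower, int upper) {
    super(lower, upper);
  }

  @Override
  public void visit(int[] docIDs, int count, byte[] packedValue) {
    if (matches(packedValue)) {
      for (int i = 0; i < count; i++) {
        hits.add(docIDs[i]);
      }
    }
  }
}
```

Both visitors collect the same documents; the overriding one tests the shared
value once per run instead of once per document.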