[ https://issues.apache.org/jira/browse/LUCENE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866868#comment-16866868 ]
Ignacio Vera commented on LUCENE-8867:
--------------------------------------

{quote}
Right, this is what I had in mind when I said this is only a problem if you have data dimensions. Because if you don't, then you could call IntersectVisitor.compare(A, A) as a way to know whether value A matches, and we wouldn't need any new API?
{quote}

True, that would not work when you have data dimensions. In addition, IntersectVisitor.compare(A, A) is intended to compare the query with a range, which is normally more expensive than a comparison with a single point, so it would defeat the purpose of the optimisation.

I propose to break this change in two, so we can work on the storage optimisation first and then think about the right API to make IntersectVisitor more efficient in these cases.

> Optimise BKD tree for low cardinality leaves
> --------------------------------------------
>
>                 Key: LUCENE-8867
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8867
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if a leaf of the BKD tree contains only a few distinct values, the leaf
> is treated the same way as if all values were different. In many cases it can
> be much more efficient to store the distinct values with their cardinality.
> In addition, in this case the method IntersectVisitor#visit(docId, byte[]) is
> called n times with the same byte array but a different docID. This issue
> proposes to add a new method to the interface that accepts an array of docs,
> so it can be overridden by implementors to gain search performance.
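
To make the two halves of the proposal concrete, here is a minimal, self-contained sketch of the idea described in the issue: store each distinct value of a leaf once together with its cardinality, and hand each distinct value to the visitor once along with the range of docs that share it. Everything in it is an assumption made for illustration. The class LowCardinalityLeafSketch, the Visitor interface, its bulk visit(int[], int, int, byte[]) signature, Run, encode and visitLeaf are hypothetical names, not Lucene's actual API or on-disk format, and the sketch assumes the leaf values arrive sorted so equal values are adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are Lucene's real API.
final class LowCardinalityLeafSketch {

  /** Simplified stand-in for a per-point visitor (hypothetical). */
  interface Visitor {
    void visit(int docID, byte[] packedValue);

    /** Bulk variant (assumed signature): by default, one call per doc. */
    default void visit(int[] docIDs, int from, int to, byte[] packedValue) {
      for (int i = from; i < to; i++) {
        visit(docIDs[i], packedValue);
      }
    }
  }

  /** One distinct value plus the number of docs in the leaf that share it. */
  static final class Run {
    final byte[] value;
    final int cardinality;

    Run(byte[] value, int cardinality) {
      this.value = value;
      this.cardinality = cardinality;
    }
  }

  /** Collapses a leaf whose values are already sorted into (value, cardinality) runs. */
  static List<Run> encode(byte[][] sortedValues) {
    List<Run> runs = new ArrayList<>();
    int start = 0;
    for (int i = 1; i <= sortedValues.length; i++) {
      // Close the current run at the end of the leaf or when the value changes.
      if (i == sortedValues.length || !Arrays.equals(sortedValues[i], sortedValues[start])) {
        runs.add(new Run(sortedValues[start], i - start));
        start = i;
      }
    }
    return runs;
  }

  /** Replays the runs, passing each distinct value to the visitor once. */
  static void visitLeaf(int[] docIDs, List<Run> runs, Visitor visitor) {
    int offset = 0;
    for (Run run : runs) {
      visitor.visit(docIDs, offset, offset + run.cardinality, run.value);
      offset += run.cardinality;
    }
  }

  /** Example implementor: compares each distinct value once instead of once per doc. */
  static final class ExactMatchCollector implements Visitor {
    final byte[] target;
    final List<Integer> hits = new ArrayList<>();
    int comparisons = 0;

    ExactMatchCollector(byte[] target) {
      this.target = target;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      comparisons++;
      if (Arrays.equals(target, packedValue)) {
        hits.add(docID);
      }
    }

    @Override
    public void visit(int[] docIDs, int from, int to, byte[] packedValue) {
      comparisons++; // a single comparison covers the whole run
      if (Arrays.equals(target, packedValue)) {
        for (int i = from; i < to; i++) {
          hits.add(docIDs[i]);
        }
      }
    }
  }

  public static void main(String[] args) {
    int[] docIDs = {3, 7, 9, 12, 15};
    byte[][] values = {{1}, {1}, {1}, {2}, {2}}; // sorted, two distinct values
    ExactMatchCollector collector = new ExactMatchCollector(new byte[] {1});
    visitLeaf(docIDs, encode(values), collector);
    System.out.println("hits=" + collector.hits + " comparisons=" + collector.comparisons);
  }
}
```

In this sketch a visitor that does not override the bulk method keeps its per-doc behaviour through the default implementation, while one that does override it pays for one comparison per distinct value. For the sample leaf in main that is two comparisons for five documents.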