[ https://issues.apache.org/jira/browse/LUCENE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867256#comment-16867256 ]
Michael McCandless commented on LUCENE-8867:
--------------------------------------------

+1 to both of these optimizations. I suspect many use cases will have such duplicate values, and we could see a big reduction in index usage for the leaf blocks, plus a speedup if we do the comparison once per unique value instead of once per value.

> Optimise BKD tree for low cardinality leaves
> --------------------------------------------
>
>                 Key: LUCENE-8867
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8867
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, if a leaf of the BKD tree contains only a few distinct values, the
> leaf is treated the same way as if all values were different. In many cases it
> can be much more efficient to store the distinct values with their cardinality.
> In addition, in this case the method IntersectVisitor#visit(docId, byte[]) is
> called n times with the same byte array but a different docID. This issue
> proposes adding a new method to the interface that accepts an array of docs,
> so that implementors can override it and gain search performance.
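
To make the two ideas in the issue concrete, here is a minimal sketch in plain Java. It does not use Lucene: the names Visitor, Run, encode and RangeVisitor are hypothetical stand-ins invented for illustration, and the bulk method's signature is an assumption, since the issue only says it "accepts an array of docs". The sketch also assumes the leaf's values are already sorted, so that equal values are adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LowCardinalityLeafSketch {

    /** Hypothetical stand-in for IntersectVisitor; NOT the Lucene interface. */
    interface Visitor {
        void visit(int docId, byte[] packedValue);

        /** Assumed bulk variant: by default it falls back to the per-document call. */
        default void visit(int[] docIds, byte[] packedValue) {
            for (int docId : docIds) {
                visit(docId, packedValue);
            }
        }
    }

    /** One distinct value plus the docs sharing it; the cardinality is docIds.length. */
    static final class Run {
        final byte[] value;
        final int[] docIds;

        Run(byte[] value, int[] docIds) {
            this.value = value;
            this.docIds = docIds;
        }
    }

    /** Groups adjacent equal values of a leaf (assumed sorted by value) into runs. */
    static List<Run> encode(int[] docIds, byte[][] values) {
        List<Run> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= values.length; i++) {
            // Close the current run at the end of the leaf or when the value changes.
            if (i == values.length || !Arrays.equals(values[i], values[start])) {
                runs.add(new Run(values[start], Arrays.copyOfRange(docIds, start, i)));
                start = i;
            }
        }
        return runs;
    }

    /** Range filter that overrides the bulk method to compare once per distinct value. */
    static final class RangeVisitor implements Visitor {
        final byte[] min;
        final byte[] max;
        final List<Integer> hits = new ArrayList<>();
        int comparisons = 0;

        RangeVisitor(byte[] min, byte[] max) {
            this.min = min;
            this.max = max;
        }

        private boolean matches(byte[] value) {
            comparisons++;
            return Arrays.compareUnsigned(value, min) >= 0
                && Arrays.compareUnsigned(value, max) <= 0;
        }

        @Override
        public void visit(int docId, byte[] packedValue) {
            if (matches(packedValue)) {
                hits.add(docId);
            }
        }

        @Override
        public void visit(int[] docIds, byte[] packedValue) {
            // One comparison covers every doc in the run.
            if (matches(packedValue)) {
                for (int docId : docIds) {
                    hits.add(docId);
                }
            }
        }
    }
}
```

With this layout a reader of the leaf would call visit(run.docIds, run.value) once per run. Visitors that do not override the bulk method keep their existing behaviour through the default implementation, which is why the sketch adds it as a default method rather than a new abstract one.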