Re: [PR] Use binary search partitioning in ReaderUtil#partitionByLeaf [lucene]

via GitHub Wed, 08 Apr 2026 06:45:09 -0700


gsmiller commented on PR #15938:
URL: https://github.com/apache/lucene/pull/15938#issuecomment-4206699682


   > Given we know the number of matching documents and leaves upfront, I am 
wondering if it makes sense to keep existing logic for sparse cases?
   
   Great question! So I did play with this a bit in benchmarking. I forked the 
logic based on whether-or-not there were more leaves than docs (essentially 
handling the outlier cases where there are lots of leaves and very few docs 
with the current linear scan). I ultimately shelved that idea for two reasons: 
(1) I questioned the trade-off of having two separate implementations to handle 
outlier cases that I don't think are super likely, and (2) the tuning heuristic 
of when to use linear isn't quite as clean as checking which is larger, docs or 
leaves (it seemed to be hardware dependent). Also, for the most part, the cases 
where linear scan is more performant are already very fast anyway (because you 
need to have very few docs), so I wasn't sure it was worth further optimizing 
those cases with a forked implementation when the practical difference would 
likely be small anyway. One thing I like about the binary search approach is 
that it makes the "slow cases" (i.e., lots of docs) much 
 faster.
   
   But stepping back, I'm not opposed to having two implementations if there's 
support for it. We do that sort of thing in other places. I was trying to start 
simple though and avoid feedback in the other direction (e.g., "why are you 
overcomplicating this with two implementations?" :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Use binary search partitioning in ReaderUtil#partitionByLeaf [lucene]

Reply via email to