costin opened a new pull request, #16172:
URL: https://github.com/apache/lucene/pull/16172

   When a sorted or sorted-set doc-values field has a skip index, route range 
queries through a new BatchDocValuesOrdinalRangeIterator that implements 
intoBitSet() for bulk evaluation.
   
   The iterator processes skip blocks in bulk:
     - YES blocks: bitSet.set(start, end) marks the entire range at once
     - YES_IF_PRESENT blocks: only checks doc existence (no ordinal comparison)
     - MAYBE blocks: checks ordinals inline
   
   This replaces the per-doc TwoPhaseIterator (approximation + confirmation) 
path when a skip index is available.
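
The block handling described above can be illustrated with a small standalone sketch. It does not use Lucene's classes: `SkipBlockSketch`, `Block`, `summarize`, `classify`, and the `MISSING` marker are hypothetical names invented for illustration, the field is assumed to be single-valued, and fixed-size blocks stand in for the skip index. It is only meant to show how the three block types lead to different amounts of per-doc work, not how the PR implements it.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class SkipBlockSketch {
    static final int MISSING = -1; // hypothetical marker: doc has no value

    enum Match { NO, YES, YES_IF_PRESENT, MAYBE }

    /** Hypothetical summary of docs [start, end): ordinal bounds and value count. */
    static final class Block {
        final int start, end, minOrd, maxOrd, docsWithValue;

        Block(int start, int end, int minOrd, int maxOrd, int docsWithValue) {
            this.start = start;
            this.end = end;
            this.minOrd = minOrd;
            this.maxOrd = maxOrd;
            this.docsWithValue = docsWithValue;
        }
    }

    /** Builds one summary per fixed-size block, standing in for a skip index. */
    static List<Block> summarize(int[] ords, int blockSize) {
        List<Block> blocks = new ArrayList<>();
        for (int start = 0; start < ords.length; start += blockSize) {
            int end = Math.min(start + blockSize, ords.length);
            int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE, count = 0;
            for (int d = start; d < end; d++) {
                if (ords[d] == MISSING) continue;
                min = Math.min(min, ords[d]);
                max = Math.max(max, ords[d]);
                count++;
            }
            blocks.add(new Block(start, end, min, max, count));
        }
        return blocks;
    }

    /** Compares a block's ordinal bounds with the inclusive query range [lo, hi]. */
    static Match classify(Block b, int lo, int hi) {
        if (b.docsWithValue == 0 || b.maxOrd < lo || b.minOrd > hi) return Match.NO;
        if (b.minOrd < lo || b.maxOrd > hi) return Match.MAYBE;
        return b.docsWithValue == b.end - b.start ? Match.YES : Match.YES_IF_PRESENT;
    }

    /** Fills a bit set block by block; assumes lo >= 0 so MISSING never matches. */
    static BitSet intoBitSet(int[] ords, List<Block> blocks, int lo, int hi) {
        BitSet out = new BitSet(ords.length);
        for (Block b : blocks) {
            switch (classify(b, lo, hi)) {
                case YES: // every doc has a value in range: set the whole span
                    out.set(b.start, b.end);
                    break;
                case YES_IF_PRESENT: // all values in range: only check existence
                    for (int d = b.start; d < b.end; d++) {
                        if (ords[d] != MISSING) out.set(d);
                    }
                    break;
                case MAYBE: // partial overlap: compare each ordinal
                    for (int d = b.start; d < b.end; d++) {
                        if (ords[d] >= lo && ords[d] <= hi) out.set(d);
                    }
                    break;
                default: // NO: skip the block entirely
                    break;
            }
        }
        return out;
    }

    static BitSet demo() {
        int[] ords = {
            2, 3, 4, 5,             // YES
            3, MISSING, 4, MISSING, // YES_IF_PRESENT
            1, 2, 7, 5,             // MAYBE
            8, 9, MISSING, 10       // NO
        };
        return intoBitSet(ords, summarize(ords, 4), 2, 5);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {0, 1, 2, 3, 4, 6, 9, 11}
    }
}
```

In this toy data set only the third block needs an ordinal comparison per doc; the first is filled with a single range call and the last is never visited.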
   
   ###  Benchmark
   
    AMD EPYC c5a.2xlarge, JDK 25, SortedSetDocValuesField.newSlowRangeQuery, 
~25% selectivity (256 of 1024 ordinals match):
   
     | docCount | valuesPerDoc | baseline (ops/s) | candidate (ops/s) | speedup |
     |----------|-------------|------------------|-------------------|---------|
     | 100K | 1 | 1,210.8 ± 5.0 | 4,071.9 ± 11.6 | **3.36x** |
     | 100K | 2 | 465.1 ± 1.0 | 558.7 ± 1.9 | **1.20x** |
     | 1M | 1 | 122.1 ± 0.6 | 297.7 ± 4.1 | **2.44x** |
     | 1M | 2 | 46.8 ± 0.3 | 55.8 ± 0.6 | **1.19x** |
   
    Single-valued fields see the largest gain (2.4–3.4x) because the singleton 
unwrap avoids multi-valued ordinal iteration in YES_IF_PRESENT/MAYBE blocks. 
Multi-valued fields still benefit (1.2x) from the YES block bulk-set 
optimization.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

