gsmiller commented on issue #15934:
URL: https://github.com/apache/lucene/issues/15934#issuecomment-4199711524

   I forgot to mention that the above two benchmarks were on a MacBook Pro (M3 
chip). For fun, I ran benchmarks on two Linux boxes as well (x86 and ARM). 
Capturing benchmark results from all three machines here (some of this is 
duplicate from previous comments, but wanted to capture it in a single place). 
Leading with an AI-produced summary of results and dropping the raw jmh output 
at the bottom for reference.
   
   **Approaches**
   Both approaches use Arrays#sort for sorting. The difference is only in the 
partition step (mapping sorted docIDs to index segments):
   - **Linear**: scan sorted doc IDs, advance leaf pointer on boundary crossings
   - **BinarySearch**: iterate leaf boundaries, binary search for each in 
sorted doc IDs (with O(1) empty-leaf skip and early termination)
   
   **General Observations**
   These observations generally hold true across all hardware.
   - At 10K+ doc IDs, the sort dominates — partition strategy makes no/small 
measurable difference.
   - At smaller doc counts with few leaves, binary search is faster.
   - At small doc counts with many leaves (50+), linear scan wins.
   
   **Benchmark Hardware**
   |                | MacBook Pro       | x86 Linux              | ARM Linux    
        |
   
|----------------|-------------------|------------------------|----------------------|
   | CPU            | Apple M3 Pro      | Xeon Platinum 8175M    | Neoverse-N1 
(Graviton2) |
   | L1d cache      | 64 KB             | 32 KB                  | 64 KB        
        |
   | L2 cache       | 4 MB              | 1 MB                   | 1 MB         
        |
   
   **MacBook Pro Results**
   | numDocIds | numLeaves | Linear (ops/ms) | BinarySearch (ops/ms) | 
Difference |
   
|-----------|-----------|------------------|-----------------------|------------|
   | 100       | 5         | 1,195            | 1,444                 | BS +21% 
✅ |
   | 100       | 10        | 1,069            | 1,210                 | BS +13% 
✅ |
   | 100       | 20        | 805              | 1,141                 | BS +42% 
✅ |
   | 100       | 50        | 1,199            | 885                   | Linear 
+35% ❌ |
   | 100       | 200       | 742              | 549                   | Linear 
+35% ❌ |
   | 1,000     | 5         | 80               | 101                   | BS +26% 
✅ |
   | 1,000     | 10        | 84               | 101                   | BS +21% 
✅ |
   | 1,000     | 20        | 101              | 99                    | ~tie    
   |
   | 1,000     | 50        | 98               | 97                    | ~tie    
   |
   | 1,000     | 200       | 91               | 80                    | Linear 
+14% ❌ |
   | 10,000    | 5         | 5.6              | 6.0                   | ~tie    
   |
   | 10,000    | 10        | 5.8              | 6.1                   | ~tie    
   |
   | 10,000    | 20        | 5.2              | 6.4                   | ~tie    
   |
   | 10,000    | 50        | 5.4              | 5.4                   | ~tie    
   |
   | 10,000    | 200       | 5.6              | 5.6                   | ~tie    
   |
   | 100,000   | 5         | 0.252            | 0.262                 | ~tie    
   |
   | 100,000   | 10        | 0.261            | 0.247                 | ~tie    
   |
   | 100,000   | 20        | 0.258            | 0.262                 | ~tie    
   |
   | 100,000   | 50        | 0.255            | 0.261                 | ~tie    
   |
   | 100,000   | 200       | 0.257            | 0.257                 | ~tie    
   |
   
   **x86 Linux Results**
   | numDocIds | numLeaves | Linear (ops/ms) | BS (ops/ms) | Difference |
   |-----------|-----------|------------------|-------------|------------|
   | 100       | 5         | 1,452            | 1,845       | BS +27% ✅ |
   | 100       | 10        | 1,454            | 1,412       | ~tie       |
   | 100       | 20        | 1,186            | 1,131       | ~tie       |
   | 100       | 50        | 845              | 758         | Linear +12% ❌ |
   | 100       | 200       | 493              | 375         | Linear +31% ❌ |
   | 1,000     | 5         | 140              | 169         | BS +21% ✅ |
   | 1,000     | 10        | 127              | 166         | BS +31% ✅ |
   | 1,000     | 20        | 132              | 154         | BS +17% ✅ |
   | 1,000     | 50        | 128              | 150         | BS +17% ✅ |
   | 1,000     | 200       | 112              | 94          | Linear +20% ❌ |
   | 10,000    | 5         | 10.6             | 11.8        | BS +11% ✅ |
   | 10,000    | 10        | 11.0             | 12.0        | BS +9% ✅ |
   | 10,000    | 20        | 11.1             | 11.7        | BS +6% ✅ |
   | 10,000    | 50        | 11.4             | 11.8        | ~tie       |
   | 10,000    | 200       | 10.0             | 10.7        | BS +7% ✅ |
   | 100,000   | 5         | 0.781            | 0.767       | ~tie       |
   | 100,000   | 10        | 0.775            | 0.824       | BS +6% ✅ |
   | 100,000   | 20        | 0.780            | 0.833       | BS +7% ✅ |
   | 100,000   | 50        | 0.791            | 0.845       | BS +7% ✅ |
   | 100,000   | 200       | 0.778            | 0.810       | BS +4% ✅ |
   
   **ARM Linux Results**
   | numDocIds | numLeaves | Linear (ops/ms) | BS (ops/ms) | Difference |
   |-----------|-----------|------------------|-------------|------------|
   | 100       | 5         | 512              | 571         | BS +12% ✅ |
   | 100       | 10        | 497              | 492         | ~tie       |
   | 100       | 20        | 448              | 415         | Linear +8% ❌ |
   | 100       | 50        | 399              | 296         | Linear +35% ❌ |
   | 100       | 200       | 266              | 208         | Linear +28% ❌ |
   | 1,000     | 5         | 34               | 38          | BS +12% ✅ |
   | 1,000     | 10        | 34               | 36          | BS +5% ✅ |
   | 1,000     | 20        | 31               | 37          | BS +19% ✅ |
   | 1,000     | 50        | 33               | 34          | ~tie       |
   | 1,000     | 200       | 32               | 27          | Linear +19% ❌ |
   | 10,000    |          | ~1.65            | ~1.69       | ~tie       |
   | 100,000   |          | ~0.132           | ~0.131      | ~tie       |
   
   **Full Benchmark Output**
   *MacBook Pro*
   ```
   Benchmark                                                 (numDocIds)  
(numLeaves)   Mode  Cnt     Score     Error   Units
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
    5  thrpt   15  1444.092 ±  79.023  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   10  thrpt   15  1209.812 ± 145.064  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   20  thrpt   15  1141.308 ±  78.308  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   50  thrpt   15   885.316 ±  63.670  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
  200  thrpt   15   549.317 ±  31.922  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
    5  thrpt   15   101.334 ±   6.847  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   10  thrpt   15   101.028 ±   2.527  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   20  thrpt   15    99.230 ±   7.228  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   50  thrpt   15    97.103 ±   7.316  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
  200  thrpt   15    79.708 ±   2.740  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
    5  thrpt   15     5.995 ±   0.387  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   10  thrpt   15     6.134 ±   0.801  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   20  thrpt   15     6.402 ±   0.561  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   50  thrpt   15     5.425 ±   0.267  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
  200  thrpt   15     5.632 ±   0.548  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
    5  thrpt   15     0.262 ±   0.005  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   10  thrpt   15     0.247 ±   0.008  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   20  thrpt   15     0.262 ±   0.008  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   50  thrpt   15     0.261 ±   0.005  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
  200  thrpt   15     0.257 ±   0.008  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
    5  thrpt   15  1194.994 ±  78.870  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   10  thrpt   15  1068.638 ±  52.030  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   20  thrpt   15   805.194 ±  42.816  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   50  thrpt   15  1198.694 ± 104.503  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
  200  thrpt   15   741.903 ±  40.591  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
    5  thrpt   15    80.241 ±   3.126  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   10  thrpt   15    83.524 ±   5.633  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   20  thrpt   15   101.379 ±   1.832  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   50  thrpt   15    98.004 ±   4.570  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
  200  thrpt   15    91.082 ±   3.440  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
    5  thrpt   15     5.556 ±   0.474  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   10  thrpt   15     5.763 ±   0.511  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   20  thrpt   15     5.240 ±   0.415  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   50  thrpt   15     5.410 ±   0.390  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
  200  thrpt   15     5.632 ±   0.530  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
    5  thrpt   15     0.252 ±   0.005  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   10  thrpt   15     0.261 ±   0.004  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   20  thrpt   15     0.258 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   50  thrpt   15     0.255 ±   0.006  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
  200  thrpt   15     0.257 ±   0.010  ops/ms
   ```
   
   *x86 Linux*
   ```
   Benchmark                                                 (numDocIds)  
(numLeaves)   Mode  Cnt     Score     Error   Units
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
    5  thrpt   15  1844.918 ± 176.890  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   10  thrpt   15  1412.154 ± 324.330  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   20  thrpt   15  1131.319 ±  77.487  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   50  thrpt   15   757.905 ±  96.660  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
  200  thrpt   15   375.193 ±   2.563  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
    5  thrpt   15   169.033 ±   6.231  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   10  thrpt   15   166.043 ±   4.689  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   20  thrpt   15   153.623 ±  21.208  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   50  thrpt   15   149.702 ±   6.099  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
  200  thrpt   15    93.763 ±   1.001  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
    5  thrpt   15    11.756 ±   0.062  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   10  thrpt   15    12.018 ±   0.295  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   20  thrpt   15    11.703 ±   0.246  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   50  thrpt   15    11.843 ±   0.394  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
  200  thrpt   15    10.685 ±   0.180  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
    5  thrpt   15     0.767 ±   0.101  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   10  thrpt   15     0.824 ±   0.020  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   20  thrpt   15     0.833 ±   0.018  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   50  thrpt   15     0.845 ±   0.029  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
  200  thrpt   15     0.810 ±   0.085  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
    5  thrpt   15  1451.994 ± 197.622  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   10  thrpt   15  1453.638 ± 227.219  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   20  thrpt   15  1185.798 ± 223.198  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   50  thrpt   15   845.484 ±  29.405  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
  200  thrpt   15   492.919 ±  15.587  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
    5  thrpt   15   140.073 ±   5.135  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   10  thrpt   15   127.405 ±   5.352  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   20  thrpt   15   132.117 ±   9.990  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   50  thrpt   15   128.341 ±   3.279  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
  200  thrpt   15   112.499 ±   0.516  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
    5  thrpt   15    10.605 ±   0.483  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   10  thrpt   15    10.997 ±   0.491  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   20  thrpt   15    11.051 ±   0.469  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   50  thrpt   15    11.441 ±   0.592  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
  200  thrpt   15     9.976 ±   1.625  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
    5  thrpt   15     0.781 ±   0.020  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   10  thrpt   15     0.775 ±   0.022  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   20  thrpt   15     0.780 ±   0.022  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   50  thrpt   15     0.791 ±   0.025  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
  200  thrpt   15     0.778 ±   0.023  ops/ms
   ```
   
   *ARM Linux*
   ```
   Benchmark                                                 (numDocIds)  
(numLeaves)   Mode  Cnt    Score     Error   Units
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
    5  thrpt   15  570.680 ± 106.464  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   10  thrpt   15  491.883 ±  58.971  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   20  thrpt   15  414.814 ±  28.513  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
   50  thrpt   15  296.198 ±  10.976  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition          100        
  200  thrpt   15  207.925 ±  17.364  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
    5  thrpt   15   38.223 ±   4.308  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   10  thrpt   15   35.501 ±   2.138  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   20  thrpt   15   37.333 ±   4.049  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
   50  thrpt   15   33.573 ±   0.910  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition         1000        
  200  thrpt   15   26.707 ±   0.242  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
    5  thrpt   15    1.693 ±   0.025  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   10  thrpt   15    1.701 ±   0.017  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   20  thrpt   15    1.687 ±   0.016  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
   50  thrpt   15    1.683 ±   0.027  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition        10000        
  200  thrpt   15    1.673 ±   0.027  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
    5  thrpt   15    0.131 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   10  thrpt   15    0.131 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   20  thrpt   15    0.133 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
   50  thrpt   15    0.132 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortBinarySearchPartition       100000        
  200  thrpt   15    0.130 ±   0.003  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
    5  thrpt   15  511.695 ±  70.070  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   10  thrpt   15  496.877 ±  29.344  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   20  thrpt   15  447.930 ±  96.207  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
   50  thrpt   15  398.911 ±   8.159  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                           100        
  200  thrpt   15  266.050 ±  23.467  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
    5  thrpt   15   33.924 ±   3.368  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   10  thrpt   15   33.691 ±   1.337  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   20  thrpt   15   31.348 ±   1.457  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
   50  thrpt   15   32.600 ±   0.374  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                          1000        
  200  thrpt   15   31.894 ±   2.135  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
    5  thrpt   15    1.637 ±   0.014  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   10  thrpt   15    1.667 ±   0.019  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   20  thrpt   15    1.670 ±   0.023  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
   50  thrpt   15    1.657 ±   0.020  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                         10000        
  200  thrpt   15    1.641 ±   0.007  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
    5  thrpt   15    0.133 ±   0.001  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   10  thrpt   15    0.130 ±   0.001  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   20  thrpt   15    0.133 ±   0.002  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
   50  thrpt   15    0.131 ±   0.001  ops/ms
   PartitionByLeafBenchmark.arraysSortOnly                        100000        
  200  thrpt   15    0.132 ±   0.002  ops/ms
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to