gsmiller commented on issue #15905: URL: https://github.com/apache/lucene/issues/15905#issuecomment-4184071194
Hmm... actually option 3 might be worse than I thought. I put together a micro-benchmark (#15933), and keeping track of the parallel ordinal array adds quite a bit of overhead. Here's the raw output of the benchmark:

```
Benchmark                                       (numDocIds)  (numLeaves)   Mode  Cnt     Score    Error   Units
PartitionByLeafBenchmark.arraysSortOnly                 100            5  thrpt   15  1213.898 ± 23.702  ops/ms
PartitionByLeafBenchmark.arraysSortOnly                 100           50  thrpt   15  1146.033 ± 30.084  ops/ms
PartitionByLeafBenchmark.arraysSortOnly                 100          200  thrpt   15   849.219 ± 10.757  ops/ms
PartitionByLeafBenchmark.arraysSortOnly                1000            5  thrpt   15    84.469 ±  0.637  ops/ms
PartitionByLeafBenchmark.arraysSortOnly                1000           50  thrpt   15    65.685 ±  3.071  ops/ms
PartitionByLeafBenchmark.arraysSortOnly                1000          200  thrpt   15    88.932 ±  5.194  ops/ms
PartitionByLeafBenchmark.arraysSortOnly               10000            5  thrpt   15     5.031 ±  0.394  ops/ms
PartitionByLeafBenchmark.arraysSortOnly               10000           50  thrpt   15     5.562 ±  0.467  ops/ms
PartitionByLeafBenchmark.arraysSortOnly               10000          200  thrpt   15     5.691 ±  0.560  ops/ms
PartitionByLeafBenchmark.arraysSortOnly              100000            5  thrpt   15     0.265 ±  0.008  ops/ms
PartitionByLeafBenchmark.arraysSortOnly              100000           50  thrpt   15     0.265 ±  0.007  ops/ms
PartitionByLeafBenchmark.arraysSortOnly              100000          200  thrpt   15     0.265 ±  0.007  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals          100            5  thrpt   15   911.904 ± 31.288  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals          100           50  thrpt   15   866.220 ± 22.404  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals          100          200  thrpt   15   690.300 ± 17.029  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals         1000            5  thrpt   15    63.449 ±  3.054  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals         1000           50  thrpt   15    64.223 ±  2.953  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals         1000          200  thrpt   15    69.654 ±  0.704  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals        10000            5  thrpt   15     3.443 ±  0.234  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals        10000           50  thrpt   15     3.471 ±  0.283  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals        10000          200  thrpt   15     3.200 ±  0.322  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals       100000            5  thrpt   15     0.174 ±  0.006  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals       100000           50  thrpt   15     0.174 ±  0.005  ops/ms
PartitionByLeafBenchmark.introSortWithOrdinals       100000          200  thrpt   15     0.166 ±  0.011  ops/ms
```

And a summary of the results:

| numDocIds | numLeaves | Arrays.sort (ops/ms) | IntroSorter+ordinals (ops/ms) | Overhead |
|-----------|-----------|----------------------|-------------------------------|----------|
| 100       | 5         | 1214                 | 912                           | ~25%     |
| 100       | 50        | 1146                 | 866                           | ~24%     |
| 100       | 200       | 849                  | 690                           | ~19%     |
| 1,000     | 5         | 84                   | 63                            | ~25%     |
| 1,000     | 50        | 66                   | 64                            | ~3%      |
| 1,000     | 200       | 89                   | 70                            | ~22%     |
| 10,000    | 5         | 5.0                  | 3.4                           | ~32%     |
| 10,000    | 50        | 5.6                  | 3.5                           | ~38%     |
| 10,000    | 200       | 5.7                  | 3.2                           | ~44%     |
| 100,000   | 5         | 0.265                | 0.174                         | ~34%     |
| 100,000   | 50        | 0.265                | 0.174                         | ~34%     |
| 100,000   | 200       | 0.265                | 0.166                         | ~37%     |

That said... I wonder how much we'd care about this in practice? This operation would probably be done once in the handling of a given search, and I suspect the difference in performance here wouldn't really show up in a meaningful way relative to the other work being done. But maybe I'm wrong? I dunno. WDYT?
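
For readers skimming the thread, here is a rough sketch of what "keeping track of the parallel ordinal array" could look like, and why it plausibly costs more than a plain `Arrays.sort`. This is not the benchmark code from #15933 and it does not use Lucene's `IntroSorter`: the class and method names are hypothetical, and a simple insertion sort stands in for the real sorter only to keep the example short. The point is that every move of a doc ID has to be mirrored in the second array.

```java
import java.util.Arrays;

// Hypothetical illustration only; names are not taken from Lucene or the benchmark.
class ParallelSortSketch {

  /** Returns {0, 1, ..., n-1}: the original position of each doc ID. */
  static int[] identity(int n) {
    int[] ordinals = new int[n];
    for (int i = 0; i < n; i++) {
      ordinals[i] = i;
    }
    return ordinals;
  }

  /** Baseline: sorts the doc IDs only, so their original positions are lost. */
  static void sortOnly(int[] docIds) {
    Arrays.sort(docIds);
  }

  /**
   * Sorts docIds ascending and applies the same moves to ordinals, so that
   * ordinals[i] ends up holding the original index of docIds[i].
   * Insertion sort is used here as a stand-in for a real sorter.
   */
  static void sortWithOrdinals(int[] docIds, int[] ordinals) {
    for (int i = 1; i < docIds.length; i++) {
      int doc = docIds[i];
      int ord = ordinals[i];
      int j = i - 1;
      while (j >= 0 && docIds[j] > doc) {
        // Each shift touches both arrays: this is the extra work.
        docIds[j + 1] = docIds[j];
        ordinals[j + 1] = ordinals[j];
        j--;
      }
      docIds[j + 1] = doc;
      ordinals[j + 1] = ord;
    }
  }

  public static void main(String[] args) {
    int[] docIds = {42, 7, 19, 3};
    int[] ordinals = identity(docIds.length);
    sortWithOrdinals(docIds, ordinals);
    System.out.println(Arrays.toString(docIds));   // [3, 7, 19, 42]
    System.out.println(Arrays.toString(ordinals)); // [3, 1, 2, 0]
  }
}
```

The sketch only shows the bookkeeping, not the actual cost: the numbers above come from the real benchmark, and a toy like this should not be used to estimate the overhead.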
