mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-606259561

@msokolov Sorry again for reporting incorrect benchmarking results. Below are my latest results, and I feel quite confident in their correctness.

First, about the benchmarking setup:

1. [Here](https://github.com/mayya-sharipova/luceneutil/commit/e0d86b24053cc8a68796abd9f0fd08dbac899779) are the changes made to `luceneutil`.
2. The `patch` folder is a checkout of this PR.
3. The `trunk` folder is also a checkout of this PR, with one modification. As there is no `LongDocValuesPointSortField` in master, I can't benchmark [sorting using this field](https://github.com/mayya-sharipova/luceneutil/commit/e0d86b24053cc8a68796abd9f0fd08dbac899779#diff-58e50bb4a8f0be480df656bcd84d5b77R76) on master. Instead, in the `trunk` folder I delegated sorting to the traditional sort on a long field, like this:

```java
import org.apache.lucene.search.SortField;

// Baseline stand-in used only in the `trunk` folder: same class name as in the PR,
// but it simply delegates to the traditional SortField.Type.LONG sort.
public class LongDocValuesPointSortField extends SortField {

  public LongDocValuesPointSortField(String field) {
    super(field, SortField.Type.LONG);
  }

  public LongDocValuesPointSortField(String field, boolean reverse) {
    super(field, SortField.Type.LONG, reverse);
  }
}
```

So basically I was benchmarking a traditional long sort vs. a long sort using the new field `LongDocValuesPointSortField`.
wikimedium10m: 10 million docs, up to 2x speedups

```
                 Task   QPS baseline   StdDev   QPS patch   StdDev   Pct diff
           TermDTSort          64.53   (6.4%)      155.29  (42.3%)   140.7% (  86% - 202%)
HighTermDayOfYearSort          47.63   (5.4%)       50.47   (6.8%)     6.0% (  -5% -  19%)
    HighTermMonthSort         110.07   (7.3%)      121.13   (6.8%)    10.0% (  -3% -  26%)
WARNING: cat=TermDTSort: hit counts differ: 754451 vs 1669+
```

wikimediumall: about 33 million docs, up to 3.5x speedups

```
                 Task   QPS baseline   StdDev   QPS patch   StdDev   Pct diff
           TermDTSort          28.96   (4.3%)      108.45  (56.9%)   274.5% ( 204% - 350%)
HighTermDayOfYearSort           9.69   (5.1%)        9.56   (6.1%)    -1.3% ( -11% -  10%)
    HighTermMonthSort          39.41   (4.7%)       47.99  (10.0%)    21.8% (   6% -  38%)
WARNING: cat=TermDTSort: hit counts differ: 1474717 vs 1070+
```

Please let me know if these results and methodology make sense.
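As a sanity check on the `Pct diff` column, here is a minimal sketch that recomputes the relative change and the speedup factor from the mean QPS values for `TermDTSort`. The class and method names (`PctDiffCheck`, `pctDiff`, `speedup`) are made up for illustration and are not part of `luceneutil`; the formula `(patch - baseline) / baseline` is my assumption about how the column is derived. Since the inputs are the rounded means from the tables, the last digit may differ slightly from the reported figure.

```java
public class PctDiffCheck {

    /** Relative change of patch QPS over baseline QPS, in percent (assumed formula). */
    public static double pctDiff(double baselineQps, double patchQps) {
        return (patchQps - baselineQps) / baselineQps * 100.0;
    }

    /** Speedup factor: patch QPS divided by baseline QPS. */
    public static double speedup(double baselineQps, double patchQps) {
        return patchQps / baselineQps;
    }

    public static void main(String[] args) {
        // Mean QPS values copied from the TermDTSort rows of the two result tables.
        double[][] runs = {
            {64.53, 155.29},  // wikimedium10m
            {28.96, 108.45},  // wikimediumall
        };
        String[] names = {"wikimedium10m", "wikimediumall"};

        for (int i = 0; i < runs.length; i++) {
            double baseline = runs[i][0];
            double patch = runs[i][1];
            System.out.printf("%s TermDTSort: %.1f%% faster, %.2fx speedup%n",
                names[i], pctDiff(baseline, patch), speedup(baseline, patch));
        }
    }
}
```

The same two helpers can be applied to the `HighTermDayOfYearSort` and `HighTermMonthSort` rows to check the remaining entries.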