romseygeek commented on PR #15436:
URL: https://github.com/apache/lucene/pull/15436#issuecomment-3759829391
I modified luceneutil[1] to test this out in the following ways:
- added a monotonically increasing ordinal value to the wikimedium data set, to mimic the behaviour of a timestamp field that's correlated with index order
- updated the Indexer and TaskParser to add this as an index field and make it available for sorting
- added some search tasks for both Term and MatchAll searches, sorted by ordinal
I get the following results:
```
Task                      QPS baseline     StdDev  QPS my_modified_version     StdDev                    Pct diff  p-value
TermDTSort                     2108.03     (6.1%)                  1324.05     (6.9%)       -37.2% ( -47% - -25%)    0.000
MatchAllDateTimeSort             51.97     (7.1%)                    50.99     (7.2%)         -1.9% ( -15% - 13%)    0.404
PKLookup                        377.71     (2.1%)                   419.54    (15.3%)          11.1% ( -6% - 29%)    0.001
TermDateTimeDescSort             16.33     (7.8%)                  1502.04  (1044.4%)    9099.4% (7461% - 11015%)    0.000
MatchAllDateTimeDescSort          0.52    (10.7%)                   304.68  (4863.0%)  59018.8% (48922% - 71528%)    0.000
```
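For anyone reading the table, the `Pct diff` column is consistent with the relative change in mean QPS against the baseline. The sketch below is only a reading aid, not luceneutil code: the class and method names are made up, and the inputs are the rounded QPS values printed above. It lines up with the reported -37.2%, -1.9% and 11.1%, but the two descending-sort rows will not reproduce exactly from rounded inputs because their baselines are so small.

```java
final class PctDiff {
    /** Relative change of the candidate's QPS against the baseline's QPS, in percent. */
    static double pctDiff(double baselineQps, double candidateQps) {
        return (candidateQps - baselineQps) / baselineQps * 100.0;
    }

    public static void main(String[] args) {
        // Mean QPS values copied from the table (baseline, my_modified_version).
        System.out.printf("TermDTSort           %.1f%%%n", pctDiff(2108.03, 1324.05));
        System.out.printf("MatchAllDateTimeSort %.1f%%%n", pctDiff(51.97, 50.99));
        System.out.printf("PKLookup             %.1f%%%n", pctDiff(377.71, 419.54));
    }
}
```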
Descending sorts get a massive speedup, but searches that are already fast
because they use the index sort get penalised by the new fixed overhead
(TermDTSort drops by 37%). Maybe the answer here is to skip segment re-sorting
if the Sort defined on the query is the same as the index sort, because in that
case early termination is always going to be fast.
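A minimal sketch of that rule, to make the proposal concrete. It deliberately avoids the Lucene API: `SortKey`, `ResortDecision` and `shouldResortSegments` are hypothetical stand-ins, not names from the PR, and a sort is modelled as an ordered list of (field, reverse) pairs. Real code would presumably compare the query's `Sort` against the segment's index sort instead.

```java
import java.util.List;
import java.util.Objects;

final class ResortDecision {
    /** Hypothetical stand-in for one sort criterion: field name plus direction. */
    record SortKey(String field, boolean reverse) {}

    /**
     * Returns false only when the query sort is identical to the index sort,
     * i.e. the case where early termination already makes the search fast.
     * A null indexSort models an index with no index sort configured.
     */
    static boolean shouldResortSegments(List<SortKey> querySort, List<SortKey> indexSort) {
        return !Objects.equals(querySort, indexSort);
    }

    public static void main(String[] args) {
        List<SortKey> indexSort = List.of(new SortKey("ordinal", false));
        List<SortKey> ascending = List.of(new SortKey("ordinal", false));
        List<SortKey> descending = List.of(new SortKey("ordinal", true));

        // Same as the index sort: skip the re-sorting overhead.
        System.out.println(shouldResortSegments(ascending, indexSort));
        // Reverse of the index sort: keep re-sorting, which is where the speedup comes from.
        System.out.println(shouldResortSegments(descending, indexSort));
    }
}
```

With this check, the ascending tasks would take the existing path, while the descending tasks would still go through re-sorting.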
[1] https://github.com/romseygeek/luceneutil/tree/ordinal-index-sort