[ 
https://issues.apache.org/jira/browse/LUCENE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051640#comment-17051640
 ] 

Michael Sokolov commented on LUCENE-8929:
-----------------------------------------

OK, hm I now believe the results posted above are somewhat suspect since 
`luceneutil` had a bug when measuring timing of multiple tasks under concurrent 
search. Since the use of multiple threads enables the test harness to consume 
all system resources, there can be contention among multiple tasks run 
concurrently, leading to starvation of the fast ones. I pushed a patch that 
works around this issue in luceneutil and did a new round of measurements. I 
also updated the PR (opened a new PR actually- #1316, since it had been so long 
it was hard to re-use the old one). Updates since the previous PR include:
 # Added tiebreak by docid to ensure we collect the right hits (there doesn't 
seem to be a test for this? Probably we should add one)
 # No longer terminate by max min scores; the impact is largely superseded by 
this change. I measured having both and it seemed to hurt in most cases. 
Possibly we should terminate based on max-min score when 0 < N < 100, and using 
this approach (min-min score) for N > 100
 #  This now bounds the thread count used to determine update frquency by the 
number of slices
 # I added some paranoid protections against a race condition that probably 
didn't exist, but I was going crazy with inconsistent results and this seemed 
like a good idea at the time.Might want to revisit this and remove some 
unneeded code in MaxScoreTerminator.

Here's a better description of what this PR is:

Performs early termination in TopFieldCollector (when query sort matches index 
sort) based on the minimum minimum score across all leaf collectos, providing 
nice gains for queries with large N. Once we have globally collected the number 
of hits required to satisfy the query (which is max(N, 1000)) then the minimum 
minimum score across threads is a global lower bound on the score that must be 
met by any hit. We already have sufficient hits to satisfy the query with score 
better than that, so any later hit with a worse score will be discarded, and 
any Collector retrieving such a hit can be terminated. Note that this is a 
looser bound than the maximum minimum computed in LUCENE-8978, but that bound 
can only be applied once a collector has collected N hits and all collectors 
together have collected 1000. When N is relatively large, this weaker bound 
often applies before N hits have been collected. Further, once a collector 
terminates, the same logic applies to the remaining collectors, which can 
result in raising the bound further. More precisely, the termination bound used 
is the minimum minimum score among the top collectors (ranked by their minimum 
scores) that together have at least max(N, 1000) hits.

h2. N=1000
||Task|| QPS before|| StdDev|| QPS after|| StdDev|| Pct diff||
|LowTermDayOfYearSort|      216.94|      (0.3%)|      549.10|     (15.9%)|  
153.1% ( 136% -  169%)|
|HighTermDayOfYearSort|      277.36|      (0.9%)|     1107.12|     (25.9%)|  
299.2% ( 269% -  328%)|

h2.  N=500
||Task|| QPS before|| StdDev|| QPS after|| StdDev|| Pct diff||
|LowTermDayOfYearSort|  385.47|  (5.4%)|  649.01|  (4.9%)   | 68.4% (  55% -   
83%)|
|HighTermDayOfYearSort|  499.03|  (6.5%)| 1133.98| (27.5%)  |127.2% (  87% -  
172%)|

h2. N=100
||Task|| QPS before|| StdDev|| QPS after|| StdDev|| Pct diff||
|LowTermDayOfYearSort|  578.19|  (1.4%)|  589.86|  (2.0%)|2.0% (  -1% -    5%)|
|HighTermDayOfYearSort| 1444.25|  (1.8%)| 1508.98|  (9.6%)|4.5% (  -6% -   16%)|

h2. N=20
||Task|| QPS before|| StdDev|| QPS after|| StdDev|| Pct diff||
|HighTermDayOfYearSort| 1857.14 |(2.4%)|1766.53 |(2.0%)|-4.9% ( -9% - 0%)|
|LowTermDayOfYearSort| 628.71| (2.3%)| 620.24| (1.6%) |-1.3% ( -5% - 2%)|



 

> Early Terminating CollectorManager
> ----------------------------------
>
>                 Key: LUCENE-8929
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8929
>             Project: Lucene - Core
>          Issue Type: Sub-task
>            Reporter: Atri Sharma
>            Priority: Major
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> We should have an early terminating collector manager which accurately tracks 
> hits across all of its collectors and determines when there are enough hits, 
> allowing all the collectors to abort.
> The options for the same are:
> 1) Shared total count : Global "scoreboard" where all collectors update their 
> current hit count. At the end of each document's collection, collector checks 
> if N > threshold, and aborts if true
> 2) State Reporting Collectors: Collectors report their total number of counts 
> collected periodically using a callback mechanism, and get a proceed or abort 
> decision.
> 1) has the overhead of synchronization in the hot path, 2) can collect 
> unnecessary hits before aborting.
> I am planning to work on 2), unless objections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to