mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527203682
 
 
   > > I'm trying to understand the behavior change Lucene users will see with 
this, when using concurrent searching for one query (passing `ExecutorService` 
to `IndexSearcher`):
   > > It looks like, with this change, such users will see their search terminate 
precisely when the total collected hits exceeds the limit (1000 by default?), versus 
today where we will try to collect 1000 per segment and then reduce that to the 
top 1000 overall? So this means the results will change depending on thread 
execution/timing?
   > 
   > Looking at the documentation around `TOTAL_HITS_THRESHOLD`, I see that it 
is intended to restrict the total number of documents scored before the query is 
early terminated. If we do a single-threaded search today, that is the behavior 
we get. However, for concurrent search, we actually look at up to N * 
`TOTAL_HITS_THRESHOLD` documents, where N is the number of slices. So, I believe 
that we are not delivering the advertised behavior for concurrent searches in the 
status quo. This change should fix that.
   > 
   > You are correct, though, that thread timing will come into play here -- 
different slices may have different contributions to the overall number of 
hits. However, since we are not scoring all documents anyway, I do not believe 
we offer any guarantees on the documents that we return -- even today, the best 
documents might be the ones which just came in and hence are on the last 
segments to be traversed, so they never even get looked at. WDYT?
   
   OK that makes sense @atris -- it seems that which specific top hits you'll 
get back is intentionally not defined in the API and so we have the freedom to 
make improvements like this.
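
   To make the difference discussed above concrete, here is a minimal sketch 
of the two counting strategies. It is not the code in this PR and it does not use 
any Lucene classes: `SharedCountSketch`, `collectPerSlice`, `collectShared` and 
their parameters are hypothetical names, and a "hit" is modeled as a plain loop 
iteration. It assumes each slice runs as one task on an `ExecutorService`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not Lucene code.
class SharedCountSketch {

    // Status quo as described above: every slice keeps its own counter,
    // so up to slices * threshold hits are collected in total.
    static long collectPerSlice(int slices, int docsPerSlice, int threshold) throws Exception {
        Callable<Long> task = () -> {
            long local = 0;
            for (int doc = 0; doc < docsPerSlice && local < threshold; doc++) {
                local++; // "collect" one hit
            }
            return local;
        };
        return runOnSlices(slices, task);
    }

    // Shared count: all slices increment one counter and stop once the
    // global total passes the threshold.
    static long collectShared(int slices, int docsPerSlice, int threshold) throws Exception {
        AtomicLong globalHits = new AtomicLong();
        Callable<Long> task = () -> {
            long local = 0;
            for (int doc = 0; doc < docsPerSlice; doc++) {
                if (globalHits.incrementAndGet() > threshold) {
                    break; // global budget used up, terminate this slice early
                }
                local++; // this hit was within the global budget
            }
            return local;
        };
        return runOnSlices(slices, task);
    }

    // Runs the same task once per slice and sums the per-slice hit counts.
    private static long runOnSlices(int slices, Callable<Long> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(slices);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int s = 0; s < slices; s++) {
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collectPerSlice(4, 5000, 1000)); // prints 4000
        System.out.println(collectShared(4, 5000, 1000));   // prints 1000
    }
}
```

   In the shared variant the total is fixed at the threshold, but how many of 
those hits each slice contributes depends on thread scheduling, which is the 
timing dependence raised in the quoted discussion.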
