[ 
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844848#comment-16844848
 ] 

Atri Sharma commented on LUCENE-8757:
-------------------------------------

[~jpountz] Essentially, the idea is to track the previous leaf's maxDoc 
outside the scope of the per-leaf collector and move it into 
AssertingCollector's state, right? 

If I understood you correctly, the attached patch should fix this. I verified 
that the test added in the previous iteration specifically for out-of-order 
docIDs catches this issue, but I agree that AssertingCollector should have the 
right assertions in place.
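
For readers following along, here is a minimal, Lucene-free sketch of the idea 
as I understand it. LeafSketch and OrderCheckingCollectorSketch are hypothetical 
stand-ins made up for illustration; this is not the code in the attached patch, 
and it throws an exception where the real collector would presumably use an 
assertion.

```java
import java.util.List;

/** Hypothetical stand-in for a leaf (segment) context; not Lucene's LeafReaderContext. */
final class LeafSketch {
    final int docBase; // global doc ID of the first document in this leaf
    final int maxDoc;  // number of documents in this leaf

    LeafSketch(int docBase, int maxDoc) {
        this.docBase = docBase;
        this.maxDoc = maxDoc;
    }
}

/** Hypothetical collector that keeps state across leaves instead of per leaf. */
final class OrderCheckingCollectorSketch {
    // Kept on the collector itself so the value survives the switch to the next leaf.
    private int previousLeafMaxDoc = 0;

    /** Called once per leaf; throws if this leaf starts before the previous leaf ended. */
    void startLeaf(LeafSketch leaf) {
        if (leaf.docBase < previousLeafMaxDoc) {
            throw new IllegalStateException(
                "leaf with docBase=" + leaf.docBase
                    + " visited after doc ID " + previousLeafMaxDoc);
        }
        previousLeafMaxDoc = leaf.docBase + leaf.maxDoc;
    }

    /** Convenience helper: true if the leaves are visited in increasing doc ID order. */
    static boolean visitsInOrder(List<LeafSketch> leaves) {
        OrderCheckingCollectorSketch collector = new OrderCheckingCollectorSketch();
        try {
            for (LeafSketch leaf : leaves) {
                collector.startLeaf(leaf);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With this shape, visiting leaves sorted by decreasing maxDoc fails as soon as a 
leaf with a smaller docBase follows one with a larger docBase, regardless of 
which doc IDs are actually collected inside each leaf.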

 
{quote}Looking at the AssertingCollector again, it has a check that doc IDs are 
collected in doc ID order, so I wonder why this assertion didn't trip with the 
earlier version of your patch that sorted leaves by decreasing maxDoc. Maybe we 
just got lucky? 
{quote}
Do you think similar assertions/checks would make sense in IndexSearcher too? 
If AssertingCollector missed this issue, maybe we should make the validation of 
IndexSearcher's input arguments more robust as well. WDYT?

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Assignee: Simon Willnauer
>            Priority: Major
>         Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch, 
> LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch, 
> LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segment-to-thread allocation algorithm always allocates one 
> thread per segment. This hurts performance when segment sizes are skewed, 
> since small segments also get their own dedicated thread, which can lead to 
> performance degradation due to context-switching overhead.
>  
> A better algorithm that is cognizant of size skew would perform better in 
> realistic scenarios.
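
To illustrate the kind of size-aware grouping the description above asks for, 
here is a rough sketch. It is not the algorithm from the attached patches: the 
class name SliceSketch and both thresholds (maxDocsPerSlice, 
maxSegmentsPerSlice) are assumptions made up for this example. Large segments 
get a slice (and hence a thread) of their own, while small segments are packed 
together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical size-aware grouping of segments into slices (one slice per thread). */
final class SliceSketch {

    /**
     * Groups segment sizes (doc counts) into slices, largest segments first.
     * Both limits are illustrative parameters, not values taken from Lucene.
     */
    static List<List<Integer>> slices(int[] segmentSizes, int maxDocsPerSlice, int maxSegmentsPerSlice) {
        int[] sorted = segmentSizes.clone();
        Arrays.sort(sorted); // ascending; walk from the end to visit the largest first

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long docsInCurrent = 0;

        for (int i = sorted.length - 1; i >= 0; i--) {
            int size = sorted[i];
            if (size > maxDocsPerSlice) {
                // A big segment is worth a dedicated slice.
                result.add(Collections.singletonList(size));
                continue;
            }
            // Small segments share a slice until either limit is reached.
            current.add(size);
            docsInCurrent += size;
            if (docsInCurrent > maxDocsPerSlice || current.size() >= maxSegmentsPerSlice) {
                result.add(current);
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
        }
        if (!current.isEmpty()) {
            result.add(current);
        }
        return result;
    }
}
```

For example, with sizes {1000, 10, 20, 30, 500}, maxDocsPerSlice = 100 and 
maxSegmentsPerSlice = 2, this yields four slices instead of five threads: 
[1000], [500], [30, 20] and [10].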



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

