[ https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838363#comment-16838363 ]
Adrien Grand commented on LUCENE-8757:
--------------------------------------

Yes. Top-docs collectors are expected to tie-break by doc ID when documents compare equal. Things like TopDocs#merge compare doc IDs explicitly for that purpose, but Collector#collect implementations simply rely on documents being collected in doc ID order, which lets them ignore documents that compare equal to the current k-th best hit. So we need to sort segments within a slice by docBase in order to get the same top hits regardless of how slices have been constructed.

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Assignee: Simon Willnauer
>            Priority: Major
>         Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch,
> LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one
> thread per segment. This is detrimental to performance in case of skew in
> segment sizes since small segments also get their dedicated thread. This can
> lead to performance degradation due to context switching overheads.
>
> A better algorithm which is cognizant of size skew would have better
> performance for realistic scenarios
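
To illustrate the tie-breaking point in the comment above, here is a minimal sketch in plain Java. It does not use any Lucene classes: Segment, collectTopK and sortedByDocBase are hypothetical stand-ins invented for this example, and the real collectors are more involved. The sketch only assumes what the comment states, namely that a collector skips hits that compare equal to the current k-th best, which is equivalent to tie-breaking by doc ID only when segments are visited in docBase order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakSketch {

    /** Hypothetical stand-in for a segment: a docBase plus one score per local doc. */
    static final class Segment {
        final int docBase;
        final float[] scores;

        Segment(int docBase, float... scores) {
            this.docBase = docBase;
            this.scores = scores;
        }
    }

    /**
     * Collects the global doc IDs of the top k hits (k >= 1) from one slice,
     * visiting segments in the order given. A hit that ties with the current
     * k-th best is skipped, and doc IDs are never compared explicitly.
     */
    static List<Integer> collectTopK(List<Segment> slice, int k) {
        List<Integer> docs = new ArrayList<>();
        List<Float> scores = new ArrayList<>();
        for (Segment seg : slice) {
            for (int i = 0; i < seg.scores.length; i++) {
                float score = seg.scores[i];
                if (docs.size() == k && score <= scores.get(k - 1)) {
                    continue; // not competitive: lower than or equal to the k-th best
                }
                // Insert after every hit with an equal or higher score,
                // so equal scores keep their arrival order.
                int pos = 0;
                while (pos < docs.size() && scores.get(pos) >= score) {
                    pos++;
                }
                docs.add(pos, seg.docBase + i);
                scores.add(pos, score);
                if (docs.size() > k) {
                    docs.remove(k);   // drop the hit that fell out of the top k
                    scores.remove(k);
                }
            }
        }
        return docs;
    }

    /** Returns a copy of the slice with its segments ordered by docBase. */
    static List<Segment> sortedByDocBase(List<Segment> slice) {
        List<Segment> copy = new ArrayList<>(slice);
        copy.sort(Comparator.comparingInt(s -> s.docBase));
        return copy;
    }

    public static void main(String[] args) {
        Segment a = new Segment(0, 1.0f, 1.0f); // global docs 0 and 1
        Segment b = new Segment(2, 1.0f);       // global doc 2
        List<Segment> outOfOrder = Arrays.asList(b, a);

        // All three docs tie, so the expected top 2 by doc ID is [0, 1].
        System.out.println(collectTopK(outOfOrder, 2));                  // [2, 0]
        System.out.println(collectTopK(sortedByDocBase(outOfOrder), 2)); // [0, 1]
    }
}
```

With the segments visited out of docBase order, doc 2 takes a slot that doc 1 should have had. After sorting the slice by docBase, the result no longer depends on how the slice was put together.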