[ https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835481#comment-16835481 ]
Simon Willnauer commented on LUCENE-8757:
-----------------------------------------

Thanks for the additional iteration. Now that we have simplified this, can we remove the sorting? I don't necessarily see how the sort makes things simpler. If we see a segment > threshold, we can just add it as a group? I thought you did that already, hence my comment about the assertion. WDYT?

I also want to suggest beefing up testing a bit with a randomized version of this, like this:

{code}
diff --git a/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java b/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
index 7c63a817adb..76ccca64ee7 100644
--- a/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
+++ b/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
@@ -1933,6 +1933,14 @@ public abstract class LuceneTestCase extends Assert {
         ret = random.nextBoolean()
             ? new AssertingIndexSearcher(random, r, ex)
             : new AssertingIndexSearcher(random, r.getContext(), ex);
+      } else if (random.nextBoolean()) {
+        int maxDocPerSlice = 1 + random.nextInt(100000);
+        ret = new IndexSearcher(r, ex) {
+          @Override
+          protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
+            return slices(leaves, maxDocPerSlice);
+          }
+        };
       } else {
         ret = random.nextBoolean()
             ? new IndexSearcher(r, ex)
{code}

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>         Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one
> thread per segment. This is detrimental to performance in case of skew in
> segment sizes since small segments also get their dedicated thread.
> This can lead to performance degradation due to context switching overheads.
>
> A better algorithm which is cognizant of size skew would have better
> performance for realistic scenarios
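
For illustration only, the grouping idea discussed above (a segment larger than the threshold becomes its own group, smaller segments are packed together, no sorting) might look roughly like the sketch below. This is not the code from the attached patches: the class name `SliceSketch`, the use of plain `int` doc counts in place of `LeafReaderContext`, and the rule for closing a group of small segments are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the slicing logic under discussion.
// Segments are represented only by their document counts.
final class SliceSketch {

  /**
   * Groups segment indices into slices without sorting.
   * A segment with more than maxDocsPerSlice docs gets a slice of its own.
   * Smaller segments are accumulated, in the order given, into a shared
   * slice that is closed once its total doc count exceeds the threshold
   * (this closing rule is an assumption of the sketch).
   */
  static List<List<Integer>> slices(int[] segmentDocCounts, int maxDocsPerSlice) {
    List<List<Integer>> result = new ArrayList<>();
    List<Integer> group = new ArrayList<>();
    long groupDocs = 0;

    for (int i = 0; i < segmentDocCounts.length; i++) {
      int docs = segmentDocCounts[i];
      if (docs > maxDocsPerSlice) {
        // Large segment: add it as a group by itself.
        result.add(Collections.singletonList(i));
        continue;
      }
      // Small segment: pack it into the current shared group.
      group.add(i);
      groupDocs += docs;
      if (groupDocs > maxDocsPerSlice) {
        result.add(group);
        group = new ArrayList<>();
        groupDocs = 0;
      }
    }
    // Flush any small segments left over at the end.
    if (!group.isEmpty()) {
      result.add(group);
    }
    return result;
  }
}
```

With doc counts `{500, 10, 20, 30, 400, 5}` and a threshold of 50, this sketch yields four slices instead of six: the two large segments each stand alone, and the small ones share slices, so fewer threads are spent on tiny segments.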