[ https://issues.apache.org/jira/browse/LUCENE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300197#comment-15300197 ]
Dawid Weiss commented on LUCENE-7299:
-------------------------------------

No problem at all, Adrien. Sorry for not reading your patch in detail; I only skimmed through it quickly since it's something we had worked on before.

bq. [...] except the redistribution logic and the fact that your imp has the ability to parallelize in a ForkJoinPool. I tried to replace the redistribution logic out of curiosity but performance was the same.

Both are needed because our sorted sets are much, much larger than typical Lucene buffers. When you have millions of (smallish) entries to sort, the redistribution index took a lot of extra space. It was never a performance win; it was a memory-conservative strategy.

bq. I think it should be fine with BytesRefHash since it just returns a BytesRef that points to an internal structure rather than copying bytes.

Again, this was a significant performance boost in our case because of the size of the structures we sort. We also didn't copy the content of strings, but even filling in the pointer and length in a reused "pointer-like" class (much like BytesRef) was quite costly.

There is also a related issue about avoiding extra allocations in BytesRefHash that I filed a while ago (LUCENE-5854). If you're working on that piece of code, you may be interested in looking at it.

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>
>                 Key: LUCENE-7299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: ByteBlockListSorter.java, LUCENE-7299.patch, LUCENE-7299.patch
>
>
> Switching DocIdSetBuilder to radix sort helped make things significantly
> faster. We should be able to do the same with BytesRefHash.sort()?
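
For readers following the thread, here is a rough sketch of the general technique being discussed: an MSD radix sort over byte strings that permutes entries in place instead of building an auxiliary redistribution index. It is not the LUCENE-7299 patch and not the attached ByteBlockListSorter.java; the thread does not say how either handles redistribution. The class name `MsdRadixSortSketch`, the `SMALL_RANGE` cutoff and the use of plain `byte[]` keys in place of BytesRef are all assumptions made for illustration, and the sketch is single-threaded (no ForkJoinPool).

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: in-place MSD radix sort on byte[] keys (hypothetical class). */
final class MsdRadixSortSketch {
    private static final int BUCKETS = 257;    // bucket 0: key ended; 1..256: byte value + 1
    private static final int SMALL_RANGE = 16; // assumed cutoff for the insertion-sort fallback

    /** Sorts by unsigned byte order; a key that is a prefix of another sorts first. */
    static void sort(byte[][] keys) {
        sort(keys, 0, keys.length, 0);
    }

    private static int bucket(byte[] key, int depth) {
        return depth < key.length ? (key[depth] & 0xFF) + 1 : 0;
    }

    private static void sort(byte[][] keys, int from, int to, int depth) {
        if (to - from <= SMALL_RANGE) {
            insertionSort(keys, from, to, depth);
            return;
        }
        // Histogram of the byte at position `depth`.
        int[] count = new int[BUCKETS];
        for (int i = from; i < to; i++) {
            count[bucket(keys[i], depth)]++;
        }
        // next[b]: next unfilled slot of bucket b; end[b]: exclusive end of bucket b.
        int[] next = new int[BUCKETS];
        int[] end = new int[BUCKETS];
        int pos = from;
        for (int b = 0; b < BUCKETS; b++) {
            next[b] = pos;
            pos += count[b];
            end[b] = pos;
        }
        // Permute in place: swap each misplaced key into its own bucket.
        // Only the two small int arrays are allocated, no copy of the keys.
        for (int b = 0; b < BUCKETS; b++) {
            while (next[b] < end[b]) {
                int c = bucket(keys[next[b]], depth);
                if (c == b) {
                    next[b]++;
                } else {
                    byte[] tmp = keys[next[b]];
                    keys[next[b]] = keys[next[c]];
                    keys[next[c]] = tmp;
                    next[c]++;
                }
            }
        }
        // Bucket 0 holds keys that ended at `depth`; they are all equal, so skip it.
        for (int b = 1; b < BUCKETS; b++) {
            if (count[b] > 1) {
                sort(keys, end[b] - count[b], end[b], depth + 1);
            }
        }
    }

    private static void insertionSort(byte[][] keys, int from, int to, int depth) {
        for (int i = from + 1; i < to; i++) {
            byte[] key = keys[i];
            int j = i - 1;
            while (j >= from && compare(keys[j], key, depth) > 0) {
                keys[j + 1] = keys[j];
                j--;
            }
            keys[j + 1] = key;
        }
    }

    /** Compares from `depth` on; callers guarantee the first `depth` bytes are equal. */
    private static int compare(byte[] a, byte[] b, int depth) {
        int n = Math.min(a.length, b.length);
        for (int i = depth; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        String[] words = {"radix", "sort", "bytes", "ref", "hash", "byte", ""};
        byte[][] keys = new byte[words.length][];
        for (int i = 0; i < words.length; i++) {
            keys[i] = words[i].getBytes(StandardCharsets.UTF_8);
        }
        sort(keys);
        for (byte[] k : keys) {
            System.out.println(new String(k, StandardCharsets.UTF_8));
        }
    }
}
```

The in-place permutation trades some extra swaps for not allocating a second array proportional to the number of entries, which is the kind of memory-versus-speed trade-off the comment above describes for very large sorted sets.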