[ https://issues.apache.org/jira/browse/LUCENE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300126#comment-15300126 ]
Adrien Grand commented on LUCENE-7299:
--------------------------------------

Thanks Dawid for sharing your implementation and experience! It looks very similar to the patch, except for the redistribution logic and the fact that your implementation can parallelize in a ForkJoinPool. I tried replacing the redistribution logic out of curiosity, but performance was the same.

bq. when you're descending into same prefix blocks you can disregard those prefixes in comparisons

The patch already does this; I agree this is an important optimization.

bq. There is also a hook inside byte block list to allow you to retrieve a single byte at a given offset so there's no need to copy keys over and over again (.byteAt).

I think it should be fine with BytesRefHash, since it just returns a BytesRef that points to an internal structure rather than copying bytes. Adding a byteAt method might help optimize it further, but I'd rather not have to add APIs to BytesRefHash for now.

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>
>                 Key: LUCENE-7299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: ByteBlockListSorter.java, LUCENE-7299.patch
>
>
> Switching DocIdSetBuilder to radix sort helped make things significantly
> faster. We should be able to do the same with BytesRefHash.sort()?
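
For readers unfamiliar with the prefix-skipping optimization discussed in the comment, here is a minimal sketch of an MSD radix sort over plain byte[] keys. It is not the code from LUCENE-7299.patch or from ByteBlockListSorter.java; the class name PrefixSkippingRadixSort, the insertion-sort threshold, and the byteAt helper shown here are assumptions made for illustration. The point it shows is that once a bucket is entered at a given depth, all keys in it share their first `depth` bytes, so bucketing and comparisons only need to look at bytes from that offset onward.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and threshold are hypothetical, not Lucene APIs. */
public class PrefixSkippingRadixSort {
    // Below this size, fall back to insertion sort (assumed value, not tuned).
    private static final int INSERTION_THRESHOLD = 16;

    /** Returns the unsigned byte at the given offset, or -1 if the key ends before it. */
    static int byteAt(byte[] key, int offset) {
        return offset < key.length ? (key[offset] & 0xFF) : -1;
    }

    /** Sorts keys in unsigned lexicographic byte order. */
    public static void sort(byte[][] keys) {
        sort(keys, 0, keys.length, 0, new byte[keys.length][]);
    }

    // Sorts keys[from, to); every key in the range shares its first `depth` bytes.
    private static void sort(byte[][] keys, int from, int to, int depth, byte[][] scratch) {
        if (to - from <= INSERTION_THRESHOLD) {
            insertionSort(keys, from, to, depth);
            return;
        }
        // Bucket 0 holds keys that end at `depth`; bucket b + 1 holds keys with byte b there.
        int[] counts = new int[257];
        for (int i = from; i < to; i++) {
            counts[byteAt(keys[i], depth) + 1]++;
        }
        int[] starts = new int[257];
        int sum = from;
        for (int b = 0; b < 257; b++) {
            starts[b] = sum;
            sum += counts[b];
        }
        // Redistribute references (not key bytes) into the scratch array, then copy back.
        int[] next = starts.clone();
        for (int i = from; i < to; i++) {
            scratch[next[byteAt(keys[i], depth) + 1]++] = keys[i];
        }
        System.arraycopy(scratch, from, keys, from, to - from);
        // Keys in bucket 0 are all equal to the shared prefix, so only buckets 1..256 recurse.
        for (int b = 1; b < 257; b++) {
            if (counts[b] > 1) {
                sort(keys, starts[b], starts[b] + counts[b], depth + 1, scratch);
            }
        }
    }

    private static void insertionSort(byte[][] keys, int from, int to, int depth) {
        for (int i = from + 1; i < to; i++) {
            byte[] key = keys[i];
            int j = i - 1;
            while (j >= from && compareFrom(keys[j], key, depth) > 0) {
                keys[j + 1] = keys[j];
                j--;
            }
            keys[j + 1] = key;
        }
    }

    // Compares two keys that share their first `depth` bytes, skipping that shared prefix.
    static int compareFrom(byte[] a, byte[] b, int depth) {
        int limit = Math.min(a.length, b.length);
        for (int i = depth; i < limit; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[][] keys = { {2}, {1, 2}, {}, {1}, {(byte) 0xFF}, {1, 1} };
        sort(keys);
        for (byte[] key : keys) {
            System.out.println(Arrays.toString(key));
        }
    }
}
```

The sketch moves only array references during redistribution, which loosely mirrors the remark that BytesRefHash returns a BytesRef pointing at an internal structure rather than copying bytes. A real implementation over BytesRefHash would sort term ids instead of byte[] references.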