[ https://issues.apache.org/jira/browse/LUCENE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614654#comment-14614654 ]
Adrien Grand commented on LUCENE-6645:
--------------------------------------

Thanks for having a look!

bq. Not sure it will help that much, since most of the cells are visited via addAll...

Indeed, I did not try to optimize here since my profiler did not see it as a hotspot.

bq. Maybe we should try to contain the added hairiness to BKDTreeReader instead of DocIdSetBuilder if indeed this is the only user of this API that is so strange (or of tons of tiny sorted docID blocks)

I agree the API is a bit crazy now. :) It was a way to avoid checking the array length on every call to add(). I'll see how much it costs to remove this 'Adder' class in favour of a grow() method. However, adding tons of tiny sorted doc ID blocks is something that MultiTermQueries can do all the time too, so I'm quite happy that your benchmark revealed this inefficiency.

> BKD tree queries should use BitDocIdSet.Builder
> -----------------------------------------------
>
>                 Key: LUCENE-6645
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6645
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-6645.patch, LUCENE-6645.patch
>
>
> When I was iterating on BKD tree originally I remember trying to use this
> builder (which makes a sparse bit set at first and then upgrades to dense if
> enough bits get set) and being disappointed with its performance.
> I wound up just making a FixedBitSet every time, but this is obviously
> wasteful for small queries.
> It could be the perf was poor because I was always .or'ing in DISIs that had
> 512 - 1024 hits each time (the size of each leaf cell in the BKD tree)? I
> also had to make my own DISI wrapper around each leaf cell... maybe that was
> the source of the slowness, not sure.
> I also sort of wondered whether the SmallDocSet in spatial module (backed by
> a SentinelIntSet) might be faster ... though it'd need to be sorted in the
> end after building before returning to Lucene.
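For readers following the grow() idea in the comment above, here is a minimal sketch of how reserving capacity once per block can replace a length check on every add(). This is not the code from the attached patch and not Lucene's DocIdSetBuilder: the class name SketchDocIdBuffer and all of its methods are hypothetical, and the sparse-to-dense upgrade is left out entirely.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not Lucene's DocIdSetBuilder. */
final class SketchDocIdBuffer {
  private int[] docs;
  private int size;

  SketchDocIdBuffer(int initialCapacity) {
    docs = new int[Math.max(1, initialCapacity)];
  }

  /**
   * Reserve room for up to numDocs further add() calls.
   * The capacity check happens here, once per block of doc IDs.
   */
  SketchDocIdBuffer grow(int numDocs) {
    int needed = Math.addExact(size, numDocs);
    if (needed > docs.length) {
      // Grow at least geometrically so many tiny blocks stay cheap overall.
      docs = Arrays.copyOf(docs, Math.max(needed, docs.length * 2));
    }
    return this;
  }

  /** No explicit capacity check: the caller must have called grow() first. */
  void add(int doc) {
    docs[size++] = doc;
  }

  int size() {
    return size;
  }

  /** Copy of the buffered doc IDs in ascending order. */
  int[] toSortedArray() {
    int[] out = Arrays.copyOf(docs, size);
    Arrays.sort(out);
    return out;
  }
}
```

A caller visiting a leaf cell that holds three doc IDs would call buffer.grow(3) once and then add() three times. The trade-off is that add() is only safe after a matching grow(), which is the kind of API oddness the comment describes.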