[ https://issues.apache.org/jira/browse/LUCENE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623219#comment-14623219 ]
David Smiley commented on LUCENE-6645:
--------------------------------------

bq. When I looked at IntersectsRPTVerifyQuery, I saw it was using the produced bits so I thought it actually needed bit sets, but maybe it doesn't and we could just use advance()?

It doesn't need bit sets (random-access). I did a little playing around just now and saw DocIdSetBuilder plugged in easily except for isDefinitelyEmpty(). ...

bq. Regarding isDefinitelyEmpty, I'm wondering if we could keep the builders initially empty and then instantiate them on the first time that we need to add data? Then we could use a null check to know whether they have any content at all, would it work?

We could, but that's a _little_ more error prone (null check) & more code than simply having an isDefinitelyEmpty() method. In fact it would simply be isEmpty() for DocIdSetBuilder as it has a definitive answer. Nonetheless if you feel this method is somehow a bad idea then we can proceed with your suggestion.

RE RoaringDocIdSet -- that's very interesting; thanks for the background. Perhaps a comment in QueryBitSetProducer would clarify why this choice is made.

> BKD tree queries should use BitDocIdSet.Builder
> -----------------------------------------------
>
>                 Key: LUCENE-6645
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6645
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 5.3, Trunk
>
>         Attachments: LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch,
> LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch
>
>
> When I was iterating on BKD tree originally I remember trying to use this
> builder (which makes a sparse bit set at first and then upgrades to dense if
> enough bits get set) and being disappointed with its performance.
> I wound up just making a FixedBitSet every time, but this is obviously
> wasteful for small queries.
> It could be the perf was poor because I was always .or'ing in DISIs that had
> 512 - 1024 hits each time (the size of each leaf cell in the BKD tree)? I
> also had to make my own DISI wrapper around each leaf cell... maybe that was
> the source of the slowness, not sure.
> I also sort of wondered whether the SmallDocSet in spatial module (backed by
> a SentinelIntSet) might be faster ... though it'd need to be sorted in the
> end after building before returning to Lucene.
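
For readers following the thread, the two options discussed in the comment (an explicit isEmpty() on the builder versus creating the builder lazily and using a null check) can be sketched roughly as below. This is only an illustration under stated assumptions: SketchDocIdBuilder and LazyHolder are hypothetical names, not Lucene classes, and the sparse-to-dense cut-over (maxDoc >>> 7) is an assumed value rather than anything taken from DocIdSetBuilder.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical stand-in for a doc id set builder; NOT Lucene's DocIdSetBuilder. */
final class SketchDocIdBuilder {
    private final int maxDoc;
    private final int threshold;      // assumed cut-over point, for illustration only
    private int[] sparse = new int[8];
    private int sparseCount = 0;
    private BitSet dense;             // stays null until the sparse buffer is upgraded

    SketchDocIdBuilder(int maxDoc) {
        this.maxDoc = maxDoc;
        this.threshold = Math.max(1, maxDoc >>> 7);
    }

    void add(int docId) {
        if (dense == null && sparseCount == threshold) {
            upgradeToDense();
        }
        if (dense != null) {
            dense.set(docId);
            return;
        }
        if (sparseCount == sparse.length) {
            sparse = Arrays.copyOf(sparse, sparse.length * 2);
        }
        sparse[sparseCount++] = docId;
    }

    private void upgradeToDense() {
        dense = new BitSet(maxDoc);
        for (int i = 0; i < sparseCount; i++) {
            dense.set(sparse[i]);
        }
        sparse = null;
        sparseCount = 0;
    }

    /** Option 1: the builder answers definitively whether anything was added. */
    boolean isEmpty() {
        return dense == null && sparseCount == 0;
    }

    boolean isDense() {
        return dense != null;
    }

    /** Returns the collected doc ids in ascending order, without duplicates. */
    int[] buildSorted() {
        if (dense != null) {
            return dense.stream().toArray();
        }
        return Arrays.stream(sparse, 0, sparseCount).sorted().distinct().toArray();
    }
}

/** Option 2: create the builder on first add; "null" then means "no content". */
final class LazyHolder {
    private final int maxDoc;
    private SketchDocIdBuilder builder; // null until the first add()

    LazyHolder(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    void add(int docId) {
        if (builder == null) {
            builder = new SketchDocIdBuilder(maxDoc);
        }
        builder.add(docId);
    }

    boolean hasContent() {
        return builder != null;
    }
}
```

In this sketch, option 2 pushes a null check into every add() call site, while option 1 keeps the emptiness question inside the builder, which is the trade-off the comment describes.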