[ 
https://issues.apache.org/jira/browse/LUCENE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623219#comment-14623219
 ] 

David Smiley commented on LUCENE-6645:
--------------------------------------

bq. When I looked at IntersectsRPTVerifyQuery, I saw it was using the produced 
bits so I thought it actually needed bit sets, but maybe it doesn't and we could 
just use advance()?

It doesn't need bit sets (random-access).  I did a little playing around just 
now and saw DocIdSetBuilder plugged in easily except for isDefinitelyEmpty(). 
...

bq. Regarding isDefinitelyEmpty, I'm wondering if we could keep the builders 
initially null and then instantiate them the first time that we need to add 
data? Then we could use a null check to know whether they have any content at 
all. Would that work?

We could, but that's a _little_ more error prone (null check) & more code than 
simply having an isDefinitelyEmpty() method.  In fact it would simply be 
isEmpty() for DocIdSetBuilder as it has a definitive answer.  Nonetheless if 
you feel this method is somehow a bad idea then we can proceed with your 
suggestion.  
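
To make the two options concrete, here is a rough, self-contained sketch. 
SimpleDocIdCollector and LazyCollectorHolder are hypothetical stand-ins 
invented for illustration; they are not Lucene's DocIdSetBuilder and do not 
mirror its API. Option A asks the builder directly, while option B leaves the 
builder null until the first add and treats null as "no content".

```java
import java.util.Arrays;

// Hypothetical stand-in for a doc-ID builder; NOT Lucene's DocIdSetBuilder.
final class SimpleDocIdCollector {
    private int[] docs = new int[4];
    private int size = 0;

    void add(int docId) {
        if (size == docs.length) {
            docs = Arrays.copyOf(docs, size * 2); // grow when full
        }
        docs[size++] = docId;
    }

    // Option A: the builder itself gives a definitive emptiness answer.
    boolean isEmpty() {
        return size == 0;
    }

    // Returns the collected doc IDs in ascending order.
    int[] build() {
        int[] out = Arrays.copyOf(docs, size);
        Arrays.sort(out);
        return out;
    }
}

// Option B: the builder is created lazily; null means nothing was added.
// Hypothetical wrapper, shown only to contrast with option A.
final class LazyCollectorHolder {
    private SimpleDocIdCollector collector; // stays null until the first add

    void add(int docId) {
        if (collector == null) {
            collector = new SimpleDocIdCollector();
        }
        collector.add(docId);
    }

    // Every caller has to remember this null check.
    boolean hasContent() {
        return collector != null;
    }
}
```

With option B, each call site that touches the builder needs the null guard, 
which is the extra code and error-proneness mentioned above.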

RE RoaringDocIdSet -- that's very interesting; thanks for the background.  
Perhaps a comment in QueryBitSetProducer would clarify why this choice is made.

> BKD tree queries should use BitDocIdSet.Builder
> -----------------------------------------------
>
>                 Key: LUCENE-6645
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6645
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 5.3, Trunk
>
>         Attachments: LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch, 
> LUCENE-6645.patch, LUCENE-6645.patch, LUCENE-6645.patch
>
>
> When I was iterating on BKD tree originally I remember trying to use this 
> builder (which makes a sparse bit set at first and then upgrades to dense if 
> enough bits get set) and being disappointed with its performance.
> I wound up just making a FixedBitSet every time, but this is obviously 
> wasteful for small queries.
> It could be the perf was poor because I was always .or'ing in DISIs that had 
> 512 - 1024 hits each time (the size of each leaf cell in the BKD tree)?  I 
> also had to make my own DISI wrapper around each leaf cell... maybe that was 
> the source of the slowness, not sure.
> I also sort of wondered whether the SmallDocSet in spatial module (backed by 
> a SentinelIntSet) might be faster ... though it'd need to be sorted at the 
> end, after building, before returning to Lucene.
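
For readers unfamiliar with the "sparse at first, then upgrade to dense" idea 
in the description above, the following is a minimal sketch of the general 
technique only. SparseThenDenseBuilder, its threshold parameter, and the use 
of java.util.BitSet are all assumptions made for illustration; this is not 
the Lucene builder's actual implementation or upgrade policy.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical illustration of a sparse-then-dense doc-ID builder.
final class SparseThenDenseBuilder {
    private final int maxDoc;
    private final int threshold;       // assumed cutoff for upgrading to dense
    private int[] buffer = new int[16]; // sparse phase: unsorted, may hold duplicates
    private int count = 0;
    private BitSet dense;              // null while still in the sparse phase

    SparseThenDenseBuilder(int maxDoc, int threshold) {
        this.maxDoc = maxDoc;
        this.threshold = threshold;
    }

    void add(int docId) {
        if (dense == null && count >= threshold) {
            // Enough adds have accumulated: copy the buffer into a bit set.
            dense = new BitSet(maxDoc);
            for (int i = 0; i < count; i++) {
                dense.set(buffer[i]);
            }
            buffer = null;
        }
        if (dense != null) {
            dense.set(docId);
            return;
        }
        if (count == buffer.length) {
            buffer = Arrays.copyOf(buffer, count * 2);
        }
        buffer[count++] = docId;
    }

    boolean isDense() {
        return dense != null;
    }

    // Returns distinct doc IDs in ascending order, whichever phase we are in.
    int[] build() {
        if (dense != null) {
            return dense.stream().toArray();
        }
        // The sparse buffer must be sorted (and de-duplicated) before use.
        return Arrays.stream(buffer, 0, count).sorted().distinct().toArray();
    }
}
```

Note that the sparse path pays for a sort at build time, which is the same 
cost the description anticipates for a SentinelIntSet-backed set.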



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
