[ 
https://issues.apache.org/jira/browse/LUCENE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178653#comment-14178653
 ] 

Robert Muir commented on LUCENE-5441:
-------------------------------------

{quote}
require the cost to be provided explicitely, so that you can use the actual set 
cardinality if you already know it, or use cardinality() if it makes sense. 
This should help make better decisions when there are bitsets involved.
{quote}

+1 to fixing this. I would even make a default/sugar ctor that just forwards 
cardinality(). 

Currently, people are making the wrong tradeoffs. they are so afraid to call 
cardinality up-front a single time, and instead pay the price with bad 
execution over and over again for e.g. cached filters.


> Decouple DocIdSet from OpenBitSet and FixedBitSet
> -------------------------------------------------
>
>                 Key: LUCENE-5441
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5441
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/other
>    Affects Versions: 4.6.1
>            Reporter: Uwe Schindler
>             Fix For: Trunk
>
>         Attachments: LUCENE-5441.patch, LUCENE-5441.patch, LUCENE-5441.patch, 
> LUCENE-5441.patch
>
>
> Back from the times of Lucene 2.4 when DocIdSet was introduced, we somehow 
> kept the stupid "filters can return a BitSet directly" in the code. So lots 
> of Filters return just FixedBitSet, because this is the superclass (ideally 
> interface) of FixedBitSet.
> We should decouple that and *not* implement that abstract interface directly 
> by FixedBitSet. This leads to bugs e.g. in BlockJoin, because it used Filters 
> in a wrong way, just because it was always returning Bitsets. But some 
> filters actually don't do this.
> I propose to let FixedBitSet (only in trunk, because that a major backwards 
> break) just have a method {{asDocIdSet()}}, that returns an anonymous 
> instance of DocIdSet: bits() returns the FixedBitSet itsself, iterator() 
> returns a new Iterator (like it always did) and the cost/cacheable methods 
> return static values.
> Filters in trunk would need to be changed like that:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits;
> {code}
> gets:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits.asDocIdSet();
> {code}
> As this methods returns an anonymous DocIdSet, calling code can no longer 
> rely or check if the implementation behind is a FixedBitSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to