[ 
https://issues.apache.org/jira/browse/LUCENE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5441:
----------------------------------

    Attachment: LUCENE-5441.patch

New patch after LUCENE-5440 was committed. This includes the following extra 
features:
- FixedBitDocIdSet was added instead of the asDocIdSet() method.
- Removed the OpenBitSetDISI optimizations.
- The and/or/andNot/xor(DISI) methods are no longer available in FixedBitSet. 
Those are only available in FixedBitDocIdSet. This leads to some additional 
wrapping, or less wrapping, depending on where it was used before. I did not 
review all the automatic changes I made; there are surely some private method 
signatures in ChainedFilter/BooleanFilter that could be changed to reduce wrapping.
- I also optimized FixedBitDocIdSet.xor(DISI) to use bitwise XOR, if the 
iterator is a FixedBitSet one. This was missing in Shai's patch.
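
To illustrate the xor(DISI) fast path from the last bullet, here is a minimal sketch. It is not the code from the patch: it uses java.util.BitSet as a stand-in for FixedBitSet, and the names DocIterator, BitSetDocIterator, ArrayDocIterator and BitDocIdSetSketch are hypothetical. The idea is that a bitset-backed iterator that has not been advanced yet can be combined with a word-wise XOR, while any other iterator falls back to flipping one bit per document.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a doc-ID iterator (not Lucene's DocIdSetIterator). */
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int docID();   // -1 before the first nextDoc() call
    int nextDoc();
}

/** Iterator that is known to be backed by a bitset. */
final class BitSetDocIterator implements DocIterator {
    final BitSet bits;
    private int doc = -1;
    BitSetDocIterator(BitSet bits) { this.bits = bits; }
    @Override public int docID() { return doc; }
    @Override public int nextDoc() {
        if (doc == NO_MORE_DOCS) return doc;      // already exhausted
        int next = bits.nextSetBit(doc + 1);
        doc = (next < 0) ? NO_MORE_DOCS : next;
        return doc;
    }
}

/** Generic iterator over a sorted int array; exercises the slow path. */
final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int pos = -1;
    ArrayDocIterator(int... docs) { this.docs = docs; }
    @Override public int docID() {
        if (pos < 0) return -1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
    @Override public int nextDoc() {
        if (pos < docs.length) pos++;
        return docID();
    }
}

/** Assumed shape of a DocIdSet wrapper that owns the in-place set operations. */
final class BitDocIdSetSketch {
    final BitSet bits;
    BitDocIdSetSketch(BitSet bits) { this.bits = bits; }

    void xor(DocIterator it) {
        // Fast path: only valid if the iterator is bitset-backed and unpositioned.
        if (it instanceof BitSetDocIterator && it.docID() == -1) {
            bits.xor(((BitSetDocIterator) it).bits);   // word-wise XOR
            return;
        }
        // Slow path: flip one bit per document the iterator returns.
        for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            bits.flip(doc);
        }
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1);
        a.set(3);
        BitSet b = new BitSet();
        b.set(3);
        b.set(5);
        BitDocIdSetSketch set = new BitDocIdSetSketch(a);
        set.xor(new BitSetDocIterator(b));   // takes the fast path
        System.out.println(set.bits);
    }
}
```

Both paths should produce the same bits; the fast path only avoids the per-document loop.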

The current code does not change the DocIdSet abstract interface to support 
in-place and/or/... (especially as this is only supported by bitsets, but not 
the other DIS impls?!). I am also not yet happy with the current state of this 
DIS wrapping. In any case - FixedBitSet is now free of any DocIdSet uses! 
It's just a BitSet, nothing more - like the Long one.

> Decouple DocIdSet from OpenBitSet and FixedBitSet
> -------------------------------------------------
>
>                 Key: LUCENE-5441
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5441
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/other
>    Affects Versions: 4.6.1
>            Reporter: Uwe Schindler
>             Fix For: 5.0
>
>         Attachments: LUCENE-5441.patch, LUCENE-5441.patch, LUCENE-5441.patch
>
>
> Back from the times of Lucene 2.4 when DocIdSet was introduced, we somehow 
> kept the stupid "filters can return a BitSet directly" behavior in the code. 
> So lots of Filters return just FixedBitSet, because DocIdSet is the superclass 
> (ideally it would be an interface) of FixedBitSet.
> We should decouple them and *not* have FixedBitSet implement that abstract 
> interface directly. The coupling leads to bugs, e.g. in BlockJoin, because it 
> used Filters in a wrong way, just because they were always returning bitsets. 
> But some filters actually don't do this.
> I propose to let FixedBitSet (only in trunk, because that is a major backwards 
> break) just have a method {{asDocIdSet()}} that returns an anonymous 
> instance of DocIdSet: bits() returns the FixedBitSet itself, iterator() 
> returns a new Iterator (like it always did), and the cost/cacheable methods 
> return static values.
> Filters in trunk would need to be changed like this:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits;
> {code}
> becomes:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits.asDocIdSet();
> {code}
> As this method returns an anonymous DocIdSet, calling code can no longer 
> rely on or check whether the implementation behind it is a FixedBitSet.
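
The {{asDocIdSet()}} proposal in the quoted description can be sketched roughly as follows. This is only an illustration under assumptions, not Lucene code: DocIdSetSketch and PlainBitSet are hypothetical, simplified stand-ins for DocIdSet and FixedBitSet, java.util.BitSet provides the storage, and the "static" cost and cacheable values are guesses at what such a wrapper might return.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

/** Hypothetical, simplified stand-in for the DocIdSet abstraction. */
abstract class DocIdSetSketch {
    abstract PlainBitSet bits();                  // random-access view
    abstract PrimitiveIterator.OfInt iterator();  // new iterator on every call
    abstract long cost();
    abstract boolean isCacheable();
}

/** Hypothetical stand-in for FixedBitSet: a plain bitset that is NOT a DocIdSet. */
final class PlainBitSet {
    private final BitSet words = new BitSet();
    private final int numBits;

    PlainBitSet(int numBits) { this.numBits = numBits; }

    void set(int index) { words.set(index); }
    boolean get(int index) { return words.get(index); }

    /** Returns an anonymous view; callers cannot tell what implementation is behind it. */
    DocIdSetSketch asDocIdSet() {
        return new DocIdSetSketch() {
            @Override PlainBitSet bits() { return PlainBitSet.this; }
            @Override PrimitiveIterator.OfInt iterator() { return words.stream().iterator(); }
            @Override long cost() { return numBits; }          // assumed fixed estimate
            @Override boolean isCacheable() { return true; }   // assumed constant
        };
    }
}
```

A filter would then build its PlainBitSet and end with `return bits.asDocIdSet();`, so an instanceof check against the bitset class on the returned object fails by construction.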



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
