[
https://issues.apache.org/jira/browse/LUCENE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-5441:
----------------------------------
Attachment: LUCENE-5441.patch
New patch after LUCENE-5440 was committed. This includes the following extra
features:
- FixedBitDocIdSet was added instead of the asDocIdSet() method.
- Removed the OpenBitSetDISI optimizations.
- The and/or/andNot/xor(DISI) methods are no longer available in FixedBitSet.
Those are only available in FixedBitDocIdSet. This leads to more or less
wrapping, depending on where it was used before. I did not review all the
automatic changes I made; some private method signatures in
ChainedFilter/BooleanFilter could surely be changed to reduce wrapping.
- I also optimized FixedBitDocIdSet.xor(DISI) to use bitwise XOR if the
iterator is a FixedBitSet one. This was missing in Shai's patch.
The current code does not change the DocIdSet abstract interface to support
in-place and/or/... (especially as this is only supported by bitsets, but not
the other DIS impls?!). I am also not yet happy with the current state of this
DIS wrapping. In any case - FixedBitSet is now clean from any DocIdSet uses!
It's just a BitSet, nothing more - like the Long one.
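
As a rough illustration of the wrapping and the xor fast path described above, here is a minimal sketch that does not use the Lucene classes at all. java.util.BitSet stands in for FixedBitSet, and SimpleDocIdSet, DocIterator and BitSetIterator are hypothetical names invented for this example, so the real FixedBitDocIdSet in the patch may well look different. The sketch also assumes that the bitwise shortcut is only safe while the iterator has not been advanced yet.

```java
import java.util.BitSet;

// Hypothetical stand-in for a doc-id iterator; not the Lucene DocIdSetIterator.
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int docID();   // -1 before the first call to nextDoc()

    int nextDoc(); // next doc id, or NO_MORE_DOCS when exhausted
}

// Iterator over the set bits of a BitSet; keeps its backing bits visible
// so that a wrapper can take a bitwise fast path.
final class BitSetIterator implements DocIterator {
    final BitSet bits;
    private int doc = -1;

    BitSetIterator(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public int docID() {
        return doc;
    }

    @Override
    public int nextDoc() {
        if (doc == NO_MORE_DOCS) {
            return doc;
        }
        int next = bits.nextSetBit(doc + 1);
        doc = next < 0 ? NO_MORE_DOCS : next;
        return doc;
    }
}

// Hypothetical wrapper: the plain bitset knows nothing about doc-id sets,
// all iterator-based operations live here.
final class SimpleDocIdSet {
    private final BitSet bits;

    SimpleDocIdSet(BitSet bits) {
        this.bits = bits;
    }

    BitSet bits() {
        return bits;
    }

    DocIterator iterator() {
        return new BitSetIterator(bits);
    }

    // In-place XOR with all docs the iterator still has to deliver.
    void xor(DocIterator other) {
        if (other instanceof BitSetIterator && other.docID() == -1) {
            // Fast path: fresh bitset-backed iterator, XOR the bits directly.
            bits.xor(((BitSetIterator) other).bits);
        } else {
            // Slow path: flip one bit per remaining doc id.
            for (int doc = other.nextDoc();
                 doc != DocIterator.NO_MORE_DOCS;
                 doc = other.nextDoc()) {
                bits.flip(doc);
            }
        }
    }
}
```

Callers of such a wrapper would hand over an iterator and never see whether the bitwise or the per-document path was taken; only the wrapper checks what kind of iterator it received.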
> Decouple DocIdSet from OpenBitSet and FixedBitSet
> -------------------------------------------------
>
> Key: LUCENE-5441
> URL: https://issues.apache.org/jira/browse/LUCENE-5441
> Project: Lucene - Core
> Issue Type: Task
> Components: core/other
> Affects Versions: 4.6.1
> Reporter: Uwe Schindler
> Fix For: 5.0
>
> Attachments: LUCENE-5441.patch, LUCENE-5441.patch, LUCENE-5441.patch
>
>
> Ever since Lucene 2.4, when DocIdSet was introduced, we have somehow kept
> the stupid "filters can return a BitSet directly" behavior in the code. So
> lots of Filters just return a FixedBitSet, because DocIdSet is the
> superclass (ideally it would be an interface) of FixedBitSet.
> We should decouple the two and *not* have FixedBitSet implement that
> abstract interface directly. The coupling leads to bugs, e.g. in BlockJoin,
> which used Filters in a wrong way just because they always returned bitsets.
> But some filters actually don't do this.
> I propose to let FixedBitSet (only in trunk, because that is a major
> backwards break) just have a method {{asDocIdSet()}} that returns an
> anonymous instance of DocIdSet: bits() returns the FixedBitSet itself,
> iterator() returns a new Iterator (like it always did) and the
> cost/cacheable methods return static values.
> Filters in trunk would need to be changed like that:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits;
> {code}
> gets:
> {code:java}
> FixedBitSet bits = ....
> ...
> return bits.asDocIdSet();
> {code}
> As this method returns an anonymous DocIdSet, calling code can no longer
> rely on or check whether the implementation behind it is a FixedBitSet.