Hi,

Would it be OK to add one method to the Filter class
that returns the DocNrSkipper interface from Paul's
"Compact sparse Filter" in JIRA issue LUCENE-328?

This would be the first step towards:
- smooth integration of compact representations of the
underlying BitSet in Filter (VInt and sorted int[]).
They are often faster for and/or operations.
- a ChainedFilter enhancement (see the contrib from
Hoss) that operates on DocNrSkipper (see
And(Or)DocNrSkipper in Paul's work)
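
As a rough illustration of the "and" case on sorted int[]
data, here is a minimal sketch. The DocNrSkipper signature
used here (nextDocNr returning the next doc number >= the
argument, or -1) is an assumption and may differ from the
interface in LUCENE-328; SortedIntSkipper and AndSkipper
are hypothetical names made up for this example.

```java
import java.util.Arrays;

// Hypothetical interface; the real DocNrSkipper in LUCENE-328 may differ.
interface DocNrSkipper {
    /** Returns the smallest doc number >= docNr in the set, or -1 if none. */
    int nextDocNr(int docNr);
}

// Skipper over a sorted int[] of doc numbers (ascending, no duplicates).
final class SortedIntSkipper implements DocNrSkipper {
    private final int[] docs;

    SortedIntSkipper(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    public int nextDocNr(int docNr) {
        int i = Arrays.binarySearch(docs, docNr);
        if (i < 0) {
            i = -i - 1; // not found: convert to insertion point
        }
        return i < docs.length ? docs[i] : -1;
    }
}

// Intersection of two skippers: leapfrog until both agree on a doc number.
final class AndSkipper implements DocNrSkipper {
    private final DocNrSkipper a;
    private final DocNrSkipper b;

    AndSkipper(DocNrSkipper a, DocNrSkipper b) {
        this.a = a;
        this.b = b;
    }

    public int nextDocNr(int docNr) {
        int da = a.nextDocNr(docNr);
        while (da >= 0) {
            int db = b.nextDocNr(da);
            if (db < 0) {
                return -1;  // b is exhausted
            }
            if (db == da) {
                return da;  // both sets contain this doc
            }
            da = a.nextDocNr(db); // skip a forward to b's candidate
        }
        return -1; // a is exhausted
    }
}
```

The point of the skipping style is that a sparse operand
lets the other one jump ahead instead of scanning every bit.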

There are no compatibility problems: the only
requirement is that a BitSet is still constructed in the
bits() method, the same as today.
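
One way to read this is a default implementation that
wraps whatever bits() returns, so existing filters keep
working unchanged. The sketch below is only that, a sketch:
SketchFilter is a simplified stand-in for Filter (the real
bits() takes an IndexReader), and both getDocNrSkipper and
the DocNrSkipper signature are assumed names, not the
actual API.

```java
import java.util.BitSet;

// Hypothetical interface; the real DocNrSkipper in LUCENE-328 may differ.
interface DocNrSkipper {
    /** Returns the smallest doc number >= docNr in the set, or -1 if none. */
    int nextDocNr(int docNr);
}

// Simplified stand-in for Lucene's Filter (no IndexReader argument here).
abstract class SketchFilter {
    /** Existing contract: subclasses build a BitSet, as they do today. */
    abstract BitSet bits();

    /**
     * Assumed new method. The default wraps the BitSet, so old subclasses
     * need no changes; compact filters could override it.
     */
    DocNrSkipper getDocNrSkipper() {
        final BitSet bits = bits();
        return new DocNrSkipper() {
            public int nextDocNr(int docNr) {
                // nextSetBit returns -1 when no further bit is set
                return bits.nextSetBit(Math.max(docNr, 0));
            }
        };
    }
}
```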
 
The reasoning that justifies the effort in this direction
is that the distribution of tokens in a typical
collection is well suited to these 3 representations of
bit vectors (very low frequency tokens in a sorted int[],
very high frequency tokens in VInt, and the rest in
BitSet).
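
To make the VInt case concrete, here is a minimal sketch
that stores a sorted doc list as VInt-encoded gaps (7 data
bits per byte, high bit set when more bytes follow). The
class name VIntGaps and its methods are hypothetical and
not taken from Paul's patch.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper: sorted doc numbers stored as VInt-encoded gaps.
final class VIntGaps {

    /** Encodes ascending doc numbers as gaps from the previous doc. */
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocs) {
            int gap = doc - prev;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // low 7 bits, continuation flag
                gap >>>= 7;
            }
            out.write(gap); // last byte, high bit clear
            prev = doc;
        }
        return out.toByteArray();
    }

    /** Decodes count doc numbers written by encode. */
    static int[] decode(byte[] bytes, int count) {
        int[] docs = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += gap;
            docs[i] = prev;
        }
        return docs;
    }
}
```

Gaps below 128 take a single byte each, so the encoded size
depends on how close together the matching documents are.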

To put it another way, Filter forces us to use BitSet,
which is a rather inefficient way to store a few
documents from a big collection.

Any feedback is appreciated; it could easily be that I
have overlooked something essential.

Cheers, e.


        
        
                