[ 
https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913740#action_12913740
 ] 

J.J. Larrea commented on LUCENE-2649:
-------------------------------------

I only just waded through this thread, so apologies in advance if this is 
redundant or off-topic...

It seems to me that there could and should be a standalone enhancement to 
FieldCache/FCImpl to support Boolean-valued fields. 

Since there is no native array-of-bits in Java, it could have the signature:

    BitSet getBits(IndexReader reader, String field, BooleanParser parser)  
[implementation returning an OpenBitSet for efficiency]

A pre-supplied BooleanParser implementation, StringMatchBooleanParser, could 
map any one of a set of strings (matched case-insensitively) to true, and a 
default subclass, e.g. DefaultStringMatchBooleanParser, could supply 
{ "T", "TRUE", "1", "Y", "YES" } for the set of strings.  So the defaulted and 
typical case, getBits( ir, "field" ), would do what one typically expects of 
boolean-valued fields.
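
As a rough illustration of the parser idea above, here is a minimal sketch. 
It is not existing Lucene API: the BooleanParser interface, its parseBoolean 
method name, and the use of String instead of BytesRef for the term are all 
assumptions made to keep the example self-contained.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical interface; a real version would presumably take a BytesRef term.
interface BooleanParser {
    boolean parseBoolean(String term);
}

// Maps any one of a fixed set of strings, compared case-insensitively, to true.
class StringMatchBooleanParser implements BooleanParser {
    private final Set<String> trueValues = new HashSet<>();

    StringMatchBooleanParser(String... values) {
        for (String v : values) {
            // Normalize once at construction so lookups are a simple set probe.
            trueValues.add(v.toUpperCase(Locale.ROOT));
        }
    }

    @Override
    public boolean parseBoolean(String term) {
        return term != null && trueValues.contains(term.toUpperCase(Locale.ROOT));
    }
}

// Default subclass supplying the set of "true" strings listed above.
class DefaultStringMatchBooleanParser extends StringMatchBooleanParser {
    DefaultStringMatchBooleanParser() {
        super("T", "TRUE", "1", "Y", "YES");
    }
}
```

With this sketch, any term outside the supplied set (including a missing 
term) maps to false, which is what a defaulted getBits call would rely on.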

With that in place, then couldn't one simply define a parser that indicates 
value present for a docID regardless of what the term value is:

    public static final BooleanParser AlwaysReturnTrueBooleanParser =
        new BooleanParser() {
          public boolean parseBoolean(BytesRef term) { return true; }
        };

    BitSet getValueExists(IndexReader reader, String field) {
       return getBits( reader, field, AlwaysReturnTrueBooleanParser );
    }
 
Then a client (e.g. FieldComparator implementation) interested in ValueExists 
values could ask for them, and they would be independently cached from whatever 
other field type cache(s) were requested on that field by the same or different 
clients.  The only cost would be iterating the Term/docID iterators a second 
time (as for additional cache variants on the same field) - minor.
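
To make the "iterate the Term/docID iterators once more" step concrete, here 
is a sketch under stated assumptions. It does not use Lucene classes: a plain 
Map from term to docIDs stands in for the term/doc enumeration, 
java.util.BitSet stands in for OpenBitSet, and the BooleanParser interface 
and the ValueExistsSketch class are hypothetical.

```java
import java.util.BitSet;
import java.util.Map;

class ValueExistsSketch {
    // Hypothetical parser interface, as proposed in this comment.
    interface BooleanParser {
        boolean parseBoolean(String term);
    }

    // Parser that reports "value present" regardless of the term's content.
    static final BooleanParser ALWAYS_TRUE = term -> true;

    // postings maps each term of the field to the docIDs containing it;
    // this is a stand-in for walking the real term and doc iterators.
    static BitSet getBits(Map<String, int[]> postings, int maxDoc, BooleanParser parser) {
        BitSet bits = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            if (parser.parseBoolean(entry.getKey())) {
                for (int docID : entry.getValue()) {
                    bits.set(docID);
                }
            }
        }
        return bits;
    }

    // A doc has a value if it appears under any term of the field.
    static BitSet getValueExists(Map<String, int[]> postings, int maxDoc) {
        return getBits(postings, maxDoc, ALWAYS_TRUE);
    }
}
```

Docs that never appear under any term keep a cleared bit, so the result 
distinguishes "no value" from a default value in the other cached arrays.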

Does this make sense?

> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.
