[ 
https://issues.apache.org/jira/browse/LUCENE-3504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124789#comment-13124789
 ] 

Simon Willnauer commented on LUCENE-3504:
-----------------------------------------

Mike, let me explain my intention here. You are right that we used to do this 
here, but:
 * IDV is strictly dense storage, i.e. each document has a value; that is the 
basic assumption.
 * If you want a default value, you should specify it. If you don't specify it, 
we make a best effort to provide one for you.
 * Consistency is very important here: all variants return a value for every 
doc. For numerics it's 0 / 0.0; for bytes it's a BytesRef initialized with the 
default for the variant (var/fixed).
 * The null invariant forces users to do a check for every document, which makes 
no sense given the first assumption.
 * If you have a numeric value, you can't check for missing values since those 
values are primitives. Again, consistency.
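
To make the dense contract above concrete, here is a minimal Java sketch. 
The names (DenseValuesSketch, DenseLongs, DenseFixedBytes) are hypothetical 
and are not Lucene's API, and the zero-filled default for fixed-length bytes 
is an assumption made for illustration. It only shows that every document 
reads back some value, so callers never need a null check.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT Lucene's API: a dense per-document store where
// every document has a value, so readers never have to check for null.
final class DenseValuesSketch {

    /** Dense numeric values: documents that were never set read as 0. */
    static final class DenseLongs {
        private final long[] values;

        DenseLongs(int maxDoc) {
            this.values = new long[maxDoc]; // Java zero-fills new arrays
        }

        void set(int docID, long value) { values[docID] = value; }

        long get(int docID) { return values[docID]; }
    }

    /** Dense fixed-length bytes: unset documents read as zero-filled bytes (assumed default). */
    static final class DenseFixedBytes {
        private final byte[] data;
        private final int size;

        DenseFixedBytes(int maxDoc, int size) {
            this.data = new byte[maxDoc * size];
            this.size = size;
        }

        void set(int docID, byte[] value) {
            if (value.length != size) {
                throw new IllegalArgumentException("expected " + size + " bytes");
            }
            System.arraycopy(value, 0, data, docID * size, size);
        }

        /** Always fills target; there is no return value to check. */
        void getBytes(int docID, byte[] target) {
            System.arraycopy(data, docID * size, target, 0, size);
        }
    }

    public static void main(String[] args) {
        DenseLongs longs = new DenseLongs(3);
        longs.set(1, 42L);
        for (int doc = 0; doc < 3; doc++) {
            System.out.println("doc " + doc + " -> " + longs.get(doc));
        }

        DenseFixedBytes bytes = new DenseFixedBytes(3, 2);
        bytes.set(0, new byte[] {7, 9});
        byte[] scratch = new byte[2];
        bytes.getBytes(2, scratch); // doc 2 was never set
        System.out.println(Arrays.toString(scratch));
    }
}
```

The reader loop has no branch for "no value"; a document that was never set 
is indistinguishable from one that was explicitly set to the default, which is 
exactly the trade-off under discussion.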
 
I think we should not copy the behavior from FC here, for the above reasons. 
What we should rather do is make this absolutely clear: remove the return 
value from getBytes(BR) and document that the BR will always be filled. If you 
want some "missing value" behavior, you should make sure you add the 
right values. The sort missing last/first stuff seems like something born from 
the fact that we build FC by uninverting an indexed field, and IDV doesn't have 
this limitation.
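
As a rough illustration of "add the right values" for sort-missing-last, 
here is a small Java sketch. Everything in it (MissingValueSketch, MISSING, 
buildDense, sortDocs) is a hypothetical name, not Lucene API, and using 
Long.MAX_VALUE as the sentinel is an assumption that only works if real 
values never reach it.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch, NOT Lucene's API: the application writes an explicit
// sentinel for documents without a real value, so the stored data stays dense.
final class MissingValueSketch {

    // Assumed sentinel chosen by the application; it sorts after all real values.
    static final long MISSING = Long.MAX_VALUE;

    /** At index time, replace absent values (null) with the sentinel. */
    static long[] buildDense(Long[] raw) {
        long[] dense = new long[raw.length];
        for (int doc = 0; doc < raw.length; doc++) {
            dense[doc] = (raw[doc] == null) ? MISSING : raw[doc];
        }
        return dense;
    }

    /** Sort doc IDs ascending by value; sentinel docs end up last. */
    static Integer[] sortDocs(long[] dense) {
        Integer[] docs = new Integer[dense.length];
        for (int doc = 0; doc < docs.length; doc++) {
            docs[doc] = doc;
        }
        Arrays.sort(docs, Comparator.comparingLong(doc -> dense[doc]));
        return docs;
    }

    public static void main(String[] args) {
        long[] dense = buildDense(new Long[] {5L, null, 2L});
        System.out.println(Arrays.toString(sortDocs(dense))); // prints [2, 0, 1]
    }
}
```

For sort-missing-first the application would pick Long.MIN_VALUE instead; 
either way the choice is made at index time rather than by a null check at 
read time.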
                
> DocValues: deref/sorted bytes types shouldn't return empty byte[] when doc 
> didn't have a value
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3504
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3504
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>
> I'm looking at making a FieldComparator that uses DV's SortedSource to
> sort by string field (ie just like TermOrdValComparator, except using
> DV instead of FieldCache).  We already have comparators for int and
> float DV fields.
> But one thing I noticed is we can't detect documents that didn't have
> any value indexed vs documents that had empty byte[] indexed.
> This is easy to fix (and we used to do this), because these types are
> deref'd (ie, each doc stores an address, and then separately looks up
> the byte[] at that address), we can reserve ord/address 0 to mean "doc
> didn't have the field".  Then we should return null when you retrieve
> the BytesRef value for that field.


        
