[jira] [Commented] (LUCENE-6006) Replace FieldInfo.normsType with FieldInfo.hasNorms boolean

Michael McCandless (JIRA) Tue, 14 Oct 2014 03:48:07 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170764#comment-14170764
 ]


Michael McCandless commented on LUCENE-6006:
--------------------------------------------

I think that because 1) the default codec now handles sparse norms well, and 2) 
this is an "exotic" case, we should in fact just drop "hasNorms" in trunk.  
We must keep it in 5.x because 4.x indices have such segments.

bq. Relying on the codec to do sparse compression in such a case is a little 
awkward.

Well, we can't keep compromising Lucene's internal design for the "least common 
denominator" of codecs out there.  If you are one of the apps hitting this 
exotic use case, you'll need to use a codec that can sparse-encode your norms...

bq.  A .. much more difficult... alternative would be for merging to not set 
indexed=true unless it actually sees a posting.

Hmm, that's true ... but I think such an optimization is not worth the added 
code complexity for the exotic case where you indexed only some documents and 
then later deleted them all.

> Replace FieldInfo.normsType with FieldInfo.hasNorms boolean
> -----------------------------------------------------------
>
>                 Key: LUCENE-6006
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6006
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, Trunk
>
>         Attachments: LUCENE-6006.patch
>
>
> I came across this precursor while working on LUCENE-6005:
> I think FieldInfo.normsType can only be null (field did not index
> norms) or DocValuesType.NUMERIC (it did).  I'd like to simplify to
> just boolean hasNorms.
> This is a strange boolean, though: in theory it should be derived from
> {{indexed && omitNorms == false}}, but we have it for the exceptions
> case where every document in a segment hit an exception and never
> added norms.  I think this is the only reason it exists?  (In theory,
> such cases should result in 100% deleted segments, which IW should
> then drop ... but seems dangerous to "rely" on that).
> So I changed the indexing chain to just fill in the default (0) norms
> for all documents in such exceptional cases; this way going forward
> (starting with 5.0 indices) we really don't need this hasNorms.  But
> we still need it for pre-5.0 indices...
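
A rough sketch of the two ideas in the description above: deriving the flag
from {{indexed && omitNorms == false}}, and filling in the default (0) norm for
documents that never added one. This is illustration only and not the attached
patch. NormsSketch, hasNorms(boolean, boolean) and fillDefaultNorms are
hypothetical names, and norms are modeled as a plain long[] rather than
anything from Lucene's indexing chain.

```java
import java.util.Arrays;

// Hypothetical stand-in for the discussion; does not use Lucene's FieldInfo.
public class NormsSketch {

    // The derivation from the issue text: a field has norms when it is
    // indexed and norms were not omitted.
    static boolean hasNorms(boolean indexed, boolean omitNorms) {
        return indexed && omitNorms == false;
    }

    // Builds a dense per-document norms array for a segment of maxDoc documents.
    // docIds[i] recorded the norm values[i]; every other document (for example
    // one that hit an exception before adding a norm) keeps the default 0.
    static long[] fillDefaultNorms(int maxDoc, int[] docIds, long[] values) {
        long[] norms = new long[maxDoc]; // Java zero-initializes the array
        for (int i = 0; i < docIds.length; i++) {
            norms[docIds[i]] = values[i];
        }
        return norms;
    }

    public static void main(String[] args) {
        // Indexed field with norms enabled.
        System.out.println(hasNorms(true, false));
        // Segment where every document hit an exception: all norms default to 0.
        System.out.println(Arrays.toString(fillDefaultNorms(3, new int[0], new long[0])));
    }
}
```

With the defaults always filled in, the derived value and the stored flag
agree for newly written segments in this model, which is why the separate flag
only matters for older indices.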



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)