[
https://issues.apache.org/jira/browse/LUCENE-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186342#comment-13186342
]
Robert Muir commented on LUCENE-3687:
-------------------------------------
{quote}
Yeah, we can do that. I will look into it, but I am not sure if we should rather
let that patch bake in for a bit and then do that change in a second issue.
It would make debugging simpler if we run into problems.
{quote}
I agree, that case is crazy today and it shouldn't block or confuse the issue.
I just wanted us to have a plan for the file format. Otherwise there is no
point in writing the OMIT_NORMS bit in the fieldinfoswriter, because it could be
represented by a normValueType of 0.
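To illustrate the redundancy, here is a minimal sketch with made-up names and
values (FieldFlagsSketch, NORM_TYPE_NONE and the byte layout are assumptions for
illustration, not the actual field infos format): if a norm value type of 0
already means "no norms", a separate omit-norms bit carries no extra information.

```java
final class FieldFlagsSketch {
    // Hypothetical constants for illustration; not Lucene's actual on-disk values.
    static final byte OMIT_NORMS = 0x10;
    static final byte NORM_TYPE_NONE = 0; // assumed code for "no norm value type"

    /** Redundant layout: a separate omit-norms bit plus a norm type byte. */
    static byte[] encodeWithBit(boolean omitNorms, byte normType) {
        byte bits = omitNorms ? OMIT_NORMS : 0;
        return new byte[] { bits, omitNorms ? NORM_TYPE_NONE : normType };
    }

    /** Compact layout: only the norm type byte; 0 means the field has no norms. */
    static byte[] encodeTypeOnly(boolean omitNorms, byte normType) {
        return new byte[] { omitNorms ? NORM_TYPE_NONE : normType };
    }

    /** In the compact layout, omit-norms is derived from the type byte alone. */
    static boolean omitsNorms(byte[] compact) {
        return compact[0] == NORM_TYPE_NONE;
    }

    public static void main(String[] args) {
        System.out.println(omitsNorms(encodeTypeOnly(true, (byte) 3)));
        System.out.println(omitsNorms(encodeTypeOnly(false, (byte) 3)));
    }
}
```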
{quote}
I agree, maybe it's better to get this right in this patch already. I can still
move the stupid file checks removed in a second issue. But I should really
handle null types, i.e. Sims that don't set a value; currently I have tons of
asserts that enforce a value.
{quote}
This isn't a huge deal though, it's mostly just curiosity. Previously you
always had to return something; we didn't even have the option for a sim (like
basic tf * idf) to not encode any length normalization information. The way you
had to do that before was to return a bogus byte in computeNorm and ensure you
always did omitNorms for the field.
If it's tricky or messy, in my opinion we could even just add an assertion for
now and document "you must set something" in Similarity, because it's a lower
level API than it was in the previous release (most people would generally extend
higher level stuff like BM25Similarity, TFIDFSimilarity, or even
DefaultSimilarity, which do not expose this stuff).
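The two options can be sketched roughly as follows. Everything here is
hypothetical (the Norm holder, the Sim interface and the write methods are
stand-ins for illustration, not the API in the patch): a strict writer enforces
"you must set something", while a lenient one treats an unset value as "this
field has no norms". The strict path throws instead of using assert only so the
check fires without -ea.

```java
import java.util.Optional;

final class NormSketch {
    /** Hypothetical holder a similarity may fill in; null means "nothing set". */
    static final class Norm {
        private Byte value; // boxed so that "unset" is representable
        void setByte(byte b) { value = b; }
        boolean isSet() { return value != null; }
        byte get() { return value; }
    }

    /** Hypothetical low-level hook standing in for a similarity's norm computation. */
    interface Sim {
        void computeNorm(int fieldLength, Norm out);
    }

    // Toy encoding of field length in one byte (not Lucene's encoding).
    static final Sim LENGTH_NORMALIZED =
        (len, out) -> out.setByte((byte) Math.min(len, 127));

    // Plain tf * idf: no length normalization, so it sets nothing.
    static final Sim NO_NORMS = (len, out) -> { };

    /** Strict variant: enforce "you must set something". */
    static byte writeStrict(Sim sim, int fieldLength) {
        Norm n = new Norm();
        sim.computeNorm(fieldLength, n);
        if (!n.isSet()) {
            throw new IllegalStateException("Similarity must set a norm value");
        }
        return n.get();
    }

    /** Lenient variant: an unset norm means the field simply has no norms. */
    static Optional<Byte> writeLenient(Sim sim, int fieldLength) {
        Norm n = new Norm();
        sim.computeNorm(fieldLength, n);
        return n.isSet() ? Optional.of(n.get()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(writeStrict(LENGTH_NORMALIZED, 10));
        System.out.println(writeLenient(NO_NORMS, 10).isPresent());
    }
}
```

With the lenient variant, a sim like NO_NORMS no longer needs the old workaround
of returning a bogus byte and separately remembering to omit norms for the field.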
> Allow similarity to encode norms other than a single byte
> ---------------------------------------------------------
>
> Key: LUCENE-3687
> URL: https://issues.apache.org/jira/browse/LUCENE-3687
> Project: Lucene - Java
> Issue Type: New Feature
> Components: core/index, core/search
> Affects Versions: 4.0
> Reporter: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-3687.patch, LUCENE-3687.patch, LUCENE-3687.patch
>
>
> LUCENE-3628 cut over norms to docvalues. This removes the long-standing
> limitation that norms are a single byte. Yet, we still need to expose this
> functionality to Similarity to write / encode norms in a different format.