[ 
https://issues.apache.org/jira/browse/LUCENE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987824#action_12987824
 ] 

Robert Muir commented on LUCENE-1360:
-------------------------------------

The only issue I have with floatToByte52 is that it's a 'trap', so to speak:
if you use it on a too-long field (or maybe with a too-small boost), you end
up with a norm of zero.

In my opinion, the whole purpose of per-field support is so that you don't
have to make these sorts of tradeoffs, but I imagine someone could
use an inappropriate similarity/schema at some point (misconfiguration).

To degrade better in this case, I suggest this change, which decodes a norm
byte of 0 as if it were a norm byte of 1, so that scores won't be zeroed in
the misconfiguration case...

change:

{noformat}
  static {
    for (int i = 0; i < 256; i++)
      NORM_TABLE[i] = SmallFloat.byte52ToFloat((byte)i);
  }
{noformat}

to:

{noformat}
  static {
    NORM_TABLE[0] = SmallFloat.byte52ToFloat((byte)1);
    for (int i = 1; i < 256; i++)
      NORM_TABLE[i] = SmallFloat.byte52ToFloat((byte)i);
  }
{noformat}
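
For illustration only, here is a minimal, self-contained sketch of the effect
of that change. It does not use Lucene itself: the byte52ToFloat below is a
local stand-in (5 mantissa bits, 3 exponent bits, byte 0 decoding to 0.0f),
and the class name, table builders, and rawScore value are hypothetical names
chosen for this example. Treat it as a sketch under those assumptions rather
than the actual patch.

```java
final class NormTableSketch {

    // Stand-in decoder (assumption: 5 mantissa bits, 3 exponent bits,
    // zero exponent of 2). Byte 0 is decoded as exactly 0.0f.
    static float byte52ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 5);
        bits += (63 - 2) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Table built the original way: entry 0 decodes to 0.0f.
    static float[] originalTable() {
        float[] table = new float[256];
        for (int i = 0; i < 256; i++) {
            table[i] = byte52ToFloat((byte) i);
        }
        return table;
    }

    // Table built the suggested way: entry 0 reuses the decoding of byte 1,
    // so a stored norm byte of 0 no longer zeroes the score.
    static float[] patchedTable() {
        float[] table = new float[256];
        table[0] = byte52ToFloat((byte) 1);
        for (int i = 1; i < 256; i++) {
            table[i] = byte52ToFloat((byte) i);
        }
        return table;
    }

    public static void main(String[] args) {
        float rawScore = 2.5f;   // hypothetical score before the norm is applied
        byte storedNorm = 0;     // the misconfiguration case: norm stored as byte 0

        float before = rawScore * originalTable()[storedNorm & 0xff];
        float after = rawScore * patchedTable()[storedNorm & 0xff];

        System.out.println("original table: " + before);
        System.out.println("patched table:  " + after);
    }
}
```

Only entry 0 differs between the two tables; every other norm byte decodes
exactly as before, so correctly configured fields are unaffected.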


> A Similarity class which has unique length norms for numTerms <= 10
> -------------------------------------------------------------------
>
>                 Key: LUCENE-1360
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1360
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Query/Scoring
>            Reporter: Sean Timm
>            Assignee: Otis Gospodnetic
>            Priority: Trivial
>         Attachments: LUCENE-1360.patch, LUCENE-1380 visualization.pdf, 
> ShortFieldNormSimilarity.java
>
>
> A Similarity class which extends DefaultSimilarity and simply overrides 
> lengthNorm.  lengthNorm is implemented as a lookup for numTerms <= 10, else 
> as {{1/sqrt(numTerms)}}. This is to avoid term counts below 11 from having 
> the same lengthNorm after stored as a single byte in the index.
> This is useful if your search is only on short fields such as titles or 
> product descriptions.
> See mailing list discussion: 
> http://www.nabble.com/How-to-boost-the-score-higher-in-case-user-query-matches-entire-field-value-than-just-some-words-within-a-field-td19079221.html


