[ https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yubin Kim updated LUCENE-5221:
------------------------------
    Description:
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be encoded in the same way as {{TFIDFSimilarity}}. However, when {{discountOverlaps}} is {{false}}, what gets encoded is {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(length / boost)));}} rather than {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in {{SimilarityBase.computeNorm}}:

    final float numTerms;
    if (discountOverlaps)
      numTerms = state.getLength() - state.getNumOverlap();
    else
      numTerms = state.getLength() / state.getBoost();  // extra division by boost
    return encodeNormValue(state.getBoost(), numTerms);

  was:
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be encoded in the same way as {{TFIDFSimilarity}}. However, when {{discountOverlaps}} is {{false}}, what gets encoded is {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(length / boost)));}} rather than {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in {{SimilarityBase.computeNorm}}:

    numTerms = state.getLength() / state.getBoost();  // extra division by boost

> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---------------------------------------------------------------
>
>                 Key: LUCENE-5221
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5221
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.4
>            Reporter: Yubin Kim
>              Labels: normalize, search, similarity
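The discrepancy described above can be checked with a little arithmetic. The following is a minimal sketch, not Lucene code: the class {{NormSketch}} and its two methods are hypothetical names introduced for illustration, and the sketch compares only the float value that would be passed to {{SmallFloat.floatToByte315}}, leaving out the byte encoding itself.

```java
// Hypothetical illustration; none of these names exist in Lucene.
class NormSketch {

    // Value the Javadoc implies: boost / sqrt(length), as in TFIDFSimilarity.
    static float expectedNorm(float boost, int length) {
        return boost / (float) Math.sqrt(length);
    }

    // Value produced when discountOverlaps is false, following the reported
    // else-branch: the length is divided by boost before the square root.
    static float reportedNorm(float boost, int length) {
        float numTerms = length / boost; // the extra "/ state.getBoost()" term
        return boost / (float) Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        float boost = 4f;
        int length = 16;
        System.out.println(expectedNorm(boost, length)); // prints 1.0
        System.out.println(reportedNorm(boost, length)); // prints 2.0
    }
}
```

Under these assumptions the two values agree only when the boost is 1. For any other boost the reported branch yields boost^1.5 / sqrt(length), so the boost is effectively applied with exponent 3/2 instead of 1.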