: Simply trying to understand why these strings generated such scores, and as
: far as I can understand, the only difference between them is the field
: norms, as all the other results maintain themselves.
	...
: Well, if this is true, the field norm for my first document should be 0.5
: (1/sqrt(4)) as "Livro - IPAD - O Guia do Profissional" ends up with the
: terms "livro|ipad|guia|profissional" as tokens.
	...
: 3.6808658 = (MATCH) fieldWeight(itemName:ipad in 102507), product of:
:   1.0 = tf(termFreq(itemName:ipad)=1)
:   8.413407 = idf(docFreq=165, maxDocs=275239)
:   0.4375 = fieldNorm(field=itemName, doc=102507)

fieldNorms are encoded into a compact byte representation which loses
some precision...

http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Similarity.html#encodeNorm%28float%29

-Hoss
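
To make the precision loss concrete, here is a minimal sketch of the one-byte round trip. It is a standalone reimplementation of what I understand the Lucene 3.x encoding (SmallFloat.floatToByte315 and byte315ToFloat) to do, not code copied from your install, so treat the bit constants as an assumption. The class name NormPrecisionSketch and the helper storedNorm are made up for illustration, and storedNorm assumes the default length norm of 1/sqrt(numTerms) with no index-time boosts.

```java
// Hypothetical sketch: one-byte norm round trip, assumed to mirror
// Lucene 3.x SmallFloat.floatToByte315 / byte315ToFloat.
class NormPrecisionSketch {

    // Squeeze a float into a single byte by dropping its low 21 bits
    // and re-basing the exponent.
    static byte encodeNorm(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallFloat = bits >> (24 - 3);
        if (smallFloat <= ((63 - 15) << 3)) {
            // zero or negative -> 0, positive underflow -> smallest positive code
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallFloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow -> largest representable code
        }
        return (byte) (smallFloat - ((63 - 15) << 3));
    }

    // Expand the byte back into a float; the discarded bits come back as zeros.
    static float decodeNorm(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Norm as it would be read back from the index for a field with
    // numTerms terms (assumes lengthNorm = 1/sqrt(numTerms), boost = 1).
    static float storedNorm(int numTerms) {
        float exact = (float) (1.0 / Math.sqrt(numTerms));
        return decodeNorm(encodeNorm(exact));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 8; n++) {
            System.out.printf("terms=%d exact=%.7f stored=%.7f%n",
                    n, 1.0 / Math.sqrt(n), storedNorm(n));
        }
    }
}
```

If this sketch matches the real encoding, 0.5 survives the round trip unchanged, while 1/sqrt(5) (about 0.447) is read back as 0.4375. So the 0.4375 in the explain output might mean the field was indexed with five terms rather than four. That is only a guess from the arithmetic; checking what the analyzer actually emits for that field would confirm or rule it out.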