Scoring by document size

blopez Tue, 17 Sep 2013 04:42:31 -0700

Hi all,

I have some questions about the Solr scoring function. I'm using the default
configuration throughout, but I'm seeing a weird issue with the returned scores.


In the schema, I'll focus on the only field I'm interested in. Its
definition is:

<fieldType name="text" class="solr.TextField" sortMissingLast="true"
           omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>

<field name="myField" type="text" indexed="true" stored="true"
       required="false"/>

(I set omitNorms="false" because otherwise the document length is not taken
into account in the final score.)

Then, I index some documents, with the following text in the 'myField'
field:

doc1 = "A B C"
doc2 = "A B C D"
doc3 = "A B C D E"
doc4 = "A B C D E F"
doc5 = "A B C D E F G H"
doc6 = "A B C D E F G H I"

Finally, I run the query 'myField:("A" "B" "C")' to retrieve all the
documents, expecting a different score for each one (doc1 is more similar to
the query than doc2, which is more similar than doc3, and so on).

All the documents are retrieved (OK), but the scores are like this:

doc1 = 2.590214
doc2 = 2.590214  (same as doc1)
doc3 = 2.266437
doc4 = 1.94266
doc5 = 1.94266   (same as doc4)
doc6 = 1.618884
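
To make the pattern easier to see, here is a small sketch in plain Python (run outside Solr, using only the scores listed above) that expresses each score as a fraction of doc1's score. The variable names are mine and purely illustrative.

```python
# Scores exactly as reported in this message (decimal comma written as a point).
scores = {
    "doc1": 2.590214,
    "doc2": 2.590214,
    "doc3": 2.266437,
    "doc4": 1.94266,
    "doc5": 1.94266,
    "doc6": 1.618884,
}

# Express every score as a fraction of the top score (doc1).
top = scores["doc1"]
ratios = {name: round(score / top, 3) for name, score in scores.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

The ratios come out as 1, 1, 0.875, 0.75, 0.75 and 0.625, so always a multiple of 0.125. That would fit with the length norm being stored at low precision (as far as I know, Lucene's default similarity packs each field norm into a single byte), but I have not verified that this is what happens here.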

So in conclusion, the score does go down as the documents get longer, but not
the way I'd like. Doc1 gets the same score as Doc2, even though Doc1 matches
3/3 tokens and Doc2 matches 3/4 tokens.
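
To be concrete about the ordering I expect, here is a rough sketch in plain Python, independent of Solr. The helper `match_ratio` is a made-up name for illustration, not a Solr or Lucene function. It only mimics the whitespace tokenizer and lowercasing from the schema and computes the fraction of each document's tokens that match the query.

```python
# Documents as indexed in 'myField'.
docs = {
    "doc1": "A B C",
    "doc2": "A B C D",
    "doc3": "A B C D E",
    "doc4": "A B C D E F",
    "doc5": "A B C D E F G H",
    "doc6": "A B C D E F G H I",
}

# Query terms after lowercasing, as the query analyzer would produce them.
query_terms = {"a", "b", "c"}


def match_ratio(text, terms):
    """Hypothetical helper: share of the document's tokens found in the query."""
    tokens = text.lower().split()  # whitespace tokenizer + lowercase filter
    matched = sum(1 for token in tokens if token in terms)
    return matched / len(tokens)


ratios = {name: match_ratio(text, query_terms) for name, text in docs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

With this measure every document gets a distinct value that strictly decreases from doc1 to doc6, which is the kind of ranking I was hoping the Solr scores would reflect.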

Is this the normal Solr behaviour? Is there any way to get the behaviour I
expect?

Thanks a lot,
Borja.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Scoring-by-document-size-tp4090523.html
Sent from the Solr - User mailing list archive at Nabble.com.
