RE: Result scoring question

Armbrust, Daniel C. Wed, 14 Apr 2004 13:10:08 -0700

I should have remembered that.

Here are the explanations for the top 3 documents returned (their contents are quoted below):


3.3513687 = product of:
  6.7027373 = weight(preferred_designation:"renal calculus" in 48270), product of:
    0.8114604 = queryWeight(preferred_designation:"renal calculus"), product of:
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.04297941 = queryNorm
    8.260092 = fieldWeight(preferred_designation:"renal calculus" in 48270), product of:
      1.0 = tf(phraseFreq=1.0)
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.4375 = fieldNorm(field=preferred_designation, doc=48270)
  0.5 = coord(1/2)

2.8726017 = product of:
  5.7452035 = weight(preferred_designation:"renal calculus" in 514631), product of:
    0.8114604 = queryWeight(preferred_designation:"renal calculus"), product of:
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.04297941 = queryNorm
    7.080079 = fieldWeight(preferred_designation:"renal calculus" in 514631), product of:
      1.0 = tf(phraseFreq=1.0)
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.375 = fieldNorm(field=preferred_designation, doc=514631)
  0.5 = coord(1/2)

2.4832542 = product of:
  4.9665084 = weight(other_designation:"renal calculus" in 481129), product of:
    0.58440757 = queryWeight(other_designation:"renal calculus"), product of:
      13.5973835 = idf(other_designation: renal=8560 calculus=971)
      0.04297941 = queryNorm
    8.498364 = fieldWeight(other_designation:"renal calculus" in 481129), product of:
      1.0 = tf(phraseFreq=1.0)
      13.5973835 = idf(other_designation: renal=8560 calculus=971)
      0.625 = fieldNorm(field=other_designation, doc=481129)
  0.5 = coord(1/2) 
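
The arithmetic in these explanations can be checked by hand. The sketch below is only an illustration: it uses no Lucene classes, the names ExplainArithmetic and rawScore are made up for this example, and the inputs are the rounded values printed above, so results may differ in the last digits. It multiplies the factors the same way the explanation tree nests them, then divides by the top score, on the assumption that the hit list rescales scores so the best hit is 1.0 (which is consistent with the scores in the original message).

```java
// Sketch only: recomputes the scores above from the factors printed by
// Explanation.toString(). ExplainArithmetic and rawScore are hypothetical
// names introduced for this illustration; no Lucene classes are involved.
final class ExplainArithmetic {

    // score = queryWeight * fieldWeight * coord, where
    // queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm.
    static double rawScore(double idf, double queryNorm, double tf,
                           double fieldNorm, double coord) {
        double queryWeight = idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight * coord;
    }

    public static void main(String[] args) {
        double queryNorm = 0.04297941; // identical for all three documents
        double coord = 0.5;            // coord(1/2): one of two clauses matched

        // idf and fieldNorm are copied from the three explanations.
        double doc48270  = rawScore(18.88021,   queryNorm, 1.0, 0.4375, coord);
        double doc514631 = rawScore(18.88021,   queryNorm, 1.0, 0.375,  coord);
        double doc481129 = rawScore(13.5973835, queryNorm, 1.0, 0.625,  coord);

        System.out.printf("raw:        %.5f %.5f %.5f%n",
                doc48270, doc514631, doc481129);

        // Assumed normalization: divide every score by the top raw score.
        System.out.printf("normalized: %.5f %.5f %.5f%n",
                1.0, doc514631 / doc48270, doc481129 / doc48270);
    }
}
```

Because idf enters the product twice (once in queryWeight, once in fieldWeight), the higher idf in preferred_designation (18.88 versus 13.60) outweighs the higher fieldNorm of the shorter other_designation field (0.625 versus 0.4375). That is why document 481129 ranks third even though the phrase is its entire field.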


Is there anything I can do in my query construction to ensure that a document 
which exactly matches the query will be the top result?

Thanks, 

Dan


-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 14, 2004 12:17 PM
To: Lucene Users List
Subject: Re: Result scoring question

Try using IndexSearcher.explain (and then a toString on the resulting 
Explanation object) to see the details of why things are scoring how 
they are.  This can be most enlightening!

        Erik


On Apr 14, 2004, at 12:16 PM, Armbrust, Daniel C. wrote:

> I know that the Lucene scoring algorithm is pretty complicated, and I know 
> I don't understand all the pieces.  But given these documents:
>
> A) - <preferred_designation> left renal calculus
> B) - <other_designation> renal calculus
>
> Should a query of
>
> other_designation:("renal calculus") OR preferred_designation:("renal calculus")
>
> Score document B higher than document A?
>
> Those documents are a made-up example.  Here are the documents and 
> scores I am getting back from the query on my real index:
>
> Score 1.0 - Document<Text<first_word:left> 
> Text<preferred_designation:left renal calculus in calyceal 
> diverticulum> Unindexed<frequency:4> Text<codeTokenized:M00004001> 
> Keyword<code:M00004001> 
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:48270>>
>
> Score 0.85714287 - 
> Document<Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:514631> 
> Keyword<code:M00035214> Text<codeTokenized:M00035214> 
> Unindexed<frequency:4> Text<preferred_designation:left renal calculus 
> in a solitary left kidney> Text<first_word:left>>
>
> Score 0.7409672 - Document<Text<first_word:renal> 
> Text<other_designation:renal calculus> Unindexed<frequency:3> 
> Text<codeTokenized:M00032753> Keyword<code:M00032753> 
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:481129>>
>
>
> Am I just making a dumb mistake somewhere?
>
> Thanks,
>
> Dan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

