I should have remembered that. Here are the explanations for the top 3 documents returned (their contents are in the quoted message below):

3.3513687 = product of:
  6.7027373 = weight(preferred_designation:"renal calculus" in 48270), product of:
    0.8114604 = queryWeight(preferred_designation:"renal calculus"), product of:
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.04297941 = queryNorm
    8.260092 = fieldWeight(preferred_designation:"renal calculus" in 48270), product of:
      1.0 = tf(phraseFreq=1.0)
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.4375 = fieldNorm(field=preferred_designation, doc=48270)
  0.5 = coord(1/2)

2.8726017 = product of:
  5.7452035 = weight(preferred_designation:"renal calculus" in 514631), product of:
    0.8114604 = queryWeight(preferred_designation:"renal calculus"), product of:
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.04297941 = queryNorm
    7.080079 = fieldWeight(preferred_designation:"renal calculus" in 514631), product of:
      1.0 = tf(phraseFreq=1.0)
      18.88021 = idf(preferred_designation: renal=1111 calculus=37)
      0.375 = fieldNorm(field=preferred_designation, doc=514631)
  0.5 = coord(1/2)

2.4832542 = product of:
  4.9665084 = weight(other_designation:"renal calculus" in 481129), product of:
    0.58440757 = queryWeight(other_designation:"renal calculus"), product of:
      13.5973835 = idf(other_designation: renal=8560 calculus=971)
      0.04297941 = queryNorm
    8.498364 = fieldWeight(other_designation:"renal calculus" in 481129), product of:
      1.0 = tf(phraseFreq=1.0)
      13.5973835 = idf(other_designation: renal=8560 calculus=971)
      0.625 = fieldNorm(field=other_designation, doc=481129)
  0.5 = coord(1/2)

Is there anything that I can do in my query construction to ensure that if a query exactly matches a document, it will be the top result?
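
For comparing the three explanations side by side, here is a minimal Python sketch that recomputes each score from the factors listed (coord x queryWeight x fieldWeight). It uses only the numbers in the explain output; the function and variable names are made up for illustration and are not part of Lucene. It also divides each raw score by the top raw score, on the assumption that this is how the 1.0 / 0.857 / 0.741 scores in the quoted message were derived.

```python
# Factors copied from the three explain() outputs above.
QUERY_NORM = 0.04297941
COORD = 0.5  # coord(1/2): only one of the two OR clauses matched

docs = [
    # (doc id, field, idf, tf, fieldNorm)
    (48270, "preferred_designation", 18.88021, 1.0, 0.4375),
    (514631, "preferred_designation", 18.88021, 1.0, 0.375),
    (481129, "other_designation", 13.5973835, 1.0, 0.625),
]


def raw_score(idf, tf, field_norm):
    """Hypothetical helper: multiply the factors the way the explain tree nests them."""
    query_weight = idf * QUERY_NORM       # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    return COORD * query_weight * field_weight


raw = [raw_score(idf, tf, fn) for _, _, idf, tf, fn in docs]

# Assumption: the reported scores are the raw scores divided by the top raw score.
normalized = [s / raw[0] for s in raw]

for (doc_id, field, *_), s, n in zip(docs, raw, normalized):
    print(f"{doc_id:>7} {field:<22} raw={s:.4f} normalized={n:.4f}")
```

Since idf enters the product twice, the lower idf of the other_designation field (13.6 versus 18.9) appears to outweigh the higher fieldNorm (0.625) of the exact-match document 481129, which would account for it ranking third.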

Thanks,

Dan

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 14, 2004 12:17 PM
To: Lucene Users List
Subject: Re: Result scoring question

Try using IndexSearcher.explain (and then a toString on the resulting Explanation object) to see the details of why things are scoring how they are. This can be most enlightening!

Erik

On Apr 14, 2004, at 12:16 PM, Armbrust, Daniel C. wrote:

> I know that the lucene scoring algorithm is pretty complicated, I know
> I don't understand all the pieces. But given these documents:
>
> A) - <preferred_designation> left renal calculus
> B) - <other_designation> renal calculus
>
> Should a query of
>
> other_designation:("renal calculus") OR preferred_designation:("renal
> calculus")
>
> Score document B higher than document A?
>
> Those documents are a made up example. Here are the documents and
> scores I am getting back from the query on my real index:
>
> Score 1.0 - Document<Text<first_word:left>
> Text<preferred_designation:left renal calculus in calyceal
> diverticulum> Unindexed<frequency:4> Text<codeTokenized:M00004001>
> Keyword<code:M00004001>
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:48270>>
>
> Score 0.85714287 -
> Document<Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:514631>
> Keyword<code:M00035214> Text<codeTokenized:M00035214>
> Unindexed<frequency:4> Text<preferred_designation:left renal calculus
> in a solitary left kidney> Text<first_word:left>>
>
> Score 0.7409672 - Document<Text<first_word:renal>
> Text<other_designation:renal calculus> Unindexed<frequency:3>
> Text<codeTokenized:M00032753> Keyword<code:M00032753>
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:481129>>
>
>
> Am I just making a dumb mistake somewhere?
>
> Thanks,
>
> Dan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]