So, if you had something that tells you how many query terms matched,
would that satisfy your requirement?
Query: term1 term2
Doc1: term1 term2 => n=2 => 100%
Doc2: term1 term2 term3 term4 => n=2 => 100%
Doc3: term1 term1 term3 => n=1 => 50%
Doc4: term2 term3 term4 => n=1 => 50%
If yes, Explanation will give you that info in the coord part. For example,
coord(1/3) means that one query term matched out of a total of 3 query terms.
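To make the matched/total arithmetic concrete, here is a minimal sketch that
reproduces the n and percentage values from the Doc1..Doc4 example. It does
not use Lucene at all: TermCoverage, matchedTerms and coveragePercent are
hypothetical names, and the whitespace split is only a stand-in for whatever
analyzer you actually index with.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: plain term counting on
// whitespace-separated tokens (an assumption; a real index uses an Analyzer).
class TermCoverage {

    /** Number of distinct query terms that occur at least once in the document. */
    static int matchedTerms(List<String> queryTerms, String docText) {
        Set<String> docTerms = new HashSet<>(Arrays.asList(docText.trim().split("\\s+")));
        int n = 0;
        for (String term : new HashSet<>(queryTerms)) {
            if (docTerms.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Matched query terms as a percentage of all distinct query terms. */
    static double coveragePercent(List<String> queryTerms, String docText) {
        int total = new HashSet<>(queryTerms).size();
        return total == 0 ? 0.0 : 100.0 * matchedTerms(queryTerms, docText) / total;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("term1", "term2");
        String[] docs = {
            "term1 term2",
            "term1 term2 term3 term4",
            "term1 term1 term3",
            "term2 term3 term4"
        };
        for (int i = 0; i < docs.length; i++) {
            System.out.printf("Doc%d: n=%d => %.0f%%%n", i + 1,
                    matchedTerms(query, docs[i]), coveragePercent(query, docs[i]));
        }
    }
}
```

Term frequency is ignored on purpose (Doc3 contains term1 twice but still
counts as n=1), which matches the "how many query terms matched" idea rather
than the raw score.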
Here is an example Explanation:
0.013397463 = (MATCH) product of:
  0.040192388 = (MATCH) sum of:
    0.040192388 = (MATCH) weight(pagetext:para in 34930), product of:
      0.46250778 = queryWeight(pagetext:para), product of:
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.14552994 = queryNorm
      0.086901 = (MATCH) fieldWeight(pagetext:para in 34930), product of:
        1.0 = tf(termFreq(pagetext:para)=1)
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.02734375 = fieldNorm(field=pagetext, doc=34930)
  0.33333334 = coord(1/3)
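
One possible way to turn that coord entry into a percentage is to pull the
two numbers out of the Explanation text, for instance the string you get from
Explanation.toString(). The sketch below is only an illustration: CoordPercent
and fromExplanation are made-up names, and it works on plain text rather than
on Lucene objects. It returns -1 when no coord(m/n) entry is present; I
believe the coord line can be left out when every clause matched (and for
single-term queries), so treat a missing entry as "unknown" and check how
your Lucene version behaves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: reads the first coord(m/n) entry
// out of the textual form of an Explanation.
class CoordPercent {

    private static final Pattern COORD = Pattern.compile("coord\\((\\d+)/(\\d+)\\)");

    /** Returns matched/total as a percentage, or -1 if no coord(m/n) entry is found. */
    static double fromExplanation(String explanationText) {
        Matcher m = COORD.matcher(explanationText);
        if (!m.find()) {
            return -1.0;
        }
        int matched = Integer.parseInt(m.group(1));
        int total = Integer.parseInt(m.group(2));
        return total == 0 ? 0.0 : 100.0 * matched / total;
    }

    public static void main(String[] args) {
        String explanation =
                "0.013397463 = (MATCH) product of:\n"
              + "  0.040192388 = (MATCH) sum of:\n"
              + "  0.33333334 = coord(1/3)\n";
        System.out.printf("%.1f%% of the query terms matched%n",
                fromExplanation(explanation));
    }
}
```

Note that a nested BooleanQuery can produce more than one coord entry; this
sketch only looks at the first one it finds.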
--- On Mon, 1/3/11, Amr ElAdawy <[email protected]> wrote:
> From: Amr ElAdawy <[email protected]>
> Subject: Re: Search Score percentage, Should not be relative to the highest
> score
> To: [email protected]
> Date: Monday, January 3, 2011, 3:09 PM
>
> Consider the following.
>
> Query: term1 term2
> Doc1: term1 term2
> Doc2: term1 term2 term3 term4
> Doc3: term1 term1 term3
> Doc4: term3 term4
>
> For the above documents, Doc1 and Doc2 will be exact matches (as they contain
> all the terms in the search query). Doc3 is a partial match, as it contains
> term1 only (we neglect the term frequency, i.e. tf is always 1).
>
>
> The score percentage (calculated by Lucene in Hits.java, line 133) will be:
>
> Doc1: 100%
> Doc2: 100%
> Doc3: 80%
>
> This is not a problem at all. The problem occurs when there is no exactly
> matching document, as in the following:
>
> Query: term1 term2
> Doc1: term1 term3
> Doc2: term2 term3 term4
> Doc3: term1 term1 term3
> Doc4: term3 term4
>
>
> The score will be calculated as
>
> Doc1: 100%
> Doc2: 100%
> Doc3: 50%
>
> You can see that Doc1 and Doc2 got 100% even though they are not exact
> matches. But because they got the highest score, Lucene considers them a
> 100% match.
>
> This is my problem
>
> All I need is to make the percentage correct in the second case, so that it
> will be something like:
>
> Doc1: 50%
> Doc2: 50%
> Doc3: 30%
>
> I hope I made myself clear.
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Search-Score-percentage-Should-not-be-relative-to-the-highest-score-tp2183420p2184613.html
> Sent from the Lucene - Java Users mailing list archive at
> Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>