Hi Erick,
Thanks for your prompt reply.
Let me explain what I am doing.
I have a Lucene query that returns relevant results when I search
through the Hits object.
But when I run the same query through DocCollector (I want to do it this way
because I need to remove duplicate records at search time),
it returns results that are not relevant, although it prints the scores in
descending order.
Here is what I am doing in DocCollector:
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
public void collect(int doc, float score)
{
    // Load the stored document to read its "photoid" field
    Document document = reader.document(doc);
    String photoid = document.get("photoid");
    // Keep only the first document seen for each photoid
    if (!uniquelist.contains(photoid))
    {
        uniquelist.add(photoid);
        hq.insert(new ScoreDoc(doc, score));
        minScore = ((ScoreDoc)hq.top()).score; // maintain minScore
    }
}
public TopDocs topDocs() {
    ScoreDoc[] scoreDocs = new ScoreDoc[hq.size()];
    for (int i = hq.size()-1; i >= 0; i--) // put docs in array
        scoreDocs[i] = (ScoreDoc)hq.pop();
    float maxScore = (totalHits==0)
        ? Float.NEGATIVE_INFINITY
        : scoreDocs[0].score;
    return new TopDocs(totalHits, scoreDocs, maxScore);
}
public ArrayList getAllDocIds()
{
    // Collect the doc ids of the de-duplicated top hits as strings
    ArrayList docidlist = new ArrayList();
    TopDocs tc = topDocs();
    ScoreDoc[] scoredoc = tc.scoreDocs;
    for (int i = 0; i < scoredoc.length; i++)
    {
        docidlist.add(Integer.toString(scoredoc[i].doc));
    }
    return docidlist;
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Is this a proper way to remove duplicate records? If so, please let me
know where I am going wrong.
Note: In this case, I cannot handle duplicate records at index time.
Thanks.
Bhavin Pandya
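
One possible cause of the irrelevant results, offered as a guess rather than a diagnosis: the collect() above keeps the first document it sees for each photoid, and a collector is generally not handed documents in descending score order. A lower-scoring duplicate can therefore win over a better one. Below is a minimal sketch of keeping the best-scoring document per key instead. It is plain Java with no Lucene dependency; PhotoHit and BestPerKeyDedup are hypothetical names made up for this illustration, and the photoid is passed in directly rather than read from an index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a search hit: doc id, raw score, and its "photoid" value.
final class PhotoHit {
    final int doc;
    final float score;
    final String photoId;

    PhotoHit(int doc, float score, String photoId) {
        this.doc = doc;
        this.score = score;
        this.photoId = photoId;
    }
}

// Hypothetical collector that remembers the highest-scoring hit per photoid.
class BestPerKeyDedup {
    private final Map<String, PhotoHit> bestByKey = new HashMap<String, PhotoHit>();

    // Called once per matching document, in whatever order the search delivers them.
    void collect(int doc, float score, String photoId) {
        PhotoHit current = bestByKey.get(photoId);
        // Replace the stored hit only when the new one scores strictly higher.
        if (current == null || score > current.score) {
            bestByKey.put(photoId, new PhotoHit(doc, score, photoId));
        }
    }

    // Returns at most n hits, one per photoid, highest score first (ties broken by doc id).
    List<PhotoHit> top(int n) {
        List<PhotoHit> hits = new ArrayList<PhotoHit>(bestByKey.values());
        Collections.sort(hits, new Comparator<PhotoHit>() {
            public int compare(PhotoHit a, PhotoHit b) {
                int byScore = Float.compare(b.score, a.score);
                return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
            }
        });
        return new ArrayList<PhotoHit>(hits.subList(0, Math.min(n, hits.size())));
    }

    public static void main(String[] args) {
        BestPerKeyDedup dedup = new BestPerKeyDedup();
        dedup.collect(0, 0.2f, "p1"); // weak duplicate, seen first
        dedup.collect(1, 0.9f, "p2");
        dedup.collect(2, 1.7f, "p1"); // stronger duplicate, seen later
        for (PhotoHit h : dedup.top(10)) {
            System.out.println(h.photoId + " doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

In the small run in main, photoid "p1" ends up represented by doc 2 rather than doc 0, because doc 2 has the higher score. The trade-off is that every distinct key is held in memory until the search finishes, so this only suits result sets of moderate size.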
----- Original Message -----
From: "Erick Erickson" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>; "Bhavin Pandya" <[EMAIL PROTECTED]>
Sent: Thursday, July 19, 2007 7:21 PM
Subject: Re: Where exact score is getting calculate?
I don't think you can, using a HitCollector. If you used a TopDocs instead,
you would have access to the maximum score and could normalize the
scores to between 0 and 1, but I don't know if that suits your needs.
Erick
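
The normalization described above can be sketched roughly as follows. This is plain Java with no Lucene dependency; ScoreNormalizer is a hypothetical helper, and the choice to rescale only when the maximum score exceeds 1.0 is an assumption about how such normalization is usually done, not something confirmed in this thread.

```java
// Hypothetical helper: rescales raw scores using the maximum score of the result set.
class ScoreNormalizer {

    // Multiplies every score by 1/maxScore, but only when maxScore is above 1.0.
    // Scores that already fit within 0..1 are returned unchanged (assumed behavior).
    static float[] normalize(float[] rawScores, float maxScore) {
        float scoreNorm = 1.0f;
        if (rawScores.length > 0 && maxScore > 1.0f) {
            scoreNorm = 1.0f / maxScore;
        }
        float[] normalized = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            normalized[i] = rawScores[i] * scoreNorm;
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Raw scores as they might come back from topDocs(), highest first.
        float[] raw = {4.0f, 2.0f, 1.0f};
        float[] scaled = normalize(raw, raw[0]);
        for (int i = 0; i < scaled.length; i++) {
            System.out.println(raw[i] + " -> " + scaled[i]);
        }
    }
}
```

With a real TopDocs, the array would be filled from scoreDocs[i].score and maxScore would come from the TopDocs itself. Note that the result depends on the maximum score of that particular result set, so normalized scores from different queries are not comparable.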
On 7/19/07, Bhavin Pandya <[EMAIL PROTECTED]> wrote:
Hi,
The score I am getting in DocCollector is the raw score, which is not
necessarily between 0 and 1.
Where exactly does Lucene calculate the final score? And
what if I want the final score in DocCollector? How can I get it?
Regards.
Bhavin Pandya