Hi Erick,
Thanks for your prompt reply.

Let me explain what I am doing.

I have a Lucene query that returns relevant results when I search through the Hits object. But when I run the same query through my DocCollector (I want to do it this way because I need to remove duplicate records at search time), the results are not relevant, even though the scores are printed in descending order.

Here is what I am doing in DocCollector:

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
public void collect(int doc, float score)
{
   // load the stored document so the photoid field can be read
   Document document = reader.document(doc);
   String photoid = document.get("photoid");
   // keep only the first document seen for each photoid
   if (!uniquelist.contains(photoid))
   {
       uniquelist.add(photoid);
       hq.insert(new ScoreDoc(doc, score));
       minScore = ((ScoreDoc)hq.top()).score; // maintain minScore
   }
}

public TopDocs topDocs() {

   ScoreDoc[] scoreDocs = new ScoreDoc[hq.size()];
   for (int i = hq.size()-1; i >= 0; i--)      // put docs in array
     scoreDocs[i] = (ScoreDoc)hq.pop();

   float maxScore = (totalHits==0)
     ? Float.NEGATIVE_INFINITY
     : scoreDocs[0].score;

   return new TopDocs(totalHits, scoreDocs, maxScore);
 }


public ArrayList getAllDocIds()
{
   ArrayList docidlist = new ArrayList();
   TopDocs tc = topDocs();
   ScoreDoc[] scoredoc = tc.scoreDocs;

   // copy the doc ids out in rank order, as strings
   for (int i = 0; i < scoredoc.length; i++)
   {
       docidlist.add(Integer.toString(scoredoc[i].doc));
   }
   return docidlist;
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Is this a proper way to find duplicate records? If so, please let me know where I am going wrong.
Note: in this case I cannot handle duplicate records at index time.
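
For comparison, here is a minimal sketch of search-time deduplication that does not depend on the order in which collect() sees documents. collect() is not called in score order, so "first document seen for a photoid" and "best-scoring document for a photoid" can differ. The sketch keeps the highest-scoring hit per photoid and sorts only at the end. It uses no Lucene classes: BestHitPerKey and Hit are hypothetical names, and the caller is assumed to have already read the photoid field for each document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Lucene: keeps the best-scoring hit per photoid. */
final class BestHitPerKey {

    /** Plain (doc id, photoid, score) triple standing in for a collected hit. */
    static final class Hit {
        final int doc;
        final String photoId;
        final float score;

        Hit(int doc, String photoId, float score) {
            this.doc = doc;
            this.photoId = photoId;
            this.score = score;
        }
    }

    private final Map<String, Hit> bestByPhotoId = new HashMap<String, Hit>();

    /** Called once per matching document, in whatever order documents are visited. */
    void collect(int doc, String photoId, float score) {
        Hit current = bestByPhotoId.get(photoId);
        // Replace the stored hit only when the new one scores strictly higher.
        if (current == null || score > current.score) {
            bestByPhotoId.put(photoId, new Hit(doc, photoId, score));
        }
    }

    /** Returns the surviving hits, highest score first, truncated to at most n. */
    List<Hit> top(int n) {
        List<Hit> hits = new ArrayList<Hit>(bestByPhotoId.values());
        Collections.sort(hits, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) {
                // Descending by score; ties broken by ascending doc id.
                int byScore = Float.compare(b.score, a.score);
                return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
            }
        });
        return hits.size() > n ? new ArrayList<Hit>(hits.subList(0, n)) : hits;
    }
}
```

The trade-off is memory: this holds one entry per distinct photoid rather than a fixed-size queue, which may or may not be acceptable for a large result set.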

Thanks.
Bhavin Pandya




----- Original Message -----
From: "Erick Erickson" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>; "Bhavin Pandya" <[EMAIL PROTECTED]>
Sent: Thursday, July 19, 2007 7:21 PM
Subject: Re: Where exact score is getting calculate?


I don't think you can, using a HitCollector. If you use TopDocs instead,
you have access to the maximum score and can normalize the
scores to between 0 and 1, but I don't know if that suits your needs.
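
The normalization step can be sketched without any Lucene classes. This is an illustration only: ScoreNormalizer and normalize are hypothetical names, and the rule used here (scale by 1/maxScore only when maxScore is above 1.0) is an assumption about what suits the use case, so check it against the Hits source for your Lucene version.

```java
/** Hypothetical helper, not part of Lucene: scales raw scores into the 0..1 range. */
final class ScoreNormalizer {

    /**
     * Multiplies every score by 1/maxScore when maxScore exceeds 1.0.
     * Scores are left unchanged when maxScore is already at most 1.0.
     */
    static float[] normalize(float[] rawScores, float maxScore) {
        float norm = (rawScores.length > 0 && maxScore > 1.0f) ? 1.0f / maxScore : 1.0f;
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * norm;
        }
        return out;
    }
}
```

With TopDocs, the maxScore argument would come from the maximum score it reports and rawScores from the score of each ScoreDoc.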

Erick

On 7/19/07, Bhavin Pandya <[EMAIL PROTECTED]> wrote:

Hi,

The score I am getting in DocCollector is the raw score, which is not
necessarily between 0 and 1.
Where exactly does Lucene calculate the final score? And
what if I want the final score in DocCollector? How can I get it?

Regards.
Bhavin Pandya



