I could be mistaken, but I think the earlier answer was right; a document 
with no terms matching has a score of 0, so you can assume that all 
documents NOT returned by the query have a score of 0. If you look at the 
scoring formula on this page, it is hard to see how you can get a negative 
score. 

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html



HAIDUC SONIA <[EMAIL PROTECTED]> wrote on 11/19/2007 01:25:13 PM:

> I am trying to order all the documents in the index according to 
> their similarity to a given query. I am interested in having a 
> complete list of *all* the documents in the index with their score. 
> From what I understood by reading some documentation, Lucene 
> internally assigns scores to all the documents in the index 
> according to their similarity to the query, but when returning the 
> hits, all the scores that are less than 0 are rounded to 0 and only 
> the documents with the score > 0 are returned as hits. But what I 
> would like to get is the list before this intermediate processing, 
> so the list of all the documents with their raw score. I am trying 
> to compare Lucene with LSI and for the comparison I want to do, I 
> need the entire list of documents. Is there a way that I can get 
> that with Lucene?
> I hope I explained it clearly this time. If you need more details let me 
know.
> 
> Thank you,
> Sonia
> 
> ----- Original Message ----
> From: Erick Erickson <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Monday, November 19, 2007 11:55:00 AM
> Subject: Re: Scoring for all the documents in the index relative to a 
query
> 
> 
> Could you explain a bit more what problem you're trying to solve?
> The reason I ask is that your question doesn't make sense to me,
> since I have no idea what you expect by the term "negative score".
> 
> My simplistic view has been that all the docs returned via Hits
> or HitCollector have scores > 0, and all the rest have scores of 0,
> and this view is supported by the explanation of
> HitCollector.collect
> 
> " Called once for every non-zero scoring document, with the
> document number and its score."
> 
> You might also get value from this page:
> http://lucene.apache.org/java/docs/scoring.html#Scoring
> 
> Best
> Erick
> 
> On Nov 19, 2007 11:05 AM, HAIDUC SONIA <[EMAIL PROTECTED]> wrote:
> 
> > Hi everyone,
> >
> > I am trying to obtain the score for each document in the index
>  relative to
> > a given query. For example, if I have the query "search file", I am
>  trying
> > to get the list of all documents in the index and their scores
>  relative to
> > the given query. I tried first using Hits, which gave me the
>  normalized
> > score. I thought that I don't see the whole list of documents and
>  their
> > scores because of the normalization, so I tried using HitsCollector.
>  But
> > even after using HitsCollector, I get the same number of matching
>  documents,
> > so the normalization didn't exclude documents because of negative
>  scoring.
> > Does Lucene actually compute the score for all the documents in the
>  index or
> > just for matching documents? I really need to have the scores for all
>  the
> > documents in the index relative to the query (even if negative), not
>  just
> > the ones that contain the query terms(this is what Lucene considers
> > "matching documents", right?). Is this possible using Lucene?
> >
> > I really appreciate your time and effort!
> > Thanks,
> > Sonia
> >
> >
> >
> >
> >
> >
> 
> 
____________________________________________________________________________________
> > Get easy, one-click access to your favorites.
> > Make Yahoo! your homepage.
> > http://www.yahoo.com/r/hs
> >
> 
> 
> 
> 
> 
> 
> 
> 
____________________________________________________________________________________
> Never miss a thing.  Make Yahoo your home page. 
> http://www.yahoo.com/r/hs

Reply via email to