I could be mistaken, but I think the earlier answer was right: a document that matches none of the query terms has a score of 0, so you can assume that every document NOT returned by the query has a score of 0. If you look at the scoring formula on this page, it is hard to see how you could get a negative score, since with the default Similarity each factor in it (tf, idf, coord, the norms) is non-negative.
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html

HAIDUC SONIA <[EMAIL PROTECTED]> wrote on 11/19/2007 01:25:13 PM:

> I am trying to order all the documents in the index according to
> their similarity to a given query. I am interested in having a
> complete list of *all* the documents in the index with their score.
> From what I understood by reading some documentation, Lucene
> internally assigns scores to all the documents in the index
> according to their similarity to the query, but when returning the
> hits, all the scores that are less than 0 are rounded to 0 and only
> the documents with the score > 0 are returned as hits. But what I
> would like to get is the list before this intermediate processing,
> so the list of all the documents with their raw score. I am trying
> to compare Lucene with LSI and for the comparison I want to do, I
> need the entire list of documents. Is there a way that I can get
> that with Lucene?
> I hope I explained it clearly this time. If you need more details let me know.
>
> Thank you,
> Sonia
>
> ----- Original Message ----
> From: Erick Erickson <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Monday, November 19, 2007 11:55:00 AM
> Subject: Re: Scoring for all the documents in the index relative to a query
>
> Could you explain a bit more what problem you're trying to solve?
> The reason I ask is that your question doesn't make sense to me,
> since I have no idea what you expect by the term "negative score".
>
> My simplistic view has been that all the docs returned via Hits
> or HitCollector have scores > 0, and all the rest have scores of 0,
> and this view is supported by the explanation of
> HitCollector.collect
>
> "Called once for every non-zero scoring document, with the
> document number and its score."
>
> You might also get value from this page:
> http://lucene.apache.org/java/docs/scoring.html#Scoring
>
> Best
> Erick
>
> On Nov 19, 2007 11:05 AM, HAIDUC SONIA <[EMAIL PROTECTED]> wrote:
>
> > Hi everyone,
> >
> > I am trying to obtain the score for each document in the index relative to
> > a given query. For example, if I have the query "search file", I am trying
> > to get the list of all documents in the index and their scores relative to
> > the given query. I tried first using Hits, which gave me the normalized
> > score. I thought that I don't see the whole list of documents and their
> > scores because of the normalization, so I tried using HitsCollector. But
> > even after using HitsCollector, I get the same number of matching documents,
> > so the normalization didn't exclude documents because of negative scoring.
> > Does Lucene actually compute the score for all the documents in the index or
> > just for matching documents? I really need to have the scores for all the
> > documents in the index relative to the query (even if negative), not just
> > the ones that contain the query terms (this is what Lucene considers
> > "matching documents", right?). Is this possible using Lucene?
> >
> > I really appreciate your time and effort!
> > Thanks,
> > Sonia
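
If the point above holds (unmatched documents simply score 0), the complete ranked list asked for in the original question could be built by padding the matches with zeros. The following is only a rough sketch in plain Java, not Lucene code: the class FullScoreList, the DocScore holder and the rankAll method are hypothetical names, and it assumes you have already gathered the matching (document number, score) pairs into a map, for example from HitCollector.collect, and that document numbers run from 0 to maxDoc - 1 with deletions ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: pads the matching documents
// with zero scores so that every document number appears in the ranking.
class FullScoreList {

    /** One (document number, score) pair. */
    static final class DocScore {
        final int doc;
        final float score;

        DocScore(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Returns one entry per document number in [0, maxDoc), sorted by
     * descending score. Documents absent from 'matched' get a score of 0.
     * Assumption: 'matched' holds the scores gathered for the query.
     */
    static List<DocScore> rankAll(Map<Integer, Float> matched, int maxDoc) {
        List<DocScore> all = new ArrayList<>(maxDoc);
        for (int doc = 0; doc < maxDoc; doc++) {
            Float s = matched.get(doc);
            all.add(new DocScore(doc, s == null ? 0f : s));
        }
        // List.sort is stable, so documents with equal scores (including
        // all the zero-score ones) stay in document-number order.
        all.sort(Comparator.comparingDouble((DocScore d) -> d.score).reversed());
        return all;
    }

    public static void main(String[] args) {
        Map<Integer, Float> matched = new HashMap<>();
        matched.put(3, 0.8f);   // made-up scores for illustration only
        matched.put(1, 0.25f);
        for (DocScore d : rankAll(matched, 5)) {
            System.out.println("doc " + d.doc + " score " + d.score);
        }
    }
}
```

Every non-matching document ties at 0, so the order among them carries no information; that may matter when comparing against a method such as LSI that gives each document a distinct similarity.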