Using a zero boost to prevent a term from affecting score

Tim Nufire Sun, 10 Jul 2005 20:19:31 -0700

Hello,

I am new to Lucene, so my apologies in advance if what I am trying to do does not make sense or has been discussed before. I searched the list archives but couldn't find an answer.

First, a bit of background. I have a collection of documents which are indexed by SourceID and Content. In the UI, documents are displayed in folders which map to SourceIDs, and by default all documents in a given source are displayed using a query like "+(source:1 source:2)". I also want to let users search for text in the Content and display results ranked by their Lucene score. Unfortunately, including the SourceID terms in my query affects the score I get back, which in the context of my app does not make sense. I have thought about turning the SourceID terms into a QueryFilter but couldn't figure out how to get Lucene to return all of the documents in the filtered collection, since empty queries are not allowed. As an alternative, I tried setting the boost on the SourceID terms to zero, which seems to work -- my queries look something like "+((source:1 source:2)^0.0) +content:google".
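To make the shape of that query concrete, here is a minimal sketch in plain Java that assembles the query string shown above. The class and method names (ZeroBoostQuery, build) are hypothetical and not part of Lucene; the sketch only produces text, on the assumption that the result is then handed to Lucene's query parser. It does no escaping of special query characters.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Lucene class): builds a query string of the form
// "+((source:1 source:2)^0.0) +content:google".
class ZeroBoostQuery {

    // filterField/ids form the required, zero-boosted group;
    // textField/text form the required clause that should drive the score.
    static String build(String filterField, List<String> ids,
                        String textField, String text) {
        if (ids == null || ids.isEmpty()) {
            throw new IllegalArgumentException("at least one id is required");
        }
        StringBuilder sb = new StringBuilder("+((");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(' ');  // optional clauses inside the group are space-separated
            }
            sb.append(filterField).append(':').append(ids.get(i));
        }
        // ^0.0 sets the boost of the whole group to zero
        sb.append(")^0.0) +").append(textField).append(':').append(text);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("source", Arrays.asList("1", "2"), "content", "google"));
    }
}
```

The source group stays required (the leading "+"), so it still restricts which documents match; only its boost is set to zero.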

So, my question is whether this approach is a supported method for getting the scorer to ignore a field in its calculations. If it is, then I may have found a bug in IndexSearcher.explain(), which returns "0.0 = match required" when asked to explain why a result got the score it did, despite the fact that a non-zero score was passed to my hit collector for that item. Tracing through the code, it looks like the IndexSearcher.explain() method is unhappy with a required clause having a zero score. Since the core search algorithms don't prevent this, I was surprised to see this in IndexSearcher.explain(). The other problem that I am having with the searcher.explain() method is that I can't pass it the DateFilter that I use on some of my queries. Since that filter affects the score for documents in the results, it would be nice if IndexSearcher.explain() were able to take the filter into account. This would also be a problem if I moved the SourceID terms into a filter, as I have been considering.


Any help or insight on this issue will be greatly appreciated!

Thanks,
Tim
