Hello,

I am new to Lucene, so my apologies in advance if what I am trying to do does not make sense or has been discussed before. I searched the list archives but couldn't find an answer.

First, a bit of background. I have a collection of documents which are indexed by SourceID and Content. In the UI, documents are displayed in folders which map to SourceIDs, and by default all documents in a given source are displayed using a query like "+(source:1 source:2)". I also want to let users search for text in the Content and display results ranked by their Lucene score. Unfortunately, including the SourceID terms in my query affects the score I get back, which does not make sense in the context of my app. I have thought about turning the SourceID terms into a QueryFilter, but couldn't figure out how to get Lucene to return all of the documents in the filtered collection, since empty queries are not allowed. As an alternative, I tried setting the boost on the SourceID terms to zero, which seems to work -- my queries look something like "+((source:1 source:2)^0.0) +content:google".
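
To make the shape of that zero-boost query concrete, here is a minimal sketch of how such a query string could be assembled. It is plain Java string handling and does not call any Lucene API; the class name SourceQuerySketch and the method buildQuery are hypothetical names used only for illustration, and the sketch assumes the content term needs no escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Lucene: it only builds the query text.
class SourceQuerySketch {

    // Builds a query string that requires one of the given source IDs
    // (with a boost of 0.0 on that group) and requires the content term.
    static String buildQuery(List<String> sourceIds, String contentTerm) {
        // Join the source clauses with spaces, e.g. "source:1 source:2".
        String sources = sourceIds.stream()
                .map(id -> "source:" + id)
                .collect(Collectors.joining(" "));
        // Wrap the source group in a required clause boosted to zero,
        // then add the required content clause.
        return "+((" + sources + ")^0.0) +content:" + contentTerm;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(Arrays.asList("1", "2"), "google"));
    }
}
```

The resulting string is what I would then hand to the query parser; the intent is that only the content clause contributes to the ranking.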

So, my question is whether this approach is a supported method for getting the scorer to ignore a field in its calculations. If it is, then I may have found a bug in IndexSearcher.explain(), which returns "0.0 = match required" when asked to explain why a result got the score it did, despite the fact that a non-zero score was passed to my hit collector for that item. Tracing through the code, it looks like the IndexSearcher.explain() method is unhappy with a required clause having a zero score. Since the core search algorithms don't prevent this, I was surprised to see this in IndexSearcher.explain().

The other problem that I am having with IndexSearcher.explain() is that I can't pass it the DateFilter that I use on some of my queries. Since that filter affects the score for documents in the results, it would be nice if IndexSearcher.explain() were able to take the filter into account. This would also be a problem if I moved the SourceID terms into a filter, as I have been considering.

Any help or insight on this issue will be greatly appreciated!

Thanks,
Tim
