Hi Peter,

On 11/06/2008 at 4:25 PM, Peter Keegan wrote:
> I've discovered another flaw in using this technique:
>
> (+contents:petroleum +contents:engineer +contents:refinery)
> (+boost:petroleum +boost:engineer +boost:refinery)
>
> It's possible that the first clause will produce a matching
> doc and none of the terms in the second clause are used to
> score that doc. Yet another reason to use BoostingTermQuery.

I think you could address this, without BTQ, using something like:

  boost:(+petroleum +engineer +refinery)
  (+contents:(+petroleum +engineer +refinery)
   +((*:* -boost:petroleum)
     (*:* -boost:engineer)
     (*:* -boost:refinery)))

The last three lines give you the set of documents that are missing
at least one of the terms in the "boost" field. The *:* thingy,
which stands for a MatchAllDocsQuery, is necessary to get all
documents that don't have a given term; Lucene's (sub-)query
document exclusion operation needs a non-empty set on which to
operate.

On 11/06/2008 at 1:08 PM, Peter Keegan wrote:
> Then, at search time, a query for "petroleum engineer" gets rewritten
> to: (+contents:petroleum +contents:engineer) (+boost:petroleum
> +boost:engineer). Note that the two clauses are OR'd so that a term that
> exists in both fields will get a higher weight in the 'boost' field.
> This works quite well at boosting documents with terms that exist in the
> boosted fields. However, it doesn't work properly if excluded terms are
> added, for example:
>
> (+contents:petroleum +contents:engineer -contents:drilling)
> (+boost:petroleum +boost:engineer -boost:drilling)
>
> If a document contains the term 'drilling' in the 'body'
> field, but not in the 'title' or 'city' field, a false hit occurs.

I think you could address this problem like this:

  +(boost:(+petroleum +engineer)
    (+contents:(+petroleum +engineer)
     +((*:* -boost:petroleum)
       (*:* -boost:engineer))))
  -contents:drilling

You don't have to include "-boost:drilling", because this condition
is entailed by "-contents:drilling" (presumably everything indexed
in "boost" is also indexed in "contents").

Steve
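
P.S. In case it helps to see the set logic spelled out, here is a
minimal Python sketch of which documents the rewritten queries above
should select. It does not touch Lucene and says nothing about
scoring. The four toy documents and the helper names (has_all,
missing_any, rewritten_query) are made up for illustration, and it
assumes every term in "boost" also appears in "contents".

```python
# Toy model, not Lucene: each document maps a field name to its set of terms.
# Assumption: every term in "boost" is also present in "contents".
DOCS = {
    "d1": {"contents": {"petroleum", "engineer", "refinery"},
           "boost": {"petroleum", "engineer", "refinery"}},
    "d2": {"contents": {"petroleum", "engineer", "refinery"},
           "boost": {"petroleum"}},
    "d3": {"contents": {"petroleum", "engineer"},
           "boost": set()},
    "d4": {"contents": {"petroleum", "engineer", "refinery", "drilling"},
           "boost": {"petroleum", "engineer", "refinery"}},
}


def has_all(docs, field, terms):
    """Docs matching field:(+t1 +t2 ...), i.e. containing every term."""
    return {d for d, fields in docs.items() if set(terms) <= fields[field]}


def missing_any(docs, field, terms):
    """Docs matching (*:* -field:t1) (*:* -field:t2) ..., OR'd together."""
    all_docs = set(docs)  # plays the role of *:* (match all documents)
    result = set()
    for term in terms:
        with_term = {d for d, fields in docs.items() if term in fields[field]}
        result |= all_docs - with_term  # everything minus docs having the term
    return result


def rewritten_query(docs, terms, excluded=()):
    """Return (boost clause hits, contents clause hits, overall hits)."""
    boost_clause = has_all(docs, "boost", terms)
    contents_clause = (has_all(docs, "contents", terms)
                       & missing_any(docs, "boost", terms))
    hits = boost_clause | contents_clause
    for term in excluded:  # the trailing -contents:term clauses
        hits -= {d for d, fields in docs.items() if term in fields["contents"]}
    return boost_clause, contents_clause, hits


if __name__ == "__main__":
    terms = ["petroleum", "engineer", "refinery"]
    for label, result in zip(("boost", "contents", "hits"),
                             rewritten_query(DOCS, terms)):
        print(label, sorted(result))
```

In this toy model the two clauses never match the same document: one
requires all of the terms in "boost", the other requires at least one
of them to be missing there.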