I have a situation where each document is described by a tag field containing
multiple tags. Tags come in pairs, so when one tag is added to the field, it
means that the opposite tag in the pair is rejected for the document. Tags are
also optional, so two documents may be described by different sets of tags.
When I match these documents, documents sharing the same tags should rank
higher and documents with opposite tags should rank lower, even lower than
documents that share only a small number of common tags. An example of this:

Document 1: Red, Big, Heavy, ...
Document 2: Red,        Heavy, ...
Document 3: Red, Small, ...

(Red/Green is a pair, Big/Small is a pair, Heavy/Light is a pair. There may
be many more pairs of tags. This is just an example.)

Then when I match a new document with "Red, Big", Document 1 should be at the
top, Document 2 in the middle, and Document 3 at the bottom. But I still want
Document 3 to show up in the results because it still matches on Red.
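
To make the desired ordering concrete, here is a minimal sketch in plain
Python. It is not Solr code and not a proposed implementation; the OPPOSITE
table, the score() function and the penalty value of 0.5 are all assumptions
made up for illustration.

```python
# Hypothetical pair table covering only the three example pairs.
OPPOSITE = {
    "Red": "Green", "Green": "Red",
    "Big": "Small", "Small": "Big",
    "Heavy": "Light", "Light": "Heavy",
}

def score(query_tags, doc_tags, penalty=0.5):
    """Add 1 per shared tag; subtract `penalty` per opposite tag (assumed weights)."""
    doc = set(doc_tags)
    total = 0.0
    for tag in query_tags:
        if tag in doc:
            total += 1.0               # document shares the tag
        elif OPPOSITE.get(tag) in doc:
            total -= penalty           # document carries the opposite tag
        # a tag that is simply absent contributes nothing
    return total

docs = {
    "Document 1": ["Red", "Big", "Heavy"],
    "Document 2": ["Red", "Heavy"],
    "Document 3": ["Red", "Small"],
}
query = ["Red", "Big"]

# Highest score first.
ranked = sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)
```

With these assumed weights the order is Document 1, Document 2, Document 3,
and Document 3 still has a positive score, so it would not be dropped.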

If I simply add the opposite tags to the query with a boost below 1 (searching
for "Red Big Small^0.1", e.g.), they still contribute positively to the final
score, so Document 3 ranks higher than Document 2.
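
A toy calculation shows why. This is only a rough sketch in plain Python that
assumes the score is a plain sum of the boosts of the matching terms; the real
Lucene scoring involves more factors, and additive_score() is a hypothetical
helper, not a Solr API.

```python
def additive_score(boosts, doc_tags):
    """Sum the boosts of the query terms present in the document (simplified model)."""
    doc = set(doc_tags)
    return sum(boost for tag, boost in boosts.items() if tag in doc)

# Query "Red Big Small^0.1": unboosted terms are assumed to have boost 1.0.
boosts = {"Red": 1.0, "Big": 1.0, "Small": 0.1}

docs = {
    "Document 1": ["Red", "Big", "Heavy"],
    "Document 2": ["Red", "Heavy"],
    "Document 3": ["Red", "Small"],
}

scores = {name: additive_score(boosts, tags) for name, tags in docs.items()}
order = sorted(scores, key=scores.get, reverse=True)
```

Under this simplified model Document 3 gets credit for both Red and Small,
which puts it ahead of Document 2, which matches only Red.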

If I use "-" on the opposite terms (fieldName: (Red Big) -fieldName:Small),
I lose Document 3 altogether.

What is the best strategy for implementing this? If nothing supports this out
of the box, where should I start if I need to modify the server itself?

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Penalize-certain-keywords-but-not-completely-forbid-them-tp3559425p3559425.html
Sent from the Solr - User mailing list archive at Nabble.com.
