I have an index with fields such as title and content.
If I search for, say, obama fly, documents that contain both obama and fly should get the highest scores, regardless of how many times each term occurs. This requirement applies to both the title and content fields.

My current implementation, a simple OR query, gives a high score to a document with many occurrences of 'obama' even if it contains no occurrence of 'fly'. Because the tf for 'obama' is high, the document is scored higher even though 'fly' is absent.

The expected behaviour is:

(a) documents containing both 'obama' and 'fly' should be scored highest, ordered by their tf;
(b) documents containing only one of the terms should still be scored, but lower than those matched in (a).

I tried overriding coord() in a custom Similarity implementation and boosting it when multiple terms match, but coord() also gets boosted when the same word matches in multiple fields (say, obama is present in both title: and content:).

I searched for solutions but have not found anything that discusses a similar requirement. I guess I am not using the right keywords.

Thanks,
Chandrakant K.

--
View this message in context: http://www.nabble.com/Query-which-gives-high-score-proportional-to-%27distinct-term-matches%27-tp24276724p24276724.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
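
To make the expected behaviour in (a) and (b) concrete, here is a minimal plain-Java sketch of the ranking rule. It does not use the Lucene API. The class DistinctTermRanking, the Doc holder, and the whitespace/punctuation tokenizer are all hypothetical names and simplifications introduced for illustration. Each query term is counted once per document no matter how many fields it matches in, and total tf is used only to break ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DistinctTermRanking {

    /** Hypothetical document holder with the two fields from the question. */
    static final class Doc {
        final String title;
        final String content;

        Doc(String title, String content) {
            this.title = title;
            this.content = content;
        }
    }

    /** Counts occurrences of a term in a text, using a naive lowercase tokenizer (an assumption). */
    static int termFrequency(String text, String term) {
        int count = 0;
        String wanted = term.toLowerCase();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns {distinctTermsMatched, totalTf} for a document.
     * A term matching in both title and content still counts as one distinct term.
     */
    static int[] score(Doc doc, List<String> queryTerms) {
        int distinct = 0;
        int totalTf = 0;
        for (String term : queryTerms) {
            int tf = termFrequency(doc.title, term) + termFrequency(doc.content, term);
            if (tf > 0) {
                distinct++;
            }
            totalTf += tf;
        }
        return new int[] {distinct, totalTf};
    }

    /** Orders documents by distinct terms matched (descending), then by total tf (descending). */
    static List<Doc> rank(List<Doc> docs, List<String> queryTerms) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort((a, b) -> {
            int[] sa = score(a, queryTerms);
            int[] sb = score(b, queryTerms);
            if (sa[0] != sb[0]) {
                return Integer.compare(sb[0], sa[0]);
            }
            return Integer.compare(sb[1], sa[1]);
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("obama", "fly");
        List<Doc> docs = Arrays.asList(
                new Doc("Obama Obama Obama", "obama speaks, obama travels"),
                new Doc("Obama to fly", "short note"));
        for (Doc d : rank(docs, query)) {
            int[] s = score(d, query);
            System.out.println(d.title + " distinct=" + s[0] + " tf=" + s[1]);
        }
    }
}
```

With this rule, a document that mentions 'obama' many times in both fields but never 'fly' still ranks below a document that contains each term once, which is the ordering described above.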