Re: Scoring on multi-valued fields
: The other would be to somehow control the scores of each id. So a document
: with 2 ids matching should be worth more than the document with only 1 id
: matching (this is how it works now), but a document with 7 ids matching
: shouldn't be worth more, or at least not a lot more, than a document that
: matches only 3 ids (this is not how it works).

This is all driven by the "coord factor" of the outermost BooleanQuery. You can provide a custom Similarity class that generates different values based on the field/number of clauses, or, if you are already generating the BooleanQuery via custom code (i.e. your own QParser or whatnot), you can override the Similarity there.

: The reason this would be ideal for us is that we don't have any control over
: how many ids will be in the query and we don't want documents that have lots
: of ids to have an unnatural advantage over those with just a few.

If you put 'omitNorms="false"' on the field in question, then the length normalization (which rewards shorter documents) should help offset this -- no custom code required.

-Hoss
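To make the coord factor concrete, here is a toy model in Python. It is not Lucene code: `default_coord` mirrors the overlap / maxOverlap ratio that Lucene's default Similarity uses, while `damped_coord`, its `cap` parameter, and `total_score` are hypothetical names invented for this sketch. It also assumes every matching id clause scores the same, which real scoring will not guarantee.

```python
def default_coord(overlap: int, max_overlap: int) -> float:
    """Fraction of query clauses that matched (overlap / maxOverlap)."""
    return overlap / max_overlap


def damped_coord(overlap: int, max_overlap: int, cap: int = 3) -> float:
    """Hypothetical replacement: cancel the benefit of matches beyond `cap`.

    max_overlap is accepted only to keep the same call shape; it is unused.
    """
    if overlap == 0:
        return 0.0
    return min(overlap, cap) / overlap


def total_score(clause_scores, max_overlap, coord) -> float:
    """Sum of the matching clause scores, multiplied by the coord factor."""
    return sum(clause_scores) * coord(len(clause_scores), max_overlap)


if __name__ == "__main__":
    for matched in (1, 2, 3, 7):
        scores = [1.0] * matched  # assumption: each matching id scores 1.0
        print(matched,
              total_score(scores, 7, default_coord),
              total_score(scores, 7, damped_coord))
```

With the default ratio, every extra matching id raises the score twice over (a larger sum and a larger coord). The damped variant still ranks 2 matches above 1 but treats 3 and 7 matches about the same, which is the behaviour asked for in the quoted text.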
Re: Scoring on multi-valued fields
Well, that does take care of some cases. How about if we still want a hit on a tag to contribute to the weight, though? There would be 2 options. One is the one I described in the original post, which is to grab the highest score of a set of ids.

The other would be to somehow control the scores of each id. So a document with 2 ids matching should be worth more than the document with only 1 id matching (this is how it works now), but a document with 7 ids matching shouldn't be worth more, or at least not a lot more, than a document that matches only 3 ids (this is not how it works).

The reason this would be ideal for us is that we don't have any control over how many ids will be in the query and we don't want documents that have lots of ids to have an unnatural advantage over those with just a few.

--
View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020504.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Scoring on multi-valued fields
On Tue, Aug 3, 2010 at 3:16 PM, oleg.gnatovskiy wrote:
>
> Sorry guess I messed up my example query. The query should look like this:
>
> name:pizza AND id:(10 OR 20 OR 30)
>
> Thus if I do name:pizza^10 AND id:(10 OR 20 OR 30)^0 wouldn't a document
> that has all the ids (10, 20, and 30) still come up higher than a document
> that has just one?

No, because the whole id:(10 OR 20 OR 30)^0 clause will contribute 0 to the final score.

Another way to get the same effect would be to pull it out as a filter:
q=name:pizza&fq=id:(10 OR 20 OR 30)

-Yonik
http://www.lucidimagination.com
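As a rough sketch of sending the filter form from client code, the snippet below URL-encodes the q and fq parameters using only the Python standard library. The host, port, and /solr/select path are assumptions about a local default setup, and `build_select_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, main_query: str, filter_query: str) -> str:
    """Return a select URL with the scored query in q and the id filter in fq."""
    params = {"q": main_query, "fq": filter_query}
    # urlencode escapes ':', '(' and ')' and turns spaces into '+'
    return base_url + "?" + urlencode(params)


# Assumed endpoint; adjust to wherever the Solr instance actually lives.
url = build_select_url(
    "http://localhost:8983/solr/select",
    "name:pizza",
    "id:(10 OR 20 OR 30)",
)

if __name__ == "__main__":
    print(url)
```

Because the ids sit in fq, they only restrict which documents are returned, and the ranking comes from name:pizza alone.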
Re: Scoring on multi-valued fields
Sorry, I guess I messed up my example query. The query should look like this:

name:pizza AND id:(10 OR 20 OR 30)

Thus if I do name:pizza^10 AND id:(10 OR 20 OR 30)^0, wouldn't a document that has all the ids (10, 20, and 30) still come up higher than a document that has just one?

--
View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020234.html
Re: Scoring on multi-valued fields
On Tue, Aug 3, 2010 at 2:42 PM, oleg.gnatovskiy wrote:
>
> Oh sorry guys, I didn't correctly submit my original post to the mailing
> list. The original message was this:
> "
> Hello all. We are having some trouble with queries similar to the type shown
> below:
>
> name: pizza OR (id:10 OR id:20 OR id:30) (id is a multi-valued field)
>
> With the above query, we will always get documents with pizza in the name,
> and any document with id values of 10, 20, and 30 will always come up first.
> What we would like is to have a document with only id 10 to be weighted the
> same as a document with ids 10, 20, and 30.

How do you want pizza weighted against 10, 20, or 30? If pizza can always come first, you can boost the second clause to zero:
pizza OR (id:10 OR id:20 OR id:30)^0

> What happens is that the sums of all the hits on ID are added up. Is there a
> way to only grab the first score?

There is a way to grab only the highest score from a set of options (DisjunctionMaxQuery) but unfortunately there is no general query parser syntax to support that yet.

-Yonik
http://www.lucidimagination.com
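The difference between summing the id clauses and taking only the best one can be sketched with a toy model in Python. This is not Lucene code: `boolean_or_score` and `dismax_score` are hypothetical names, the coord factor is left out, and the per-clause score of 0.8 is made up. The max-plus-tie-breaker formula follows how DisjunctionMaxQuery is documented to combine its sub-scores.

```python
def boolean_or_score(clause_scores) -> float:
    """BooleanQuery-style combination: add up every matching clause."""
    return sum(clause_scores)


def dismax_score(clause_scores, tie_breaker: float = 0.0) -> float:
    """Best clause score, plus tie_breaker times the sum of the others."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie_breaker * (sum(clause_scores) - best)


if __name__ == "__main__":
    one_id = [0.8]               # document matching only id:10
    three_ids = [0.8, 0.8, 0.8]  # document matching id:10, id:20 and id:30
    print(boolean_or_score(one_id), boolean_or_score(three_ids))
    print(dismax_score(one_id), dismax_score(three_ids))
```

With a tie breaker of 0, a document matching one id and a document matching all three get the same contribution from the id part. A small non-zero tie breaker would give extra matches a slight edge instead of the full sum.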
Re: Scoring on multi-valued fields
Oh sorry guys, I didn't correctly submit my original post to the mailing list. The original message was this:

"
Hello all. We are having some trouble with queries similar to the type shown below:

name: pizza OR (id:10 OR id:20 OR id:30) (id is a multi-valued field)

With the above query, we will always get documents with pizza in the name, and any document with id values of 10, 20, and 30 will always come up first. What we would like is to have a document with only id 10 to be weighted the same as a document with ids 10, 20, and 30. Is this possible with Lucene/Solr?

Thanks in advance for any assistance you might be able to offer.
"

--
View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020181.html
Re: Scoring on multi-valued fields
I checked the explain output for the query. What happens is that the scores of all the hits on ID are added up. Is there a way to only grab the first score? Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020150.html