Re: Surprising score?
Not a problem. Index-time boosts are boosts made _when you're indexing_, not when you're querying, so omitting norms should still leave your query-time boosting working.

Also, try adding debug=all and examining the results; it'll show you exactly how the scores were calculated. It does take a bit to work through, I'll admit...

Best
Erick

On Fri, Jul 5, 2013 at 8:47 AM, Lochschmied, Alexander alexander.lochschm...@vishay.com wrote:

Thanks Jeroen and Upayavira!

I read the warning about losing the ability to use index-time boosts when I disable length normalization. And we actually use it; at least if it means having a boost field in the index and doing queries like this:

{!boost b=boost}( series:RCWP^10 OR otherFileds:queries^2)

Is there a way to omitNorms and still be able to use {!boost b=boost} ?

Thanks,
Alexander

-----Original Message-----
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: Thursday, July 4, 2013 13:07
To: solr-user@lucene.apache.org
Subject: Re: Surprising score?

And be sure to re-index your content.

Upayavira

On Thu, Jul 4, 2013, at 11:28 AM, Jeroen Steggink wrote:

Hi Alexander,

This is because you have length normalization enabled for that field.
http://ir.dcs.gla.ac.uk/wiki/Length_Normalisation

If you want it disabled set the following:

<fieldType name="series" class="solr.TextField" positionIncrementGap="100" omitNorms="true">

Jeroen

On 4-7-2013 11:10, Lochschmied, Alexander wrote:

Hi Solr people!

Querying for series:RCWP returns me the response below. Why does "RCWP Moisture Resistant" score worse than "D/CRCW-P e3" with the field definition below? OK, we are ignoring dashes and spaces, but I would have expected that matches towards the beginning score better. Can I change this behavior (in Solr 4)?

--
<result>
  <doc>
    <str name="series">RCWP</str>
    <float name="score">3.2698402</float>
  </doc>
  <doc>
    <str name="series">D/CRCW-P e3</str>
    <float name="score">1.3624334</float>
  </doc>
  <doc>
    <str name="series">RCWP Moisture Resistant</str>
    <float name="score">0.5449734</float>
  </doc>
</result>
--
<fieldType name="series" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="50"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Thanks,
Alexander
Re: Surprising score?
Also consider using the SweetSpotSimilarityFactory class, which allows you to still engage normalization but control how intrusive it is. This, combined with the ability to set a custom Similarity class on a per-fieldType basis, may be extremely useful.

More info: http://lucene.apache.org/solr/4_3_1/solr-core/org/apache/solr/search/similarities/SweetSpotSimilarityFactory.html

Jason

On Jul 5, 2013, at 5:59 AM, pravesh suyalprav...@yahoo.com wrote:

Is there a way to omitNorms and still be able to use {!boost b=boost} ?

OR you could leave omitNorms=false as usual and have your custom Similarity implementation with the length normalization method overridden to return a constant value of 1.

Regards
Pravesh

--
View this message in context: http://lucene.472066.n3.nabble.com/Surprising-score-tp4075436p4075722.html
Sent from the Solr - User mailing list archive at Nabble.com.
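
To give a feel for what "control how intrusive it is" could mean numerically, here is a minimal Python sketch of the length-norm curve that Lucene's SweetSpotSimilarity uses, as far as I understand its javadoc. The function name and the parameters min_len, max_len and steepness are illustrative stand-ins (roughly corresponding to the factory's lengthNormMin, lengthNormMax and lengthNormSteepness settings); treat the formula as an assumption to check against the documentation linked above.

```python
import math


def sweet_spot_length_norm(num_terms, min_len=1, max_len=1, steepness=0.5):
    """Hypothetical re-implementation of a sweet-spot length norm.

    Fields whose term count lies inside [min_len, max_len] get a norm of 1.0;
    outside that plateau the norm decays like 1/sqrt, scaled by steepness.
    """
    # Distance from the plateau; zero anywhere inside [min_len, max_len].
    distance = abs(num_terms - min_len) + abs(num_terms - max_len) - (max_len - min_len)
    return 1.0 / math.sqrt(steepness * distance + 1.0)


if __name__ == "__main__":
    # Compare the plain 1/sqrt(n) behaviour with a wide plateau of 1..250 terms.
    for n in (6, 36, 210, 300):
        plain = sweet_spot_length_norm(n)
        plateau = sweet_spot_length_norm(n, min_len=1, max_len=250)
        print(n, round(plain, 4), round(plateau, 4))
```

With min_len=1, max_len=1 and steepness=0.5 the curve reduces to the ordinary 1/sqrt(num_terms), while a wide plateau makes every field length inside it score as if norms were switched off, without having to set omitNorms.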
Re: Surprising score?
Hi Alexander,

This is because you have length normalization enabled for that field.
http://ir.dcs.gla.ac.uk/wiki/Length_Normalisation

If you want it disabled set the following:

<fieldType name="series" class="solr.TextField" positionIncrementGap="100" omitNorms="true">

Jeroen

On 4-7-2013 11:10, Lochschmied, Alexander wrote:

Hi Solr people!

Querying for series:RCWP returns me the response below. Why does "RCWP Moisture Resistant" score worse than "D/CRCW-P e3" with the field definition below? OK, we are ignoring dashes and spaces, but I would have expected that matches towards the beginning score better. Can I change this behavior (in Solr 4)?

--
<result>
  <doc>
    <str name="series">RCWP</str>
    <float name="score">3.2698402</float>
  </doc>
  <doc>
    <str name="series">D/CRCW-P e3</str>
    <float name="score">1.3624334</float>
  </doc>
  <doc>
    <str name="series">RCWP Moisture Resistant</str>
    <float name="score">0.5449734</float>
  </doc>
</result>
--
<fieldType name="series" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="50"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Thanks,
Alexander
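
To make the length-normalization explanation concrete, here is a rough Python sketch of why the longer series names end up with lower scores. It assumes the classic Lucene TF-IDF similarity, where the field norm is about 1/sqrt(number of indexed terms), and it assumes every n-gram emitted by the index analyzer counts as one term. The helper names are made up for illustration, and the real stored norm is compressed into a single byte, so actual values are coarser than what this prints.

```python
import math
import re


def ngram_count(text, min_gram=2, max_gram=50):
    """Count the n-grams the index analyzer in the question would emit.

    Mirrors the field definition: strip dashes and whitespace, lowercase,
    keep the whole value as one token, then emit every substring whose
    length is between min_gram and max_gram.
    """
    normalized = re.sub(r"[-\s]+", "", text).lower()
    n = len(normalized)
    return sum(n - size + 1 for size in range(min_gram, min(max_gram, n) + 1))


def length_norm(num_terms):
    """Unquantized length norm under the 1/sqrt(num_terms) assumption."""
    return 1.0 / math.sqrt(num_terms)


if __name__ == "__main__":
    for series in ("RCWP", "D/CRCW-P e3", "RCWP Moisture Resistant"):
        terms = ngram_count(series)
        print(series, terms, round(length_norm(terms), 4))
```

Under these assumptions the three values produce 6, 36 and 210 indexed terms, so the norm shrinks quickly as the series name gets longer. That is roughly in line with the ordering of the three scores in the question, and it has nothing to do with where in the value the match occurs.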
Re: Surprising score?
And be sure to re-index your content.

Upayavira

On Thu, Jul 4, 2013, at 11:28 AM, Jeroen Steggink wrote:

Hi Alexander,

This is because you have length normalization enabled for that field.
http://ir.dcs.gla.ac.uk/wiki/Length_Normalisation

If you want it disabled set the following:

<fieldType name="series" class="solr.TextField" positionIncrementGap="100" omitNorms="true">

Jeroen

On 4-7-2013 11:10, Lochschmied, Alexander wrote:

Hi Solr people!

Querying for series:RCWP returns me the response below. Why does "RCWP Moisture Resistant" score worse than "D/CRCW-P e3" with the field definition below? OK, we are ignoring dashes and spaces, but I would have expected that matches towards the beginning score better. Can I change this behavior (in Solr 4)?

--
<result>
  <doc>
    <str name="series">RCWP</str>
    <float name="score">3.2698402</float>
  </doc>
  <doc>
    <str name="series">D/CRCW-P e3</str>
    <float name="score">1.3624334</float>
  </doc>
  <doc>
    <str name="series">RCWP Moisture Resistant</str>
    <float name="score">0.5449734</float>
  </doc>
</result>
--
<fieldType name="series" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="50"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Thanks,
Alexander