Re: Surprising score?

2013-07-06 Thread Erick Erickson
Not a problem. Index-time boosts are boosts made
_when you're indexing_, not when you're querying,
so your query-time boosting should still work with
norms omitted.
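
To make the distinction concrete, here is a rough Python sketch (not Solr code). It assumes the Lucene 4 default similarity, where an index-time field boost is folded into the stored norm, and it treats {!boost b=boost} as multiplying the wrapped query's score by the value of the boost field at query time. The function names and all numbers are made up for illustration, and the lossy one-byte norm encoding is ignored.

```python
import math

def stored_norm(index_time_boost, num_terms, omit_norms):
    """Norm kept in the index for one field of one document (simplified)."""
    if omit_norms:
        # With omitNorms=true nothing is stored, so both the length
        # normalization and the index-time boost disappear.
        return 1.0
    return index_time_boost / math.sqrt(num_terms)

def boosted_score(inner_score, boost_field_value):
    """Assumed effect of {!boost b=boost}: a query-time multiplication."""
    return inner_score * boost_field_value

# Hypothetical document: 16 indexed terms, an index-time boost of 2.0,
# and a value of 3.0 in its separate "boost" field.
with_norms = stored_norm(2.0, 16, omit_norms=False)    # 2.0 / sqrt(16)
without_norms = stored_norm(2.0, 16, omit_norms=True)  # index-time boost ignored

# The query-time multiplier is applied in both cases.
print(boosted_score(with_norms, 3.0), boosted_score(without_norms, 3.0))
```

The point of the sketch is only that the multiplier from the boost field never passes through the norm, so dropping norms does not touch it.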

Also, try adding debug=all and examining the results;
it'll show you exactly how the scores were calculated. It does
take a bit to work through, I'll admit...
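
A small Python sketch of such a request, in case it helps. The host, port and core name are placeholders, and I am assuming the JSON response writer (wt=json) and that the explanations come back under debug/explain; check this against your own installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def build_debug_url(select_url, query):
    """Return a select URL that asks Solr for scores plus debug output."""
    params = {
        "q": query,
        "fl": "*,score",  # include the score in each returned document
        "wt": "json",
        "debug": "all",
    }
    return select_url + "?" + urlencode(params)

def fetch_explanations(select_url, query):
    """Fetch the per-document score explanations (needs a running Solr)."""
    with urlopen(build_debug_url(select_url, query)) as response:
        body = json.load(response)
    return body["debug"]["explain"]

# Hypothetical host and core name; adjust to your installation.
url = build_debug_url("http://localhost:8983/solr/collection1/select",
                      "series:RCWP")
print(url)
```

Only build_debug_url runs here; fetch_explanations is there to show where the explanations would be read from once a server is available.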

Best
Erick


On Fri, Jul 5, 2013 at 8:47 AM, Lochschmied, Alexander
<alexander.lochschm...@vishay.com> wrote:

> Thanks Jeroen and Upayavira!
>
> I read the warning about losing the ability to use index time boosts when
> I disable length normalization. And we actually use it; at least if it
> means having a boost field in the index and doing queries like this:
>
> {!boost b=boost}( series:RCWP^10 OR otherFields:queries^2)
>
> Is there a way to omitNorms and still be able to use {!boost b=boost}?
>
> Thanks,
> Alexander




Re: Surprising score?

2013-07-05 Thread Jason Hellman
Also consider using the SweetSpotSimilarityFactory class, which allows you to
still engage normalization but control how intrusive it is.  This, combined
with the ability to set a custom Similarity class on a per-fieldType basis, may
be extremely useful.

More info:

http://lucene.apache.org/solr/4_3_1/solr-core/org/apache/solr/search/similarities/SweetSpotSimilarityFactory.html
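
As a rough illustration of what "control how intrusive it is" means, here is a Python sketch of the length-norm curve as I understand it from Lucene's SweetSpotSimilarity: field lengths between a minimum and a maximum all get the same norm, and the penalty only grows outside that plateau. The parameter names and the plateau of 1 to 250 terms are hypothetical choices for this example, and the term counts are arbitrary.

```python
import math

def sweet_spot_length_norm(num_terms, min_terms, max_terms, steepness):
    """Length norm with a flat plateau between min_terms and max_terms."""
    # Distance from the plateau; zero for any length inside it.
    distance = (abs(num_terms - min_terms)
                + abs(num_terms - max_terms)
                - (max_terms - min_terms))
    return 1.0 / math.sqrt(steepness * distance + 1.0)

# Hypothetical plateau of 1..250 terms with steepness 0.5.
for n in (6, 36, 210, 400):
    print(n, round(sweet_spot_length_norm(n, 1, 250, 0.5), 4))
```

With min and max both set to 1 and a steepness of 0.5, the curve reduces to the familiar 1/sqrt(numTerms), so the plateau is the only thing that changes the behaviour.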

Jason

On Jul 5, 2013, at 5:59 AM, pravesh <suyalprav...@yahoo.com> wrote:

> > Is there a way to omitNorms and still be able to use {!boost b=boost}?
>
> Or you could leave omitNorms=false as usual and use a custom
> Similarity implementation whose length normalization method is overridden
> to return a constant value of 1.
>
> Regards
> Pravesh



Re: Surprising score?

2013-07-04 Thread Jeroen Steggink

Hi Alexander,

This is because you have length normalization enabled for that field.
http://ir.dcs.gla.ac.uk/wiki/Length_Normalisation

If you want it disabled set the following:

<fieldType name="series" class="solr.TextField" positionIncrementGap="100"
           omitNorms="true">
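
To see how length normalization alone could produce this ordering, here is a back-of-the-envelope Python sketch. It rests on several assumptions: the char filter deletes dashes and whitespace (as "ignoring dashes and spaces" suggests), the stop filter removes nothing, every n-gram counts towards the field length, and the norm is 1/sqrt(numTerms) squeezed into one byte the way I read Lucene's SmallFloat encoding. All function names are mine.

```python
import math
import re
import struct

def analyzed_form(text):
    """Approximate the index analyzer up to the n-gram step."""
    # Assumption: runs of dashes/whitespace are deleted, then lowercased.
    return re.sub(r"[\-\s]+", "", text).lower()

def ngram_count(length, min_gram=2, max_gram=50):
    """Number of n-grams of every size from min_gram to max_gram."""
    return sum(length - size + 1
               for size in range(min_gram, min(length, max_gram) + 1))

def float_to_byte315(value):
    """One-byte float encoding (port of SmallFloat as I understand it)."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> 21
    zero = (63 - 15) << 3
    if small <= zero:
        return 0 if bits <= 0 else 1
    if small >= zero + 0x100:
        return 255
    return small - zero

def byte315_to_float(encoded):
    """Decode the byte back into the float that scoring would see."""
    if encoded == 0:
        return 0.0
    bits = (encoded << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]

def stored_norm(text):
    """Length norm after the lossy round trip through one byte."""
    terms = ngram_count(len(analyzed_form(text)))
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(terms)))

for series in ("RCWP", "D/CRCW-P e3", "RCWP Moisture Resistant"):
    print(series, ngram_count(len(analyzed_form(series))), stored_norm(series))
```

Under these assumptions the three values come out as 6, 36 and 210 terms with norms 0.375, 0.15625 and 0.0625, a ratio of 6 : 2.5 : 1. The reported scores 3.2698402, 1.3624334 and 0.5449734 are in roughly the same ratio, which suggests the norm is the only factor that differs between the three documents.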


 Jeroen

On 4-7-2013 11:10, Lochschmied, Alexander wrote:

> Hi Solr people!
>
> Querying for series:RCWP returns me the response below. Why does
> "RCWP Moisture Resistant" score worse than "D/CRCW-P e3" with the field
> definition below? OK, we are ignoring dashes and spaces, but I would have
> expected that matches towards the beginning score better. Can I change
> this behavior (in Solr 4)?
>
> <result>
>   <doc>
>     <str name="series">RCWP</str>
>     <float name="score">3.2698402</float>
>   </doc>
>   <doc>
>     <str name="series">D/CRCW-P e3</str>
>     <float name="score">1.3624334</float>
>   </doc>
>   <doc>
>     <str name="series">RCWP Moisture Resistant</str>
>     <float name="score">0.5449734</float>
>   </doc>
> </result>
>
> <fieldType name="series" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="[\-\s]+" replacement=""/>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.NGramFilterFactory" minGramSize="2"
>             maxGramSize="50"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="[\-\s]+" replacement=""/>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Thanks,
> Alexander





Re: Surprising score?

2013-07-04 Thread Upayavira
And be sure to re-index your content.

Upayavira
