Re: some scores to 0 using omitNorms=false

Raimon Bosch Thu, 18 Feb 2010 07:13:44 -0800


I am not an expert in the Lucene scoring formula, but omitNorms=false makes
scoring a little more complex, because it takes boosts for fields and
documents into account. If I'm not wrong (if I am, please correct me),
omitNorms=false keeps the norm(t,d) factor in the formula, which combines the
document boost, the field boost and the field length normalization. With
omitNorms=true that factor is treated as 1.0. queryNorm(q) is applied in both
cases. The formula is:

  score(q,d) = coord(q,d) · queryNorm(q) · ∑ (t in q) [ tf(t in d) · idf(t)^2 · t.getBoost() · norm(t,d) ]
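
To make the role of norm(t,d) concrete, here is a minimal sketch that recomputes the formula above by hand. It is not Lucene code: the class and method names are made up for illustration, the 1/sqrt(numTerms) length normalization is the one I understand DefaultSimilarity to use, and the one-byte encoding that Lucene applies to stored norms is ignored.

```java
/** Toy re-computation of the scoring formula; hypothetical names, not Lucene's API. */
public class ScoreSketch {

    /**
     * norm(t,d) = document boost * lengthNorm(field) * field boost.
     * With omitNorms=true no norm is stored, so the factor is neutral (1.0).
     */
    public static double norm(double docBoost, double fieldBoost,
                              int numTermsInField, boolean omitNorms) {
        if (omitNorms) {
            return 1.0;
        }
        // Assumed length normalization: shorter fields get a larger norm.
        double lengthNorm = 1.0 / Math.sqrt(numTermsInField);
        return docBoost * lengthNorm * fieldBoost;
    }

    /** The part inside the sum: tf * idf^2 * t.getBoost() * norm(t,d). */
    public static double termWeight(double tf, double idf, double termBoost, double norm) {
        return tf * idf * idf * termBoost * norm;
    }

    /** score(q,d) = coord * queryNorm * sum of the per-term weights. */
    public static double score(double coord, double queryNorm, double... termWeights) {
        double sum = 0.0;
        for (double w : termWeights) {
            sum += w;
        }
        return coord * queryNorm * sum;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: one query term, a 16-term field, field boost 2.0.
        double withNorms = norm(1.0, 2.0, 16, false);
        double withoutNorms = norm(1.0, 2.0, 16, true);

        System.out.println("omitNorms=false: " + score(1.0, 1.0, termWeight(2.0, 3.0, 1.0, withNorms)));
        System.out.println("omitNorms=true:  " + score(1.0, 1.0, termWeight(2.0, 3.0, 1.0, withoutNorms)));
    }
}
```

The only difference between the two runs is the norm factor, so the same document gets a different score depending on whether the field omits norms.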


See
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html,
and
http://old.nabble.com/scores-are-the-same-for-many-diferent-documents-td27623039.html#a27623039

The multiValued option is used to create fields that hold multiple values.

We use it in one of our indexes by adding a new field to schema.xml:

...
<field name="s_similar_name" type="text" indexed="true" stored="true" multiValued="true"/>
...

This field is filled in by a custom UpdateRequestProcessor (written by us and
created through its own UpdateRequestProcessorFactory) from a comma-separated
field called 's_similar_names':
...
@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();

    // Source field holding the comma-separated names
    String v = (String) doc.getFieldValue("s_similar_names");
    if (v != null) {
        // Add one value to the multiValued field per non-empty entry
        for (String s_similar_name : v.split(",")) {
            if (!s_similar_name.equals("")) {
                doc.addField("s_similar_name", s_similar_name);
            }
        }
    }

    // pass it up the chain
    super.processAdd(cmd);
}
...
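
The splitting step can be tried outside Solr. The following is a small stand-alone sketch of the same loop; the class name SimilarNamesSplit and the sample strings are made up for illustration. It shows why the empty-string check is needed and that values are not trimmed.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone version of the split loop in processAdd; hypothetical helper class. */
public class SimilarNamesSplit {

    /** Splits a comma-separated value and drops empty entries. */
    public static List<String> split(String v) {
        List<String> names = new ArrayList<String>();
        if (v == null) {
            return names;
        }
        // String.split(",") drops trailing empty strings but keeps
        // empty strings in the middle, hence the explicit check.
        for (String name : v.split(",")) {
            if (!name.equals("")) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(split("foo,,bar,")); // [foo, bar]
        System.out.println(split("foo, bar"));  // second entry keeps its leading space
    }
}
```

Because nothing is trimmed, an input such as "foo, bar" produces " bar" with a leading space as the second value. If that matters for the stored value, calling trim() before the empty check would be one way to handle it.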

The processor factory is registered in solrconfig.xml:

...
<updateRequestProcessorChain name="mychain">
  <processor class="org.apache.solr.update.processor.MyUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
...

The chain is then attached to the XmlUpdateRequestHandler, also in solrconfig.xml:

...
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">mychain</str>
  </lst>
</requestHandler>
...

termVectors is used to store more information about the terms of a document
in the index, which saves computation time in features like MoreLikeThis. See
http://wiki.apache.org/solr/TermVectorComponent. We don't use it.
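
As a rough picture of what a term vector holds, here is a toy sketch that builds a per-document map from term to frequency. It is only an illustration: the class name is made up, plain whitespace splitting stands in for the field's real analyzer, and real term vectors can also carry positions and offsets.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a term vector: term -> frequency for a single document field. */
public class TermVectorSketch {

    public static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new TreeMap<String, Integer>();
        // Assumption: lowercasing plus whitespace splitting replaces the analyzer.
        for (String token : text.toLowerCase(Locale.ENGLISH).split("\\s+")) {
            if (token.length() == 0) {
                continue; // leading whitespace yields one empty token
            }
            Integer old = freqs.get(token);
            freqs.put(token, old == null ? 1 : old + 1);
        }
        return freqs;
    }

    public static void main(String[] args) {
        // prints {documents=1, scores=1, solr=2}
        System.out.println(termFrequencies("Solr scores Solr documents"));
    }
}
```

Having this information stored per document is what lets a feature like MoreLikeThis skip re-analyzing the stored text at query time.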


adeelmahmood wrote:
> 
> I was going to ask a question about this, but you seem like you might have
> the answer for me. What exactly does the omitNorms option do (or what is it
> expected to do)? Also, could you please help me understand what the
> termVectors and multiValued options do?
> Thanks for your help
> 
> 
> Raimon Bosch wrote:
>> 
>> 
>> Hi,
>> 
>> We did some tests with omitNorms=false. We have seen that on the last
>> page of results some scores are set to 0.0. These zero scores are
>> problematic for our sorters.
>> 
>> Could this be some kind of bug?
>> 
>> Regards,
>> Raimon Bosch.
>> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/some-scores-to-0-using-omitNorns%3Dfalse-tp27637436p27637827.html
Sent from the Solr - User mailing list archive at Nabble.com.
