[ 
https://issues.apache.org/jira/browse/SOLR-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Mattozzi updated SOLR-2304:
--------------------------------

    Attachment: SOLR-2304.patch

Patch against Lucene trunk that applies field-level boosts before query terms
are selected in MoreLikeThis.

> MoreLikeThis: Apply field level boosts before query terms are selected
> ----------------------------------------------------------------------
>
>                 Key: SOLR-2304
>                 URL: https://issues.apache.org/jira/browse/SOLR-2304
>             Project: Solr
>          Issue Type: Improvement
>          Components: MoreLikeThis
>    Affects Versions: 1.4.2
>            Reporter: Mike Mattozzi
>            Priority: Minor
>             Fix For: 1.4.2
>
>         Attachments: SOLR-2304.patch
>
>
> MoreLikeThis provides the ability to set field-level boosts to weight the
> importance of fields in selecting similar documents. Currently, in trunk,
> these boosts are applied after the query terms have been selected from the
> priority queue of interesting terms in MoreLikeThis. This can give
> unexpected results when combined with mlt.maxqt to limit the number of
> query terms. For example, suppose you use fields fieldA and fieldB, boost
> them as "fieldA^0.5 fieldB^2.0", and set maxqt to 20. If the terms in
> fieldA have relatively higher tf-idf scores than those in fieldB, all 20
> selected terms will come from fieldA and form the basis for the
> MoreLikeThis query, even if, after boosting, there are terms in fieldB
> with a higher overall score.
>
> I encountered this while using descriptive document text and document tags
> (comedy, action, etc.) as the basis for MoreLikeThis. I wanted to boost the
> tags higher; however, the less common document text terms were always
> selected as the query terms, while the more common tag terms were
> eliminated by the maxqt parameter before their scores were boosted.
>
> I believe the code was originally written this way so that the bulk of the
> work could be done in the MoreLikeThisHandler without modifying the
> MoreLikeThis class in the Lucene project. Now that the projects are merged,
> I think this modification makes sense. I will be attaching a simple patch
> to trunk.
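
The ordering problem described above can be illustrated with a small
standalone sketch. This is not the attached patch and does not use the
Lucene MoreLikeThis API; the class BoostOrderSketch, the ScoredTerm type,
the method names, and the scores are all hypothetical, chosen only to show
how selecting before boosting and boosting before selecting can pick
different terms for the same maxqt limit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BoostOrderSketch {

    /** Hypothetical candidate term: field name, term text, tf-idf-style score. */
    static final class ScoredTerm {
        final String field;
        final String text;
        final double score;

        ScoredTerm(String field, String text, double score) {
            this.field = field;
            this.text = text;
            this.score = score;
        }
    }

    /** Keep the maxQt highest-scoring terms, best first. */
    static List<ScoredTerm> topTerms(List<ScoredTerm> terms, int maxQt) {
        List<ScoredTerm> sorted = new ArrayList<>(terms);
        sorted.sort(Comparator.comparingDouble((ScoredTerm t) -> t.score).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(maxQt, sorted.size())));
    }

    /** Multiply each term's score by its field boost (1.0 if the field has none). */
    static List<ScoredTerm> applyBoosts(List<ScoredTerm> terms, Map<String, Double> boosts) {
        List<ScoredTerm> boosted = new ArrayList<>();
        for (ScoredTerm t : terms) {
            double boost = boosts.getOrDefault(t.field, 1.0);
            boosted.add(new ScoredTerm(t.field, t.text, t.score * boost));
        }
        return boosted;
    }

    /** Ordering described in the issue: truncate to maxQt first, boost afterwards. */
    static List<ScoredTerm> selectThenBoost(List<ScoredTerm> terms,
                                            Map<String, Double> boosts, int maxQt) {
        return applyBoosts(topTerms(terms, maxQt), boosts);
    }

    /** Proposed ordering: boost every candidate first, then truncate to maxQt. */
    static List<ScoredTerm> boostThenSelect(List<ScoredTerm> terms,
                                            Map<String, Double> boosts, int maxQt) {
        return topTerms(applyBoosts(terms, boosts), maxQt);
    }

    public static void main(String[] args) {
        // Made-up scores: fieldA terms are rarer, so they score higher before boosting.
        List<ScoredTerm> terms = new ArrayList<>();
        terms.add(new ScoredTerm("fieldA", "textTerm1", 4.0));
        terms.add(new ScoredTerm("fieldA", "textTerm2", 3.5));
        terms.add(new ScoredTerm("fieldB", "comedy", 2.0));
        terms.add(new ScoredTerm("fieldB", "action", 1.5));

        Map<String, Double> boosts = new HashMap<>();
        boosts.put("fieldA", 0.5);
        boosts.put("fieldB", 2.0);

        for (ScoredTerm t : selectThenBoost(terms, boosts, 2)) {
            System.out.println("select-then-boost: " + t.field + ":" + t.text + " " + t.score);
        }
        for (ScoredTerm t : boostThenSelect(terms, boosts, 2)) {
            System.out.println("boost-then-select: " + t.field + ":" + t.text + " " + t.score);
        }
    }
}
```

With these made-up numbers and a limit of 2, the first ordering keeps only
fieldA terms, while the second keeps the two fieldB terms, whose boosted
scores (4.0 and 3.0) exceed the boosted fieldA scores (2.0 and 1.75).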

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


