[jira] [Commented] (LUCENE-3749) Similarity.java javadocs and simplifications for 4.0

Neil Hooey (Commented) (JIRA) Sun, 04 Mar 2012 15:17:24 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222055#comment-13222055
 ]


Neil Hooey commented on LUCENE-3749:
------------------------------------

This change breaks per-field similarity configuration in Solr. The breakage 
starts with this commit:

{code}
commit 5d371928263d8d78d0e52781340ae95506bd9bf6
Author: Robert Muir <[email protected]>
Date:   Mon Feb 6 12:48:01 2012 +0000

    LUCENE-3749: replace SimilarityProvider with PerFieldSimilarityWrapper
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1241001 13f79535-47bb-0310-9956-ffa450edef68
{code}

I have the following configuration in my schema.xml:

{code}
<fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField">
  <analyzer>
    <tokenizer class="com.foo.lucene.analysis.core.PayloadTermTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <similarity class="com.foo.lucene.search.PayloadSimilarity" />
</fieldtype>
{code}

But when I build against and use a version of Solr that includes the commit 
mentioned above, my similarity class is no longer executed. I confirmed this by 
adding print statements to the scorePayload(), tf() and idf() methods: they 
print before that commit is included and do not print after.

It seems this is intentional, based on Robert Muir's comments, but how can you 
get per-field similarity to work in Solr with this new code?
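
A possible answer, offered as an unverified sketch rather than a confirmed fix: 
released Solr 4.x ships a solr.SchemaSimilarityFactory whose wrapper delegates 
to the <similarity> declared on each field type. Assuming that factory is 
present in the build in question (not checked against this exact revision), 
declaring it as the global similarity in schema.xml may be what is needed for 
the per-fieldtype declaration above to take effect again:

{code}
<!-- Assumption: goes at the top level of schema.xml, outside any <fieldtype>. -->
<!-- Assumption: solr.SchemaSimilarityFactory exists in the Solr build being used. -->
<similarity class="solr.SchemaSimilarityFactory"/>
{code}

With this in place, field types that declare their own <similarity> (such as 
"payloads" above) would be expected to use it, while other field types fall 
back to a default similarity.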
                
> Similarity.java javadocs and simplifications for 4.0
> ----------------------------------------------------
>
>                 Key: LUCENE-3749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3749
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-3749.patch, LUCENE-3749_part2.patch
>
>
> As part of adding additional scoring systems to Lucene, we made a lower-level 
> Similarity, and the existing stuff became e.g. TFIDFSimilarity, which extends it.
> However, I always feel bad about the complexity introduced here (though I do 
> feel there are some "excuses", in that it's a difficult challenge).
> In order to try to mitigate this, we also exposed an easier API 
> (SimilarityBase) on top of it that makes some assumptions (and trades off some 
> performance) to try to provide something consumable for e.g. experiments.
> Still, we can clean up a few things with the low-level API: fix outdated 
> documentation and shoot for better/clearer naming etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

