[jira] Updated: (SOLR-1279) ApostropheTokenizer

Noble Paul (JIRA) Wed, 15 Jul 2009 02:11:40 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Noble Paul updated SOLR-1279:
-----------------------------

    Fix Version/s:     (was: 1.4)
                   1.5

At this point we are not accepting new features for 1.4.

> ApostropheTokenizer
> -------------------
>
>                 Key: SOLR-1279
>                 URL: https://issues.apache.org/jira/browse/SOLR-1279
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Sergey Borisov
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: ApostropheTokenizer.zip
>
>
> ApostropheTokenizer emits extra tokens during the analysis stage for 
> fields that contain apostrophes, so that documents differing only by an 
> apostrophe receive the same relevancy score. 
> For example, if a document contains the string "McDonald's", it is 
> tokenized as "McDonald's McDonalds". A search for either "McDonald's" or 
> "McDonalds" then produces a similar score.
> The code handles up to two apostrophes in a token.
> To use this tokenizer, add the following lines to schema.xml:
> <analyzer type="index">
>       <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
> ...
> </analyzer>
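
The expansion described in the quoted issue can be sketched in plain Java, outside the Lucene API. This is only an illustration: the class and method names are hypothetical, and leaving tokens with more than two apostrophes unexpanded is an assumption, since the attached implementation is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the code from ApostropheTokenizer.zip.
public class ApostropheExpansionSketch {
    // Limit taken from the issue description ("up to two apostrophes").
    private static final int MAX_APOSTROPHES = 2;

    /** Returns the original token, plus an apostrophe-free variant when applicable. */
    static List<String> expand(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);

        // Count apostrophes in the token.
        int count = 0;
        for (int i = 0; i < token.length(); i++) {
            if (token.charAt(i) == '\'') {
                count++;
            }
        }

        // Assumption: tokens with more than two apostrophes are left as-is.
        if (count >= 1 && count <= MAX_APOSTROPHES) {
            String stripped = token.replace("'", "");
            if (!stripped.isEmpty()) {
                out.add(stripped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("McDonald's"));
        System.out.println(expand("McDonalds"));
    }
}
```

Because the stripped variant is indexed alongside the original, a query for either spelling can match the same document.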

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
