[ https://issues.apache.org/jira/browse/SOLR-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641757#action_12641757 ]

Koji Sekiguchi commented on SOLR-815:
-------------------------------------

Thank you for your great patch on this ticket (and SOLR-814), and I'm sorry for
the late reply. There are definitely such requirements in Japan.

I think this type of normalization can be done in a Reader rather than a
TokenFilter. In my project, I'm using an extended Tokenizer that reads chars
from a "MappingReader"; the MappingReader handles this type of character
mapping (and the mapping rules can be read from mapping.txt, etc.).

I've been thinking my method (currently hard-coded) could be made more
general. I'll open a new ticket for it soon.
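For illustration, the Reader-based approach could be sketched roughly as below. This is a minimal, hypothetical sketch, not the actual MappingReader from my project: the class name is borrowed from the comment above, the mapping table is hard-coded with two half-width → full-width katakana entries instead of being loaded from mapping.txt, and single-character mapping like this cannot handle cases where two half-width chars (e.g. a katakana plus a dakuten mark) fold into one full-width char.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: rewrite characters before the Tokenizer ever sees
// them, by wrapping the input Reader. Only 1-to-1 mappings are handled here.
class MappingReader extends FilterReader {
    private final Map<Character, Character> map = new HashMap<>();

    MappingReader(Reader in) {
        super(in);
        // Hard-coded sample rules; a real version would load mapping.txt.
        map.put('ｱ', 'ア'); // half-width katakana A  -> full-width
        map.put('ｶ', 'カ'); // half-width katakana KA -> full-width
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c == -1) return c;                       // end of stream
        Character mapped = map.get((char) c);
        return mapped != null ? mapped : c;          // substitute if mapped
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = super.read(cbuf, off, len);
        for (int i = off; i < off + n; i++) {        // skipped when n == -1
            Character mapped = map.get(cbuf[i]);
            if (mapped != null) cbuf[i] = mapped;
        }
        return n;
    }
}
```

A Tokenizer would then be constructed over `new MappingReader(reader)` and never see the half-width forms at all.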

> Add new Japanese half-width/full-width normalization Filter and Factory
> -----------------------------------------------------------------------
>
>                 Key: SOLR-815
>                 URL: https://issues.apache.org/jira/browse/SOLR-815
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Todd Feak
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>         Attachments: SOLR-815.patch, SOLR-815.patch
>
>
> Japanese Katakana and Latin alphabet characters exist in both a "half-width" 
> and a "full-width" version. This new Filter normalizes to the full-width 
> version to allow searching and indexing using both.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.