[ https://issues.apache.org/jira/browse/LUCENE-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718291#action_12718291 ]
Robert Muir commented on LUCENE-1466:
-------------------------------------
Just as an alternative, I have a different mechanism as part of the LUCENE-1488
patch I am working on. But maybe it's good to have options, since my approach
depends on the ICU library.
Below is an excerpt from the javadoc.
A TokenFilter that transforms text with ICU.
ICU provides text-transformation functionality via its Transliteration API.
Although script conversion is its most common use, a transliterator can
actually perform a more general class of tasks.
...
Some useful transformations for search are built-in:
* Conversion from Traditional to Simplified Chinese characters
* Conversion from Hiragana to Katakana
* Conversion from Fullwidth to Halfwidth forms.
...
Example usage:
  stream = new ICUTransformFilter(stream,
      Transliterator.getInstance("Traditional-Simplified"));
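To make the built-in transformations listed above more concrete, here is a minimal sketch of what two of them do at the character level. It deliberately does not use ICU or the ICUTransformFilter from the patch: the class and method names are hypothetical, and the fixed code point offsets cover only the basic Hiragana letters and the fullwidth ASCII range, which is a small subset of what ICU's Hiragana-Katakana and Fullwidth-Halfwidth transliterators handle.

```java
// Hypothetical illustration only: plain-JDK approximations of two folds
// that the ICU transliterators perform far more completely.
public final class KanaWidthFoldSketch {
    private KanaWidthFoldSketch() {}

    /** Maps basic Hiragana letters (U+3041..U+3096) to Katakana (offset 0x60). */
    public static String hiraganaToKatakana(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 0x3041 && c <= 0x3096) {
                c = (char) (c + 0x60);
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Maps fullwidth ASCII variants (U+FF01..U+FF5E) and U+3000 to ASCII. */
    public static String fullwidthToHalfwidthAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 0xFF01 && c <= 0xFF5E) {
                c = (char) (c - 0xFEE0);   // fullwidth form to ASCII counterpart
            } else if (c == 0x3000) {
                c = ' ';                   // ideographic space to ASCII space
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hiragana "hiragana" becomes the same syllables in Katakana.
        System.out.println(hiraganaToKatakana("\u3072\u3089\u304C\u306A"));
        // Fullwidth Latin letters become plain ASCII.
        System.out.println(fullwidthToHalfwidthAscii("\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45"));
    }
}
```

Folding variants like this at analysis time lets a query typed in one form match text indexed in the other. In the filter described above, the same effect would come from passing the corresponding ICU Transliterator instance rather than from hand-written offsets.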
> CharFilter - normalize characters before tokenizer
> --------------------------------------------------
>
> Key: LUCENE-1466
> URL: https://issues.apache.org/jira/browse/LUCENE-1466
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Analysis
> Affects Versions: 2.4
> Reporter: Koji Sekiguchi
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1466.patch, LUCENE-1466.patch
>
>
> This proposes importing the CharFilter that was introduced in Solr 1.4.
> For details, please see:
> - SOLR-822
> - http://www.nabble.com/Proposal-for-introducing-CharFilter-to20327007.html