[ https://issues.apache.org/jira/browse/LUCY-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marvin Humphrey updated LUCY-191:
---------------------------------
Fix Version/s: 0.3.0 (incubating)
Assignee: Marvin Humphrey
I've looked through roughly the first third of this patch and it looks
really good so far! I'll give it a thorough review soon. I'm looking
forward to integrating this!
> Unicode normalization
> ---------------------
>
> Key: LUCY-191
> URL: https://issues.apache.org/jira/browse/LUCY-191
> Project: Lucy
> Issue Type: New Feature
> Components: Analysis
> Reporter: Nick Wellnhofer
> Assignee: Marvin Humphrey
> Priority: Minor
> Labels: patch
> Fix For: 0.3.0 (incubating)
>
> Attachments: LUCY-191-normalizer.patch
>
>
> As discussed on the mailing list, it would be nice to have Unicode
> normalization, Unicode case folding and stripping of accents as part of the
> analyzer chain. With the help of utf8proc this can be done in one pass. So I
> proposed a new analyzer Lucy::Analyzer::Normalizer with an interface
> described here:
> http://mail-archives.apache.org/mod_mbox/incubator-lucy-dev/201111.mbox/%3C4EC43816.1070107%40aevum.de%3E
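
For readers unfamiliar with the three operations named in the description,
here is a minimal sketch in Python using only the standard unicodedata
module. It is not the patch and not the Lucy API: the function name and the
parameters (form, case_fold, strip_accents) are hypothetical, chosen only
for illustration. Unlike the single-pass utf8proc approach the description
mentions, this sketch makes several passes over the string.

```python
import unicodedata


def normalize_text(text, form="NFKC", case_fold=True, strip_accents=False):
    """Illustrative only: normalize, optionally fold case and strip accents.

    All names here are hypothetical and do not mirror the proposed
    Lucy interface.
    """
    if strip_accents:
        # Decompose first so accents become separate combining marks.
        decomposed_form = "NFKD" if form in ("NFKC", "NFKD") else "NFD"
        text = unicodedata.normalize(decomposed_form, text)
        # Drop every character in a Mark category (Mn, Mc, Me).
        text = "".join(
            ch for ch in text if not unicodedata.category(ch).startswith("M")
        )
    if case_fold:
        # Full Unicode case folding, e.g. "ß" becomes "ss".
        text = text.casefold()
    # Normalize last so the result is in the requested form.
    return unicodedata.normalize(form, text)


if __name__ == "__main__":
    print(normalize_text("Café", strip_accents=True))
    print(normalize_text("Straße"))
```

Normalizing last matters because case folding can produce strings that are
no longer in the requested normalization form.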