Add katakana filter to better deal with katakana spelling variants
------------------------------------------------------------------
Key: LUCENE-3901
URL: https://issues.apache.org/jira/browse/LUCENE-3901
Project: Lucene - Java
Issue Type: New Feature
Components: modules/analysis
Reporter: Christian Moen
Fix For: 3.6, 4.0
Many Japanese katakana words end in a long sound (written ー) that is sometimes optional.
For example, パーティー and パーティ are both perfectly valid for "party". Similarly we
have センター and センタ that are variants of "center" as well as サーバー and サーバ for
"server".
I'm proposing that we add a katakana stemmer that removes this trailing long
sound from terms longer than a configurable length. Alternatively, the variant
could be added as a synonym, but I think stemming is preferable from a ranking
point of view.
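As a rough illustration of the stemming rule described above, here is a minimal
sketch in plain Java. It does not use any Lucene API; the class name
KatakanaStemSketch, the method stem, and the parameter minimumLength are
hypothetical names chosen for this example. It also assumes that minimumLength
means "the shortest term that still gets stemmed" and that only terms made up
entirely of characters from the Unicode Katakana block should be touched. An
actual filter may draw both lines differently.

```java
// Hypothetical sketch, not the proposed Lucene implementation.
final class KatakanaStemSketch {

    // U+30FC, the katakana-hiragana prolonged sound mark.
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    private KatakanaStemSketch() {}

    // Returns the term without its trailing long sound mark when the term is
    // all katakana and at least minimumLength characters long (assumption:
    // the threshold is inclusive). Otherwise returns the term unchanged.
    static String stem(String term, int minimumLength) {
        if (term.length() < minimumLength || term.isEmpty()) {
            return term; // too short: leave words like コピー alone
        }
        if (term.charAt(term.length() - 1) != PROLONGED_SOUND_MARK) {
            return term; // nothing to strip
        }
        if (!isKatakana(term)) {
            return term; // mixed-script terms are left as-is
        }
        return term.substring(0, term.length() - 1);
    }

    // True when every character falls in the Unicode Katakana block
    // (U+30A0 to U+30FF), which includes the prolonged sound mark itself.
    private static boolean isKatakana(String term) {
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }
}
```

With minimumLength set to 4, both パーティー and パーティ come out as パーティ, so the
two spellings index to the same term, while a three-character word such as コピー
is left untouched.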