[ https://issues.apache.org/jira/browse/LUCENE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kazuaki Hiraga updated LUCENE-3921:
-----------------------------------
    Environment: Cent OS 5, IPA Dictionary, Run with "Search mode"  (was: Cent OS 5, IPA Dictionary)

> Add decompose compound Japanese Katakana token capability to Kuromoji
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-3921
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3921
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.0
>         Environment: Cent OS 5, IPA Dictionary, Run with "Search mode"
>            Reporter: Kazuaki Hiraga
>              Labels: features
>
> Kuromoji, the Japanese morphological analyzer, cannot decompose every
> Japanese Katakana compound token into sub-tokens. Some Katakana tokens can
> be decomposed, but this does not apply to every Katakana compound token.
> For instance, "トートバッグ" (tote bag) and "ショルダーバッグ" (shoulder bag) are
> not decomposed into "トート バッグ" and "ショルダー バッグ", although the IPA
> dictionary has an entry for "バッグ". I would like the decompose feature to
> apply to every Katakana token whose sub-tokens are in the dictionary, or to
> have an option that forces the decompose feature on every Katakana token.
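
The requested behavior can be illustrated with a minimal, self-contained sketch. It is not Kuromoji's implementation and does not use any Lucene API. The class name `KatakanaDecompounder`, the method `decompose`, and the dictionary contents are hypothetical; the issue only confirms that "バッグ" is in the IPA dictionary, so the other entries used below are assumptions. The sketch splits an all-Katakana token only when the whole token can be covered by dictionary entries, trying the longest candidate first and backtracking when a choice leads to a dead end.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not part of Lucene or Kuromoji.
public class KatakanaDecompounder {

    // True if every character is in the Katakana Unicode block
    // (U+30A0..U+30FF, which includes the long-vowel mark U+30FC).
    static boolean isKatakana(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (Character.UnicodeBlock.of(s.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }

    // Returns the sub-tokens if the token is all Katakana and can be fully
    // covered by two or more dictionary entries; otherwise returns the
    // token unchanged as a single-element list.
    static List<String> decompose(String token, Set<String> dictionary) {
        if (!isKatakana(token)) {
            return Collections.singletonList(token);
        }
        List<String> parts = new ArrayList<>();
        if (split(token, 0, dictionary, parts) && parts.size() > 1) {
            return parts;
        }
        return Collections.singletonList(token);
    }

    // Backtracking search: at each position try the longest dictionary
    // match first, and undo the choice if the remainder cannot be covered.
    private static boolean split(String token, int start,
                                 Set<String> dictionary, List<String> parts) {
        if (start == token.length()) {
            return true;
        }
        for (int end = token.length(); end > start; end--) {
            String candidate = token.substring(start, end);
            if (dictionary.contains(candidate)) {
                parts.add(candidate);
                if (split(token, end, dictionary, parts)) {
                    return true;
                }
                parts.remove(parts.size() - 1);
            }
        }
        return false;
    }
}
```

With an assumed dictionary containing "トート", "ショルダー", and "バッグ", `decompose("トートバッグ", dict)` yields the two parts "トート" and "バッグ", while a token with an unknown part, such as "ハンドバッグ", is returned unsplit. Because the longest candidate is tried first, a compound that is itself a dictionary entry is also left whole; the "force decompose" variant requested in the issue would need to skip that full-length match.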