[ https://issues.apache.org/jira/browse/LUCENE-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Moen updated LUCENE-3922:
-----------------------------------
    Attachment: LUCENE-3922.patch

Updated patch with decimal number support and additional javadoc. The test code now passes precommit.

Token attributes such as part-of-speech and readings for the normalized token are currently inherited from the last token used when composing the normalized number. Since these values are likely to be wrong, I'm inclined to set these attributes to null or a reasonable default.

I'd be very happy to hear your thoughts on this.

> Add Japanese Kanji number normalization to Kuromoji
> ---------------------------------------------------
>
>                 Key: LUCENE-3922
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3922
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.0-ALPHA
>            Reporter: Kazuaki Hiraga
>            Assignee: Christian Moen
>              Labels: features
>             Fix For: 5.1
>
>         Attachments: LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch,
> LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch
>
>
> Japanese people use Kanji numerals instead of Arabic numerals when writing
> prices, addresses and so on, e.g. 12万4800円 (124,800 JPY), 二番町三ノ二 (3-2 Nibancho) and
> 十二月 (December). So, we would like to normalize those Kanji numerals to Arabic
> numerals (I don't think we need the capability to normalize to Kanji
> numerals).
>
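To make the requested normalization concrete, here is a minimal sketch of converting a single run of Kanji (or mixed Kanji/Arabic) numerals to a plain number. It is not the code in the attached patch: the class name `KanjiNumberSketch` and the method `normalize` are hypothetical, it handles non-negative integers only (no decimals, no thousands separators), and it does not deal with tokenization or token attributes at all. Surrounding text such as 円 or 月 is assumed to have been split off already.

```java
// Illustrative sketch only; names are hypothetical and not taken from the patch.
final class KanjiNumberSketch {
    private static final String KANJI_DIGITS = "〇一二三四五六七八九";

    /** Converts a numeral run such as "12万4800" or "十二" to its integer value. */
    static long normalize(String numeral) {
        long total = 0;      // groups already closed by a large unit (万, 億, 兆)
        long section = 0;    // sum of digit * small-unit products in the open group
        long current = 0;    // digits read since the last unit character
        boolean hasDigit = false;

        for (int i = 0; i < numeral.length(); i++) {
            char c = numeral.charAt(i);
            int digit = KANJI_DIGITS.indexOf(c);
            if (digit < 0) {
                // Covers ASCII and full-width Arabic digits; -1 for anything else.
                digit = Character.digit(c, 10);
            }
            if (digit >= 0) {
                current = current * 10 + digit;
                hasDigit = true;
                continue;
            }
            long small = smallUnit(c);
            long large = largeUnit(c);
            if (small > 0) {
                // A bare unit such as 十 means 1 * 10.
                section += (hasDigit ? current : 1) * small;
            } else if (large > 0) {
                section += current;
                total += (section == 0 ? 1 : section) * large;
                section = 0;
            } else {
                throw new IllegalArgumentException("Not a numeral character: " + c);
            }
            current = 0;
            hasDigit = false;
        }
        return total + section + current;
    }

    private static long smallUnit(char c) {
        switch (c) {
            case '十': return 10L;
            case '百': return 100L;
            case '千': return 1_000L;
            default: return 0L;
        }
    }

    private static long largeUnit(char c) {
        switch (c) {
            case '万': return 10_000L;
            case '億': return 100_000_000L;
            case '兆': return 1_000_000_000_000L;
            default: return 0L;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("12万4800")); // prints 124800
        System.out.println(normalize("十二"));      // prints 12
    }
}
```

Keeping three accumulators (digits, the open group, closed groups) lets one pass handle both positional forms like 4800 and unit forms like 二千五百. A real filter would also need to decide what to do with the part-of-speech and reading attributes of the combined token, which is the open question raised in the comment above.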