[ https://issues.apache.org/jira/browse/LUCENE-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Moen updated LUCENE-3922:
-----------------------------------
    Attachment: LUCENE-3922.patch

Updated patch with decimal number support and additional javadoc. The test 
code now also makes precommit happy.
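
For readers who have not followed the issue, here is a rough sketch of the kind 
of normalization involved. It is not the code in the patch: the class name 
KanjiNumberSketch and its normalize method are made up for illustration, it 
handles integers only (no decimals), and it does no validation beyond rejecting 
unknown characters.

```java
// Illustrative only: a hypothetical helper, not part of Lucene or the patch.
final class KanjiNumberSketch {
    private static final String ASCII_DIGITS = "0123456789";
    private static final String FULLWIDTH_DIGITS = "０１２３４５６７８９";
    private static final String KANJI_DIGITS = "〇一二三四五六七八九";

    /** Converts a mixed Kanji/Arabic integer such as "12万4800" to a long. */
    static long normalize(String text) {
        long total = 0;    // sum of completed groups (万, 億, 兆)
        long section = 0;  // value of the current group below 10,000
        long pending = 0;  // digits read since the last unit character
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int digit = digitValue(c);
            if (digit >= 0) {
                // Consecutive digits are read positionally, e.g. "4800" or "一二三".
                pending = pending * 10 + digit;
            } else if (c == '十' || c == '百' || c == '千') {
                long unit = c == '十' ? 10 : c == '百' ? 100 : 1000;
                // A bare unit counts as one of that unit, e.g. "十" is 10.
                section += (pending == 0 ? 1 : pending) * unit;
                pending = 0;
            } else if (c == '万' || c == '億' || c == '兆') {
                long unit = c == '万' ? 10_000L
                          : c == '億' ? 100_000_000L
                          : 1_000_000_000_000L;
                section += pending;
                total += (section == 0 ? 1 : section) * unit;
                section = 0;
                pending = 0;
            } else {
                throw new IllegalArgumentException("Not a numeral: " + c);
            }
        }
        return total + section + pending;
    }

    /** Returns 0-9 for ASCII, full-width and Kanji digits, or -1 otherwise. */
    private static int digitValue(char c) {
        int value = ASCII_DIGITS.indexOf(c);
        if (value < 0) value = FULLWIDTH_DIGITS.indexOf(c);
        if (value < 0) value = KANJI_DIGITS.indexOf(c);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(normalize("12万4800"));
        System.out.println(normalize("十二"));
    }
}
```

With this sketch, normalize("12万4800") gives 124800 and normalize("十二") gives 
12, matching the examples in the issue description.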

Token attributes such as part-of-speech, readings, etc. for the normalized 
token are currently inherited from the last token used when composing the 
normalized number. Since these values are likely to be wrong, I'm inclined to 
set these attributes to null or a reasonable default.
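
To make the two options concrete, here is a small sketch. SketchToken and 
AttributeMergeSketch are hypothetical stand-ins, not Lucene's attribute classes 
or the code in the patch, and the part-of-speech and reading values used with 
them would be illustrative only.

```java
import java.util.List;

// Hypothetical token carrying Kuromoji-style attributes (not a Lucene class).
final class SketchToken {
    final String term;
    final String partOfSpeech; // may be null
    final String reading;      // may be null

    SketchToken(String term, String partOfSpeech, String reading) {
        this.term = term;
        this.partOfSpeech = partOfSpeech;
        this.reading = reading;
    }
}

final class AttributeMergeSketch {
    // Behaviour described above: the composed token keeps whatever
    // attributes the last contributing token had.
    static SketchToken inheritFromLast(String normalizedTerm, List<SketchToken> parts) {
        SketchToken last = parts.get(parts.size() - 1);
        return new SketchToken(normalizedTerm, last.partOfSpeech, last.reading);
    }

    // Alternative under consideration: drop attributes that no longer
    // describe the normalized term.
    static SketchToken clearAttributes(String normalizedTerm) {
        return new SketchToken(normalizedTerm, null, null);
    }
}
```

For a number composed from "12" and "万", inheritFromLast would leave the 
normalized token "120000" carrying the reading of "万", which is the mismatch 
I'd like to avoid.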

I'm very happy to hear your thoughts on this.



> Add Japanese Kanji number normalization to Kuromoji
> ---------------------------------------------------
>
>                 Key: LUCENE-3922
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3922
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.0-ALPHA
>            Reporter: Kazuaki Hiraga
>            Assignee: Christian Moen
>              Labels: features
>             Fix For: 5.1
>
>         Attachments: LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, 
> LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch
>
>
> Japanese people use Kanji numerals instead of Arabic numerals when writing 
> prices, addresses and so on, e.g. 12万4800円 (124,800 JPY), 二番町三ノ二 (3-2 Nibancho) and 
> 十二月 (December). So, we would like to normalize those Kanji numerals to Arabic 
> numerals (I don't think we need the capability to normalize to Kanji 
> numerals).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
