[ https://issues.apache.org/jira/browse/LUCENE-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808691#comment-16808691 ]
Tomoko Uchida commented on LUCENE-8752:
---------------------------------------

Could you please review this pull request? [~cm] (or someone else?)
[https://github.com/apache/lucene-solr/pull/632]

> Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)
> -------------------------------------------------------------------------------------
>
>          Key: LUCENE-8752
>          URL: https://issues.apache.org/jira/browse/LUCENE-8752
>      Project: Lucene - Core
>   Issue Type: Improvement
>   Components: modules/analysis
>     Reporter: Tomoko Uchida
>     Priority: Minor
>
> As of May 1st, 2019, the Japanese era name '元号' (Gengo) will be set to '令和' (Reiwa).
> See this article for more details:
> [https://www.bbc.com/news/world-asia-47769566]
>
> Currently '令和' is split into '令' and '和' by {{JapaneseTokenizer}}. It
> should be tokenized as one word so that Japanese texts including era names
> are searched as users expect. Because the default Kuromoji dictionary
> (mecab-ipadic) has not been maintained since 2007, a one-line patch to the
> source CSV file is needed for this era change.
>
> Era names are used in many official or formal documents in Japan, so it would
> be desirable for search systems to handle this properly without adding a user
> dictionary or using a phrase query. :)
>
> FYI, the JDK DateTime API will support the new era (in the next updates).
> [https://blogs.oracle.com/java-platform-group/a-new-japanese-era-for-java]
>
> The patch is available here:
> [https://github.com/apache/lucene-solr/pull/632]
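
To illustrate why a single added dictionary entry changes the tokenization, here is a minimal sketch in plain Java. It is not Kuromoji code: it uses a greedy longest-match lookup as a simplified stand-in for the tokenizer's dictionary-based segmentation, and the class name, method name, and the tiny word sets are all hypothetical. It only shows the effect the issue describes, namely that '令和' is emitted as one token once the dictionary contains it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: greedy longest-match lookup, NOT the algorithm
// used by Lucene's JapaneseTokenizer. All names here are hypothetical.
class EraTokenizerSketch {

    // Emit the longest dictionary word found at each position; a character
    // with no multi-character match becomes a single-character token.
    static List<String> tokenize(String text, Set<String> dictionary) {
        int maxLen = 1;
        for (String word : dictionary) {
            maxLen = Math.max(maxLen, word.length());
        }
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = pos + 1; // fallback: one character
            for (int len = Math.min(maxLen, text.length() - pos); len > 1; len--) {
                if (dictionary.contains(text.substring(pos, pos + len))) {
                    end = pos + len;
                    break;
                }
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed stand-in for a dictionary without the new era name.
        Set<String> unpatched = new HashSet<>(List.of("令", "和", "元年"));
        // The same dictionary after a "one-line patch" adding the era name.
        Set<String> patched = new HashSet<>(unpatched);
        patched.add("令和");

        System.out.println(tokenize("令和元年", unpatched)); // [令, 和, 元年]
        System.out.println(tokenize("令和元年", patched));   // [令和, 元年]
    }
}
```

With the unpatched set, a search for '令和' would have to match two adjacent single-character tokens, which is the situation the issue says currently needs a user dictionary or a phrase query.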