Tomoko Uchida created LUCENE-8869:
-------------------------------------
Summary: Build kuromoji system dictionary as a separated jar and
load it from JapaneseTokenizer at runtime
Key: LUCENE-8869
URL: https://issues.apache.org/jira/browse/LUCENE-8869
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Reporter: Tomoko Uchida
This is a sub-task for LUCENE-8816.
In this issue, I will try to make small but self-contained changes to the
kuromoji system dictionary.
- Make it possible to build a jar that contains (possibly) only the dictionary
data resources generated by the {{build-dict}} task.
- Make it possible to load an external dictionary when initializing
JapaneseTokenizer.
-- Some work has already been done in LUCENE-8863
- Decouple the current system dictionary data (mecab ipadic) from kuromoji
itself and use it as the default (possibly this can be done in a separate issue).
Also, some refactoring of the directory/source tree structure may be needed.
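
To illustrate the runtime-loading idea, here is a minimal sketch using only the JDK. It is not the kuromoji implementation: the class name {{DictionaryResourceLocator}}, the {{load}} method and the resource path are all hypothetical, and the actual jar layout is not decided in this issue. The sketch looks for a dictionary resource on the classpath (as it would appear if a separate dictionary jar were present) and falls back to bundled default data when nothing is found.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class DictionaryResourceLocator {

    // Hypothetical resource path; the real layout of the dictionary jar
    // is an assumption made only for this sketch.
    static final String DICT_RESOURCE = "kuromoji-dict/TokenInfoDictionary.dat";

    /**
     * Returns the bytes of the dictionary resource if the given class loader
     * can see it (e.g. a separate dictionary jar is on the classpath),
     * otherwise returns the bundled default data unchanged.
     */
    static byte[] load(ClassLoader loader, String resourcePath, byte[] bundledDefault) {
        // getResourceAsStream returns null when the resource is absent;
        // try-with-resources accepts a null resource and skips close().
        try (InputStream in = loader.getResourceAsStream(resourcePath)) {
            if (in == null) {
                return bundledDefault;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would pass its own class loader and {{DICT_RESOURCE}}; whether the tokenizer should resolve the dictionary this way, or take already-constructed dictionary objects as in LUCENE-8863, is left open here.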