[jira] [Commented] (LUCENE-4956) the korean analyzer that has a korean morphological analyzer and dictionaries

Benson Margulies (JIRA) Thu, 17 Oct 2013 06:19:16 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797876#comment-13797876
 ]


Benson Margulies commented on LUCENE-4956:
------------------------------------------

As a potential user of this technology, I'd like to ask that it document its 
linguistic approach.

* What is the goal of the tokenizer? Is it to deliver eojeol (space-delimited 
word units) or hyung-tae-so (morphemes)? If eojeol, does it split them apart in 
cases where Korean writers have been relaxed about the whitespace between them?
* Similarly, what does it set out to index? Does it index eojeol and then also 
their contained eumjeol (syllables) or hyung-tae-so, using position-increment / 
position-length to indicate compound relationships?
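
To make the second question concrete, here is a minimal sketch of what I mean 
by using position-increment / position-length to mark a compound. It assumes 
the eojeol "hakgyoeseo" (romanized) is emitted as one token spanning two 
positions, with its parts "hakgyo" and "eseo" stacked underneath. The Token 
class and the spans() helper are hypothetical and only model the idea behind 
Lucene's position attributes; they are not Lucene's API, and this is not a 
claim about what the patch actually emits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenGraphSketch {

    /** Hypothetical token model; stands in for a term plus its position attributes. */
    static final class Token {
        final String term;
        final int positionIncrement; // distance from the previous token's position
        final int positionLength;    // how many positions this token spans

        Token(String term, int positionIncrement, int positionLength) {
            this.term = term;
            this.positionIncrement = positionIncrement;
            this.positionLength = positionLength;
        }
    }

    /** Resolves each token to "term@start-end" using absolute positions. */
    static List<String> spans(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        int position = -1; // so that a first token with increment 1 lands at position 0
        for (Token t : tokens) {
            position += t.positionIncrement;
            out.add(t.term + "@" + position + "-" + (position + t.positionLength));
        }
        return out;
    }

    /** Assumed analysis of one eojeol: the whole unit, then its two parts. */
    static List<Token> example() {
        return Arrays.asList(
            new Token("hakgyoeseo", 1, 2), // whole eojeol, spans both parts
            new Token("hakgyo", 0, 1),     // first part, same start position
            new Token("eseo", 1, 1));      // second part, next position
    }

    public static void main(String[] args) {
        for (String s : spans(example())) {
            System.out.println(s);
        }
    }
}
```

With this layout a phrase query can match either the whole eojeol or the 
sequence of its parts, which is why it matters whether the analyzer documents 
which of these tokens it produces.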



> the korean analyzer that has a korean morphological analyzer and dictionaries
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-4956
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4956
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.2
>            Reporter: SooMyung Lee
>            Assignee: Christian Moen
>              Labels: newbie
>         Attachments: eval.patch, kr.analyzer.4x.tar, lucene-4956.patch, 
> lucene4956.patch, LUCENE-4956.patch
>
>
> The Korean language has specific characteristics. When developing a search 
> service with Lucene and Solr in Korean, there are some problems in searching 
> and indexing. The Korean analyzer solves these problems with a Korean 
> morphological analyzer. It consists of a Korean morphological analyzer, 
> dictionaries, a Korean tokenizer and a Korean filter. The Korean analyzer is 
> made for Lucene and Solr. If you develop a search service with Lucene in 
> Korean, choosing the Korean analyzer is the best idea.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
