[jira] [Commented] (LUCENE-4956) the korean analyzer that has a korean morphological analyzer and dictionaries

Uwe Schindler (JIRA) Thu, 26 Dec 2013 13:20:06 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857081#comment-13857081
 ]


Uwe Schindler commented on LUCENE-4956:
---------------------------------------

Hi,
I have the same problems as Robert with some parts of the code. Parts of it are 
incomprehensible, and some places look like mere "workarounds" for silly bugs 
in the original code (such as catching ArrayIndexOutOfBoundsException and 
resuming on a completely different code path).

Also, the code creates about 5 completely new java.util collections (Lists, 
Maps, ...) per token, without even reusing the previous ones!
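
A minimal sketch of the reuse pattern meant here, assuming a hypothetical 
per-stream state class (ReusableTokenState and analyze are illustrative names, 
not taken from the patch or from Lucene). The list is allocated once and 
cleared for each token instead of being created anew:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-stream state; not the actual Korean analyzer code. */
final class ReusableTokenState {
    // Allocated once per stream and reused for every token.
    private final List<String> candidates = new ArrayList<>();

    /** Refills the shared list with candidates for one token and returns it. */
    List<String> analyze(String token) {
        candidates.clear();     // reset instead of calling "new ArrayList<>()"
        candidates.add(token);  // placeholder for real morphological output
        if (token.length() > 1) {
            // Placeholder "stem": the token without its last character.
            candidates.add(token.substring(0, token.length() - 1));
        }
        return candidates;
    }
}
```

The returned list is only valid until the next call to analyze, so a caller 
that needs the values longer has to copy them.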

The code has lots of problems with offsets and positions (sometimes we 
worked around them using Math.max(0, positionFromCrazyCode)). The code as it 
is will not pass TestRandomChains!
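
To illustrate why clamping with Math.max(0, ...) only hides such problems, 
here is a rough sketch of the kind of offset checks a token stream is 
generally expected to satisfy. OffsetChecker is a hypothetical helper written 
for this example and is not part of Lucene; the three rules are an assumption 
about the basic invariants, not a copy of what TestRandomChains does:

```java
/** Hypothetical checker, not part of Lucene: validates a sequence of token offsets. */
final class OffsetChecker {
    // Start offset of the previously accepted token.
    private int lastStartOffset = 0;

    /** Throws if the offsets break a basic invariant; otherwise records them. */
    void check(int startOffset, int endOffset) {
        if (startOffset < 0) {
            throw new IllegalStateException("negative startOffset: " + startOffset);
        }
        if (endOffset < startOffset) {
            throw new IllegalStateException("endOffset " + endOffset
                + " is before startOffset " + startOffset);
        }
        if (startOffset < lastStartOffset) {
            throw new IllegalStateException("startOffset " + startOffset
                + " goes backwards from " + lastStartOffset);
        }
        lastStartOffset = startOffset;
    }
}
```

A clamped value passes the first check, but it can still trip the other two, 
which is why the underlying offset arithmetic needs fixing rather than masking.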

Robert and I have already rewritten lots of the code and also removed the GPL 
code. At this point it is still not in a state that can be committed to Lucene 
trunk, let alone backported.

> the korean analyzer that has a korean morphological analyzer and dictionaries
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-4956
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4956
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.2
>            Reporter: SooMyung Lee
>            Assignee: Christian Moen
>              Labels: newbie
>         Attachments: LUCENE-4956.patch, eval.patch, kr.analyzer.4x.tar, 
> lucene-4956.patch, lucene4956.patch
>
>
> The Korean language has specific characteristics. When developing a search 
> service with Lucene & Solr in Korean, there are some problems in searching 
> and indexing. The Korean analyzer solves these problems with a Korean 
> morphological analyzer. It consists of a Korean morphological analyzer, 
> dictionaries, a Korean tokenizer and a Korean filter. The Korean analyzer is 
> made for Lucene and Solr. If you develop a search service with Lucene in 
> Korean, it is the best idea to choose the Korean analyzer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

