Hello Cheolgoo,

I have now updated my Lucene version to 1.9 to use StandardAnalyzer for Korean, and tested your patch, which is already adopted in 1.9:
http://issues.apache.org/jira/browse/LUCENE-444

But I still get poor results with Korean compared with CJKAnalyzer. A single character matches well, but words of two or more characters don't match at all. Am I missing something, or is more work still needed?

Thanks,
Youngho.

----- Original Message -----
From: "Cheolgoo Kang" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>; "John Wang" <[EMAIL PROTECTED]>
Sent: Tuesday, October 04, 2005 10:11 AM
Subject: Re: korean and lucene

> StandardAnalyzer's JavaCC-based StandardTokenizer.jj cannot read
> the Korean part of the Unicode character blocks.
>
> You should 1) use CJKAnalyzer or 2) add the Korean character
> block (0xAC00~0xD7AF) to the CJK token definition in the
> StandardTokenizer.jj file.
>
> Hope it helps.
>
>
> On 10/4/05, John Wang <[EMAIL PROTECTED]> wrote:
> > Hi:
> >
> > We are running into problems with searching on Korean documents. We are
> > using the StandardAnalyzer and everything works with Chinese and Japanese.
> > Are there any known problems with Korean in Lucene?
> >
> > Thanks
> >
> > -John
> >
>
>
> --
> Cheolgoo
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
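For readers following this thread: CJKAnalyzer indexes runs of CJK characters as overlapping two-character (bigram) tokens, which is why multi-character Korean words can match under it while an analyzer that never tokenizes the Hangul range cannot. The sketch below illustrates that bigram scheme together with a check against the Hangul Syllables block (0xAC00~0xD7AF) that Cheolgoo mentions. It is a standalone illustration, not Lucene's actual code, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkBigramSketch {

    // The Hangul Syllables block from the thread: 0xAC00 through 0xD7AF.
    static boolean isHangulSyllable(char c) {
        return c >= 0xAC00 && c <= 0xD7AF;
    }

    // Overlapping bigrams over one run of CJK text, in the style of
    // CJKAnalyzer: "ABC" becomes ["AB", "BC"]; a lone character is
    // emitted as its own token.
    static List<String> bigrams(String run) {
        List<String> tokens = new ArrayList<String>();
        if (run.length() == 1) {
            tokens.add(run);
            return tokens;
        }
        for (int i = 0; i + 1 < run.length(); i++) {
            tokens.add(run.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A three-syllable Hangul word yields two overlapping bigram tokens,
        // so a two-syllable query term can still hit it at search time.
        System.out.println(bigrams("\uD55C\uAD6D\uC5B4")); // 한국어
        System.out.println(isHangulSyllable('\uD55C'));    // 한 is in the block
    }
}
```

Because both the indexed document and the query are cut into the same bigrams, a two-character query term lines up with a token in the index; with only whole-run or single-character tokens, it never does.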