Hello Youngho,

I don't understand why you couldn't get hits for Japanese either; still, you should check why the query came out empty with the Korean data:
> For Korean
> [echo] Running lia.analysis.i18n.KoreanDemo...
> [java] phrase = 경
> [java] query =

The last line should read "query = 경" to get hits. Can you check why StandardAnalyzer removes "경" during tokenizing?

Koji

> -----Original Message-----
> From: Youngho Cho [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 27, 2005 11:37 AM
> To: java-user@lucene.apache.org
> Subject: Re: korean and lucene
>
> Hello Koji,
>
> Thanks for your kind reply.
>
> Yes, I used QueryParser. Normally I use the
> Query = QueryParser.parse( ) method.
>
> I put your sample code into the lia.analysis.i18n package in LuceneAction
> and ran JapaneseDemo with 1.4 and 1.9.
>
> The results are:
>
> [echo] Running lia.analysis.i18n.JapaneseDemo...
> [java] query = content:ラ?メン屋
>
> I can't get any hits.
>
> For Korean:
>
> [echo] Running lia.analysis.i18n.KoreanDemo...
> [java] phrase = 경
> [java] query =
>
> I can't get a query parse result.
>
> Thanks,
>
> Youngho
>
> ----- Original Message -----
> From: "Koji Sekiguchi" <[EMAIL PROTECTED]>
> To: <java-user@lucene.apache.org>; "Youngho Cho" <[EMAIL PROTECTED]>
> Sent: Thursday, October 27, 2005 9:48 AM
> Subject: RE: korean and lucene
>
> > Hi Youngho,
> >
> > With regard to Japanese, I can search a word/phrase
> > using StandardAnalyzer.
> >
> > Did you use QueryParser? StandardAnalyzer tokenizes
> > CJK characters into a stream of single characters.
> > Use QueryParser to get a PhraseQuery and search with that query.
> >
> > Please see the following sample code. Replace the Japanese
> > "contents" and (search target) "phrase" with Korean in the
> > program and run it.
> >
> > regards,
> >
> > Koji
> >
> > =============================================
> > import java.io.IOException;
> > import org.apache.lucene.analysis.Analyzer;
> > import org.apache.lucene.analysis.standard.StandardAnalyzer;
> > import org.apache.lucene.analysis.cjk.CJKAnalyzer;
> > import org.apache.lucene.store.Directory;
> > import org.apache.lucene.store.RAMDirectory;
> > import org.apache.lucene.index.IndexWriter;
> > import org.apache.lucene.document.Document;
> > import org.apache.lucene.document.Field;
> > import org.apache.lucene.search.IndexSearcher;
> > import org.apache.lucene.search.Hits;
> > import org.apache.lucene.search.Query;
> > import org.apache.lucene.queryParser.QueryParser;
> > import org.apache.lucene.queryParser.ParseException;
> >
> > public class JapaneseByStandardAnalyzer {
> >
> >     private static final String FIELD_CONTENT = "content";
> >     private static final String[] contents = {
> >         "東京にはおいしいラーメン屋がたくさんあります。",
> >         "北海道にもおいしいラーメン屋があります。"
> >     };
> >     private static final String phrase = "ラーメン屋";
> >     //private static final String phrase = "屋";
> >     private static Analyzer analyzer = null;
> >
> >     public static void main( String[] args ) throws IOException, ParseException {
> >         Directory directory = makeIndex();
> >         search( directory );
> >         directory.close();
> >     }
> >
> >     private static Analyzer getAnalyzer(){
> >         if( analyzer == null ){
> >             analyzer = new StandardAnalyzer();
> >             //analyzer = new CJKAnalyzer();
> >         }
> >         return analyzer;
> >     }
> >
> >     private static Directory makeIndex() throws IOException {
> >         Directory directory = new RAMDirectory();
> >         IndexWriter writer = new IndexWriter( directory, getAnalyzer(), true );
> >         for( int i = 0; i < contents.length; i++ ){
> >             Document doc = new Document();
> >             doc.add( new Field( FIELD_CONTENT, contents[i], Field.Store.YES, Field.Index.TOKENIZED ) );
> >             writer.addDocument( doc );
> >         }
> >         writer.close();
> >         return directory;
> >     }
> >
> >     private static void search( Directory directory ) throws IOException, ParseException {
> >         IndexSearcher searcher = new IndexSearcher( directory );
> >         QueryParser parser = new QueryParser( FIELD_CONTENT, getAnalyzer() );
> >         Query query = parser.parse( phrase );
> >         System.out.println( "query = " + query );
> >         Hits hits = searcher.search( query );
> >         for( int i = 0; i < hits.length(); i++ )
> >             System.out.println( "doc = " + hits.doc( i ).get( FIELD_CONTENT ) );
> >         searcher.close();
> >     }
> > }
> >
> > > -----Original Message-----
> > > From: Youngho Cho [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, October 27, 2005 8:18 AM
> > > To: java-user@lucene.apache.org; Cheolgoo Kang
> > > Subject: Re: korean and lucene
> > >
> > > Hello Cheolgoo,
> > >
> > > I have now updated my Lucene version to 1.9 to use StandardAnalyzer
> > > for Korean, and I tested your patch, which is already adopted in 1.9:
> > >
> > > http://issues.apache.org/jira/browse/LUCENE-444
> > >
> > > But I still get poor results with Korean compared with CJKAnalyzer.
> > >
> > > A single character matches fine, but a word of two or more characters
> > > doesn't match at all.
> > >
> > > Am I missing something, or is more work still needed?
> > >
> > > Thanks,
> > >
> > > Youngho.
> > >
> > > ----- Original Message -----
> > > From: "Cheolgoo Kang" <[EMAIL PROTECTED]>
> > > To: <java-user@lucene.apache.org>; "John Wang" <[EMAIL PROTECTED]>
> > > Sent: Tuesday, October 04, 2005 10:11 AM
> > > Subject: Re: korean and lucene
> > >
> > > > StandardAnalyzer's JavaCC-based StandardTokenizer.jj cannot read
> > > > the Korean part of the Unicode character blocks.
> > > >
> > > > You should either 1) use CJKAnalyzer, or 2) add the Korean character
> > > > block (0xAC00-0xD7AF) to the CJK token definition in the
> > > > StandardTokenizer.jj file.
> > > >
> > > > Hope it helps.
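[Editor's note: Cheolgoo's diagnosis can be sanity-checked without Lucene at all. In Java, the syllable "경" (U+ACBD) lies in the Hangul Syllables block (U+AC00-U+D7AF), a separate block from the katakana and CJK ideograph ranges, which is why a grammar that omits that range drops the character entirely. A minimal illustrative sketch; the class name is ours, not from the thread:]

```java
// Shows that '경' falls in the Hangul Syllables block (U+AC00-U+D7AF),
// the range Cheolgoo suggests adding to the tokenizer grammar, while
// 'ラ' falls in the Katakana block that the grammar already handled.
public class HangulBlockCheck {
    public static void main(String[] args) {
        char hangul = '경';   // U+ACBD
        char katakana = 'ラ'; // U+30E9

        System.out.println(Character.UnicodeBlock.of(hangul));   // HANGUL_SYLLABLES
        System.out.println(Character.UnicodeBlock.of(katakana)); // KATAKANA

        // The range from Cheolgoo's suggestion:
        boolean inSuggestedRange = hangul >= 0xAC00 && hangul <= 0xD7AF;
        System.out.println(inSuggestedRange); // true
    }
}
```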
> > > >
> > > > On 10/4/05, John Wang <[EMAIL PROTECTED]> wrote:
> > > > > Hi:
> > > > >
> > > > > We are running into problems with searching on Korean documents.
> > > > > We are using the StandardAnalyzer, and everything works with Chinese
> > > > > and Japanese. Are there known problems with Korean in Lucene?
> > > > >
> > > > > Thanks
> > > > >
> > > > > -John
> > > >
> > > > --
> > > > Cheolgoo

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
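[Editor's note: the reason CJKAnalyzer handles multi-character Korean words, where the old StandardAnalyzer did not, is its overlapping two-character (bigram) tokenization: indexing and query parsing produce the same token sequence, so phrases of two or more characters can match. A rough plain-Java sketch of the idea — this is an editorial illustration, not Lucene's actual CJKTokenizer, and the word 경복궁 is our example, not from the thread:]

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of overlapping bigram tokenization, the scheme CJKAnalyzer uses
// for CJK text: each adjacent character pair becomes a token, so a
// multi-character query phrase yields the same tokens as the indexed text.
public class BigramSketch {
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + 1 < text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    public static void main(String[] args) {
        // "경복궁" -> [경복, 복궁]: both bigrams must match in sequence,
        // which is why two-or-more-character words work under this scheme.
        System.out.println(bigrams("경복궁"));
    }
}
```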