Rob,
Your example is, hopefully, not exact since you used "C1..." which I
presume was not what you originally tested with.
CJKAnalyzer is working fine for me in this example adapted from your
code:
public void testCJKAnalyzer() throws Exception {
RAMDirectory directory = new RAMDirectory();
IndexWriter writer = new IndexWriter(directory, new CJKAnalyzer
(), true);
Document doc = new Document();
doc.add(new Field("name", "脱下你的裤子", Field.Store.YES,
Field.Index.TOKENIZED));
writer.addDocument(doc);
writer.optimize();
writer.close();
IndexSearcher searcher = new IndexSearcher(directory);
Hits hits = searcher.search(new TermQuery(new Term("name", "裤
子")));
assertEquals(1, hits.length());
}
Erik
On Jun 15, 2006, at 7:30 AM, Robert Haycock wrote:
> Hi,
>
> I have a very simple example. An IndexWriter (Lucene 1.9.0) with
> CJKAnalyzer (latest version as of today). A Chinese friend of
mine as
> given me a sentence and a word that appears in that sentence, eg:
>
> "C1C2C3C4C5C6C7C8" where the word is "C3C4".
>
> Here's code segment:
>
> IndexWriter writer = new IndexWriter(directory, new CJKAnalyzer(),
> true);
> Document doc = new Document();
> doc.add(new Field("name", " C1C2C3C4C5C6C7C8", Store.YES,
> Index.TOKENIZED));
> writer.addDocument(doc);
> writer.optimize();
> writer.close();
>
> IndexSearcher searcher = new IndexSearcher(directory);
> Hits hits = searcher.search(new TermQuery(new Term("name",
"C3C4")));
>
>
> This returns no hits. I've also tried with the ChineseAnalyzer
> too, but
> still no good. It still works fine for English though.
>
> Anyone know how to get this working?
>
> Rob.
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]