Re: non-ASCII char search problem with nightly build (12 nov.)

2001-11-12 Thread Philipp Chudinov
Additional question: in QueryParser.jj, beginning at lines 258-259, there is a Unicode character range ("\u0080"-"\uFFFE"). As I understand it, this means Lucene can search an index containing Unicode characters in this range, right? So, my question: Russian (Cyrillic) characters in the Unicode table fall within this range: u\0401 - u\
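
A minimal sketch of the range check described above, assuming the parser range is exactly the quoted "\u0080"-"\uFFFE". The class and method names are hypothetical and are not part of Lucene or QueryParser.jj; the sample word is only an illustration. It shows that Cyrillic letters such as U+0401 pass the check while plain ASCII letters do not.

```java
// Hypothetical helper, not Lucene code: tests characters against the
// range quoted from QueryParser.jj (U+0080 to U+FFFE, inclusive).
class CyrillicRangeCheck {
    // Bounds taken from the range quoted in the message above.
    static final char RANGE_START = '\u0080';
    static final char RANGE_END = '\uFFFE';

    // True if a single character lies inside the quoted range.
    static boolean inQuotedRange(char c) {
        return c >= RANGE_START && c <= RANGE_END;
    }

    // True only if every character of the string lies inside the range.
    static boolean allInQuotedRange(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!inQuotedRange(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample Cyrillic word starting with U+0401 (illustrative input only).
        String cyrillic = "\u0401\u043B\u043A\u0430";
        System.out.println(allInQuotedRange(cyrillic));
        // ASCII letters are below U+0080, so they are outside this range.
        System.out.println(inQuotedRange('A'));
    }
}
```

If the check passes for the query characters, the range itself is unlikely to be what rejects them, and the character stream feeding the parser or the query encoding is worth inspecting next.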

Re: non-ASCII char search problem with nightly build (12 nov.)

2001-11-12 Thread Brian Goetz
> The problem probably lies in the QueryParser class, as it takes only the
> less significant bytes of the characters given in the query.

Are you sure of that? I recently switched from using the standard JavaCC AsciiCharStream implementation, to using Doug's FastCharStream implementation, whic

Re: non-ASCII char search problem with nightly build (12 nov.)

2001-11-12 Thread Andrzej Jarmoniuk
Philipp Chudinov wrote:
> Hi!
> I am trying to use Lucene with russian texts. I created an index of xml
> documents (UTF-8 encoded), but when I am trying to search an index with a
> query from a servlet, it seems, that Lucene just finds nothing (though I am
> SURE it MUST find a term). Search string i

non-ASCII char search problem with nightly build (12 nov.)

2001-11-12 Thread Philipp Chudinov
Hi! I am trying to use Lucene with Russian texts. I created an index of XML documents (UTF-8 encoded), but when I try to search the index with a query from a servlet, it seems that Lucene just finds nothing (though I am SURE it MUST find a term). Search string is reencoded to UTF-8 too, so I
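
One common way such a re-encoding step is done in a servlet can be sketched as follows. This is an assumption, not the poster's actual code, which the message does not show: it supposes the container decoded the UTF-8 request bytes as ISO-8859-1, so the string must be turned back into bytes and decoded again as UTF-8. The class and method names are hypothetical, and `StandardCharsets` requires a Java version newer than the one current when this thread was written.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Lucene or servlet API code.
class QueryRecode {
    // Assumes the input was produced by decoding UTF-8 bytes as ISO-8859-1.
    // ISO-8859-1 maps every byte to one char, so the bytes can be recovered
    // without loss and then decoded with the intended charset.
    static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Illustrative Cyrillic query term (five letters).
        String original = "\u043F\u043E\u0438\u0441\u043A";

        // Simulate the assumed container behaviour: UTF-8 bytes read as Latin-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = latin1ToUtf8(misdecoded);
        System.out.println(repaired.equals(original));
    }
}
```

If the repaired string matches the term stored in the index but the search still returns nothing, the encoding step is probably not the cause.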