Additional question: in QueryParser.jj, at lines 258-259, there is a
Unicode character range ("\u0080"-"\uFFFE"). As I understand it, Lucene can
search an index for terms containing Unicode characters in this range,
right? So here is my question. Russian (Cyrillic) characters in the Unicode
table fall within this range: \u0401 -
\u
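
A minimal sketch of the range check described above, assuming only the "\u0080"-"\uFFFE" bounds quoted from QueryParser.jj. The class and method names are hypothetical and are not part of Lucene; the sketch only compares code units against those bounds and says nothing about how the generated parser actually reads its input.

```java
// Hypothetical helper, not a Lucene class: tests characters against the
// U+0080..U+FFFE range quoted from the grammar file.
class TermCharRangeCheck {
    static final char RANGE_START = 0x0080;
    static final char RANGE_END = 0xFFFE;

    // True if a single UTF-16 code unit lies inside the quoted range.
    static boolean inRange(char c) {
        return c >= RANGE_START && c <= RANGE_END;
    }

    // True if every character of the string lies inside the quoted range.
    static boolean allInRange(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!inRange(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A Russian word written with escapes so the source file encoding
        // does not matter.
        String word = "\u041F\u043E\u0438\u0441\u043A";
        System.out.println(allInRange(word)); // prints "true"
        System.out.println(inRange('a'));     // prints "false" (ASCII is below U+0080)
    }
}
```

So the Cyrillic letters do pass a check against that range; plain ASCII letters do not, and would have to be covered elsewhere in the grammar.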
>The problem probably lies in the QueryParser class, as it takes only the
>less significant bytes of the characters given in the query.
Are you sure of that? I recently switched from the standard JavaCC
AsciiCharStream implementation to Doug's FastCharStream
implementation, whic
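
To make the quoted hypothesis concrete, here is a small sketch of what "taking only the least significant byte" of each character would do to a Cyrillic query. It is an illustration under that assumption only; the helper name is hypothetical, and the sketch does not claim that AsciiCharStream or FastCharStream behaves this way.

```java
// Hypothetical demonstration, not code from Lucene or JavaCC.
class LowByteDemo {
    // Keeps only the low 8 bits of every UTF-16 code unit.
    static String keepLowByte(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append((char) (s.charAt(i) & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+043C U+0438 U+0440, a three-letter Russian word.
        String query = "\u043C\u0438\u0440";
        // Low bytes are 0x3C, 0x38, 0x40, i.e. the ASCII characters < 8 @
        System.out.println(keepLowByte(query)); // prints "<8@"
        // ASCII text is unaffected by the mask.
        System.out.println(keepLowByte("lucene")); // prints "lucene"
    }
}
```

If a character stream did this, a Cyrillic term would reach the parser as unrelated ASCII characters and could never match the indexed term, while English queries would keep working.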
Philipp Chudinov wrote:
>Hi!
>I am trying to use Lucene with Russian texts. I created an index of XML
>documents (UTF-8 encoded), but when I search the index with a
>query from a servlet, Lucene finds nothing (though I am
>SURE it MUST find a term). Search string i
Hi!
I am trying to use Lucene with Russian texts. I created an index of XML
documents (UTF-8 encoded), but when I search the index with a
query from a servlet, Lucene finds nothing (though I am
SURE it MUST find a term). The search string is re-encoded to UTF-8 too, so I
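
One thing that can be checked on the servlet side is whether the query string was decoded with the wrong charset before it reached Lucene. The sketch below assumes, purely for illustration, that the UTF-8 bytes of the query were decoded as ISO-8859-1; the message above does not say this is what happens, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: simulates a UTF-8 query being decoded as ISO-8859-1
// and shows how the original text can be recovered in that specific case.
class QueryDecodingSketch {
    // Simulates the assumed mistake: UTF-8 bytes read as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses that mistake: recover the raw bytes, then decode as UTF-8.
    static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String query = "\u043C\u0438\u0440"; // three Cyrillic letters
        String broken = misdecode(query);
        // Each Cyrillic letter is two bytes in UTF-8, so the broken string
        // has six characters instead of three.
        System.out.println(broken.length());              // prints "6"
        System.out.println(repair(broken).equals(query)); // prints "true"
    }
}
```

Comparing the length of the string the servlet receives with the length of the query that was typed is a quick way to tell whether this kind of mis-decoding is involved.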