Sorry for the mistake in my last reply. What I meant to say is that I think I will
create a list of all non-ASCII Latin characters,
together with some ASCII (alphanumeric) patterns.
'' is used as part of the query syntax.
But the Analyzer is applied after query recognition, to process lexemes or
phrases, so htmlentities() may be used.
I will try replacing these characters with some alphanumeric pattern.
On the other hand, it doesn't help with the problem we have with
full UTF-8 support.
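
For illustration, the replacement-list idea could look roughly like the sketch below. The names $latinToAscii and encode_latin() and the letter-plus-digit patterns are made up for this example and are not taken from any library; the script is assumed to be saved as UTF-8. The same function would have to be applied to both the indexed text and the search terms so that they still match.

```php
<?php
// Hypothetical list: each non-ASCII Latin character (UTF-8 encoded)
// maps to an ASCII pattern made only of letters and digits, so the
// result contains no punctuation that could split the token.
$latinToAscii = [
    'á' => 'a1',
    'é' => 'e1',
    'í' => 'i1',
    'ó' => 'o1',
    'ú' => 'u1',
    'ñ' => 'n1',
];

// Replace every listed character with its ASCII pattern.
// strtr() with an array matches whole keys, so multi-byte UTF-8
// characters are replaced as a unit.
function encode_latin(string $text, array $map): string
{
    return strtr($text, $map);
}

echo encode_latin('Animación', $latinToAscii), "\n"; // prints: Animacio1n
```

One caveat of such a scheme: a pattern like "o1" can collide with text that really contains "o1", so the patterns would need to be chosen so that they are unlikely to occur in the indexed documents.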
I want to index text in UTF-8 format. I use Latin characters.
Here are some examples of characters (encoded in ISO-8859-1): ó, é, á, etc.
I used the iconv function, iconv('ISO-8859-1', 'ASCII//TRANSLIT', 'Animación'), and
I got Animaci'on, which also contains some break
characters for the