
Fw: [fw-general] Zend_Search_Lucene UTF-8 encoding

2006-12-23 Thread Sebi

Sorry for the mistake in my last reply. I wanted to say I think I will create a list of all non-ASCII Latin characters, together with some ASCII (alphanumeric) patterns. '' is used as a part of query syntax. But Analyzer is used after query recognition to process lexemes or phrases. So

Re: [fw-general] Zend_Search_Lucene UTF-8 encoding

2006-12-22 Thread Sebi

'' is used as a part of query syntax. But Analyzer is used after query recognition to process lexemes or phrases. So htmlentities() may be used. I will try to replace it with some alphanumeric pattern. On the other hand, that doesn't solve the problem we have with full UTF-8 support.

[fw-general] Zend_Search_Lucene UTF-8 encoding

2006-12-21 Thread Sebi

I want to index text in UTF-8 format. I use Latin characters. Here are some examples of characters (encoded in ISO-8859-1): ó, é, á, etc. I used the iconv function, iconv('ISO-8859-1', 'ASCII//TRANSLIT', 'Animación'), and I got Animaci'on, which also contains some break characters for the
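
One way to avoid the stray apostrophe that //TRANSLIT can produce is to fold accented letters with an explicit lookup table instead of relying on iconv, whose transliteration output depends on the platform. The following is only a minimal sketch: fold_accents() is a hypothetical helper, not part of Zend_Search_Lucene, and it assumes PHP 7 or later, a source file saved as UTF-8, UTF-8 input text, and a small illustrative table that would need extending for real data.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): replaces accented
// Latin letters with their plain ASCII equivalents using a fixed table.
// Assumes $utf8Text is valid UTF-8 and this file is saved as UTF-8.
function fold_accents(string $utf8Text): string
{
    // Illustrative subset only; extend it for the characters you need.
    static $map = [
        'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u',
        'Á' => 'A', 'É' => 'E', 'Í' => 'I', 'Ó' => 'O', 'Ú' => 'U',
        'ñ' => 'n', 'Ñ' => 'N', 'ü' => 'u', 'Ü' => 'U',
    ];

    // With an array argument, strtr() replaces each key with its value
    // and never re-scans text it has already replaced.
    return strtr($utf8Text, $map);
}

echo fold_accents('Animación'), "\n";
```

If an approach like this is used, the same folding would presumably have to be applied both to the text before it is indexed and to the search terms, so that the two sides still match.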