Hi Stefan,

I borrowed that code from the mbstring extension.  Either I misinterpreted
the code, or mbstring's UTF-8 decoder is also incorrect.

--Wez.

On 08/25/02, "Stefan Esser" <[EMAIL PROTECTED]> wrote:
> Hello,
> 
> html.c / get_next_char() has a UTF-8 decoder. The implementation
> is a little bit fishy. AFAIK UTF-8 sequences are 1 to 4 bytes long,
> but this one supports 5- and 6-byte UTF-8 sequences. I wonder
> where this addition to the standard is defined.
> The problem is the following: the German ue is 0xFC, which is an
> invalid UTF-8 sequence. But the UTF-8 decoder would recognise it
> as the lead byte of a 6-byte UTF-8 sequence.
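
For illustration, here is a rough sketch of a lead-byte check that accepts only 1- to 4-byte sequences, which is the limit Stefan describes. This is not the code from html.c or mbstring; the function name utf8_seq_len() is made up for this example. 0xFC is binary 11111100, so it matches the 1111110x lead-byte pattern of the 6-byte form, which is presumably why the current decoder accepts it. With the 4-byte cap, it should be rejected instead. Treating 0xC0 and 0xC1 (overlong encodings) as invalid is an extra assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from html.c or mbstring.
 * Returns the expected length in bytes of a UTF-8 sequence that
 * starts with `lead`, or 0 if `lead` cannot start a sequence
 * when sequences are limited to 4 bytes. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead <= 0x7F) return 1;                 /* 0xxxxxxx: plain ASCII */
    if (lead >= 0xC2 && lead <= 0xDF) return 2; /* 110xxxxx, minus overlong 0xC0/0xC1 */
    if (lead >= 0xE0 && lead <= 0xEF) return 3; /* 1110xxxx */
    if (lead >= 0xF0 && lead <= 0xF4) return 4; /* 11110xxx, capped at U+10FFFF */
    return 0; /* continuation bytes, 0xC0, 0xC1, and 0xF5..0xFF (includes 0xFC) */
}
```

With a check like this, a lone 0xFC yields 0, so the caller can treat the byte as invalid input rather than trying to read five continuation bytes after it.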



-- 
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php
