Hi Stefan,

I borrowed that code from the mbstring extension. Either I misinterpreted the code, or mbstring's utf-8 decoder is also incorrect.

--Wez.

On 08/25/02, "Stefan Esser" <[EMAIL PROTECTED]> wrote:
> Hello,
>
> html.c / get_next_char() has an utf-8 decoder. The implementation
> is a little bit fishy. AFAIK utf-8 sequences are 1 upto 4 chars
> but this one supports 5, 6 byte utf-8 sequences. I wonder
> where this addition to the standard is defined..
> The problem is the following: the german ue is 0xFC which is an
> invalid utf-8 sequence. But the utf-8 decoder would recognise it
> as the lead byte of a 6 byte utf-8 sequence.

--
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php
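
For reference, the 5- and 6-byte forms were part of the original UTF-8 definition in RFC 2279, which may be where the extra cases come from. Unicode itself stops at U+10FFFF, so four bytes are enough. The sketch below is not the code in html.c or mbstring; it is a hypothetical helper (`utf8_decode_strict` is a made-up name) showing how a decoder limited to 1 to 4 bytes would reject a lone 0xFC instead of treating it as a 6-byte lead byte.

```c
#include <stddef.h>

/* Hypothetical strict decoder, not the get_next_char() implementation.
 * Returns the number of bytes consumed (1..4) and stores the code point
 * in *cp, or returns 0 if s does not start with a valid sequence. */
static size_t utf8_decode_strict(const unsigned char *s, size_t len,
                                 unsigned long *cp)
{
    /* Smallest code point that legitimately needs a given length. */
    static const unsigned long min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t need, i;
    unsigned long c;

    if (len == 0) return 0;

    if (s[0] < 0x80)                { need = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; c = s[0] & 0x07; }
    else return 0; /* stray continuation byte, or 0xF8..0xFF (includes 0xFC) */

    if (len < need) return 0;       /* truncated sequence */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* must be 10xxxxxx */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min_cp[need]) return 0;            /* overlong encoding */
    if (c > 0x10FFFF) return 0;                /* beyond the Unicode range */
    if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* UTF-16 surrogates */

    *cp = c;
    return need;
}
```

With this, the single byte 0xFC (ISO-8859-1 ue) is reported as invalid, while the two bytes 0xC3 0xBC decode to U+00FC. What the caller does on a 0 return (pass the byte through, substitute, or fail) is a separate decision.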