I thought it was valid given the CEF nature of UTF-8, but it turned out that the RFC states that sequences starting with F5-F7 are not valid. Thanks for the clarification.
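
For reference, here is a minimal sketch of what the patched rows of the table quoted below encode for lead bytes, written as a function rather than a lookup table. The name utf8_lead_len is hypothetical and is not part of libmbfl. The hunk only shows the rows for 0xC0-0xFF, so treating every byte below 0xC0 as length 1 is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper, not libmbfl code: byte length implied by a UTF-8
 * lead byte, mirroring the patched rows of the table in the quoted diff. */
static size_t utf8_lead_len(unsigned char b)
{
    if (b < 0xC0) return 1;  /* assumption: these rows are not shown in the hunk */
    if (b < 0xE0) return 2;  /* 0xC0-0xDF */
    if (b < 0xF0) return 3;  /* 0xE0-0xEF */
    if (b < 0xF5) return 4;  /* 0xF0-0xF4: the only four-byte lead bytes kept */
    return 1;                /* 0xF5-0xFF: not a valid lead byte, counted as one byte */
}
```

With this sketch, 0xF4 gives 4 while 0xF5 through 0xF7 give 1, which is the difference between the removed and added lines of the diff.
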

Moriyoshi

On 2011/10/23, at 20:25, Rui Hirokawa wrote:

> Hello, Moriyoshi,
>
> It is because 0xf5-0xf7 is the invalid four byte UTF-8 sequence.
>
> ref: http://en.wikipedia.org/wiki/UTF-8
>
> Rui
>
> Moriyoshi Koizumi wrote:
>> Rui, what is the reason behind this change?
>>
>> Moriyoshi
>>
>> On 2011/10/18, at 23:04, Rui Hirokawa wrote:
>>
>>> hirokawa Tue, 18 Oct 2011 14:04:13 +0000
>>>
>>> Revision: http://svn.php.net/viewvc?view=revision&revision=318184
>>>
>>> Log:
>>> MFH: fixed byte length of utf-8.
>>>
>>> Changed paths:
>>> U php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c
>>>
>>> Modified: php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c
>>> ===================================================================
>>> --- php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c 2011-10-18 14:03:44 UTC (rev 318183)
>>> +++ php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c 2011-10-18 14:04:13 UTC (rev 318184)
>>> @@ -52,7 +52,7 @@
>>>    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>>    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>>    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
>>> -  4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1
>>> +  4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
>>>  };
>>>
>>>  static const char *mbfl_encoding_utf8_aliases[] = {"utf8", NULL};
>>>
>>> --
>>> PHP CVS Mailing List (http://www.php.net/)
>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>

--
Moriyoshi Koizumi
m...@mozo.jp