I thought it was valid given the CEF (character encoding form) nature of
UTF-8, but it turned out that the RFC states that sequences starting with
F5-F7 are not valid.
Thanks for the clarification.
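
For the record, here is a minimal sketch of the rule as I now understand it. The helper names utf8_seq_len and utf8_min_cp4 are made up for illustration and are not part of libmbfl. Unlike the table in the commit, which uses 1 for bytes that cannot start a sequence, this sketch returns 0 for them. A four-byte sequence starting with F5 would encode at least U+140000, which is above the U+10FFFF limit.

```c
#include <stdint.h>

/* Hypothetical helper (not from libmbfl): expected length of a UTF-8
 * sequence given its lead byte, or 0 if the byte cannot start one. */
static int utf8_seq_len(uint8_t lead)
{
    if (lead <= 0x7F) return 1;                 /* ASCII */
    if (lead >= 0xC2 && lead <= 0xDF) return 2; /* C0 and C1 would be overlong */
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4; /* reaches at most U+10FFFF */
    return 0;                                   /* 80-C1 and F5-FF */
}

/* Smallest code point a four-byte sequence with this lead byte would
 * encode, i.e. with all continuation payload bits set to zero. */
static uint32_t utf8_min_cp4(uint8_t lead)
{
    return (uint32_t)(lead & 0x07) << 18;
}
```

Note that this only checks the lead byte. F4 followed by a second byte above 8F is also out of range, so a full validator would need to look at the continuation bytes as well.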

Moriyoshi

On 2011/10/23, at 20:25, Rui Hirokawa wrote:

> Hello, Moriyoshi,
> 
> It is because 0xf5-0xf7 are invalid lead bytes for a four-byte UTF-8 sequence.
> 
> ref: http://en.wikipedia.org/wiki/UTF-8
> 
> Rui
> 
> Moriyoshi Koizumi wrote:
>> Rui, what is the reason behind this change?
>> 
>> Moriyoshi
>> 
>> On 2011/10/18, at 23:04, Rui Hirokawa wrote:
>> 
>>> hirokawa                                 Tue, 18 Oct 2011 14:04:13 +0000
>>> 
>>> Revision: http://svn.php.net/viewvc?view=revision&revision=318184
>>> 
>>> Log:
>>> MFH: fixed byte length of utf-8.
>>> 
>>> Changed paths:
>>>   U   php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c
>>> 
>>> Modified: php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c
>>> ===================================================================
>>> --- php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c	2011-10-18 14:03:44 UTC (rev 318183)
>>> +++ php/php-src/branches/PHP_5_4/ext/mbstring/libmbfl/filters/mbfilter_utf8.c	2011-10-18 14:04:13 UTC (rev 318184)
>>> @@ -52,7 +52,7 @@
>>>     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>>     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>>     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
>>> -   4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1
>>> +   4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
>>> };
>>> 
>>> static const char *mbfl_encoding_utf8_aliases[] = {"utf8", NULL};
>>> 
>>> -- 
>>> PHP CVS Mailing List (http://www.php.net/)
>>> To unsubscribe, visit: http://www.php.net/unsub.php
>> 
> 

-- 
Moriyoshi Koizumi
m...@mozo.jp




