Hi!
Right now, if json_encode sees invalid UTF-8 data, it just truncates the
string at the bad byte: no error returned, no message produced. Example:
var_dump(json_encode("ab\xE0"));
var_dump(json_encode("ab\xE0\""));
Both strings get cut at "ab". I think it's not a good idea to just
silently cut the data. In fact, I think it is a bug caused by this code
in ext/json/utf8_to_utf16.c:
if (c < 0) {
return UTF8_END ? the_index : UTF8_ERROR;
}
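which inherited this bug from code published on json.org. UTF8_END is a
non-zero constant, so the condition is always true and a decoding error
is treated like a normal end of input. It should be: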
if (c < 0) {
return (c == UTF8_END) ? the_index : UTF8_ERROR;
}
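To make the difference concrete, here is a minimal standalone sketch of the
two checks. It is not the actual ext/json code: the helper names
buggy_result and fixed_result are made up for illustration, and the values
-1 and -2 for UTF8_END and UTF8_ERROR are assumptions (all that matters is
that both are negative and UTF8_END is non-zero).

```c
/* Assumed sentinel values, for illustration only. */
#define UTF8_END   (-1)
#define UTF8_ERROR (-2)

/* Hypothetical helper mirroring the current check: UTF8_END is a
 * non-zero constant, so the condition is always true and the_index
 * is returned even when c signals a decoding error. */
int buggy_result(int c, int the_index)
{
    (void)c; /* c is never actually examined */
    return UTF8_END ? the_index : UTF8_ERROR;
}

/* Hypothetical helper mirroring the proposed check: the_index is
 * returned only on a clean end of input, UTF8_ERROR otherwise. */
int fixed_result(int c, int the_index)
{
    return (c == UTF8_END) ? the_index : UTF8_ERROR;
}
```

With c == UTF8_ERROR and the_index == 2 (the "ab" case above), buggy_result
returns 2, which is why the string is cut silently, while fixed_result
returns UTF8_ERROR so the caller can react to it.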
Now this is an easy fix, but it would lead to bad strings being silently
converted to empty strings. The question is: should we raise an error
there? If so, which one: E_WARNING or E_NOTICE? I'm for E_WARNING.
Also filed as bug #43941.
Any comments?
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED] http://www.zend.com/
(408)253-8829 MSN: [EMAIL PROTECTED]
--
PHP Internals - PHP Runtime Development Mailing List