Re: [PHP-DEV] json_encode() bug

2008-04-21 Thread Stanislav Malyshev

Hi!

This should really be fixed similar to the iconv //IGNORE flag  - so 
that bad characters are just replaced with '?'


Maybe using iconv is a way to go in your scenario?

--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-04-18 Thread Alan Knowles
This should really be fixed similar to the iconv //IGNORE flag  - so 
that bad characters are just replaced with '?'


We use it to render spam email summaries, and dont really care if the 
encoding is incorrect, just as long as it shows something.


Throwing a warning without having a fix/workaround,  just  reduces the 
usefulness of the function.


Regards
Alan

Stanislav Malyshev wrote:

Hi!

Right now, if json_encode sees wrong UTF-8 data, it just cuts the 
string in the middle, no error returned, no message produced. Example:


var_dump(json_encode("ab\xE0"));
var_dump(json_encode("ab\xE0\""));

Both strings get cut at "ab". I think it's not a good idea to just 
silently cut the data. In fact, I think it is a bug caused by this 
code in ext/json/utf8_to_utf16.c:

if (c < 0) {
return UTF8_END ? the_index : UTF8_ERROR;
}
which inherited this bug from code published on json.org. It should be:
if (c < 0) {
return (c == UTF8_END) ? the_index : UTF8_ERROR;
}
Now this is an easy fix but would lead to bad strings silently 
converted to empty strings. The question is - should we have an error 
there? If so, which one - E_WARNING, E_NOTICE? I'm for E_WARNING.

Also filed as bug #43941.
Any comments?



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-06 Thread Derick Rethans
On Tue, 5 Feb 2008, Richard Lynch wrote:

> On Sun, February 3, 2008 7:51 pm, Stanislav Malyshev wrote:
> >> Like I mentioned before (I think), it should not return an empty 
> >> string of course because programmatically it's not possible to 
> >> check for this. As most of our functions return false in those 
> >> cases, so should this function.
> >
> > AFAIR false is not valid JSON, so it would break a lot of code. 
> > Also, I am not sure we should change json_encode to return false on 
> > whole structure if one of the fields contains invalid utf-8.

[snip]

> Put partially decoded value somewhere in a variable to be accessed, so
> that the programmer can make an informed decision about what to do.

We're talking about encoding here though...

regards,
Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-05 Thread Richard Lynch
On Sun, February 3, 2008 7:51 pm, Stanislav Malyshev wrote:
>> Like I mentioned before (I think), it should not return an empty
>> string
>> of course because programmatically it's not possible to check for
>> this.
>> As most of our functions return false in those cases, so should this
>> function.
>
> AFAIR false is not valid JSON, so it would break a lot of code. Also,
> I
> am not sure we should change json_encode to return false on whole
> structure if one of the fields contains invalid utf-8.

I see what you are saying, but...

If you want to write good quality code, you have to do something
reasonable here.

PERHAPS:

Return false

E_WARNING

Put partially decoded value somewhere in a variable to be accessed, so
that the programmer can make an informed decision about what to do.

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/from/lynch
Yeah, I get a buck. So?

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-04 Thread Derick Rethans
On Sun, 3 Feb 2008, Stanislav Malyshev wrote:

> > Like I mentioned before (I think), it should not return an empty string of
> > course because programmatically it's not possible to check for this. As most
> > of our functions return false in those cases, so should this function.
> 
> AFAIR false is not valid JSON, so it would break a lot of code. Also, I am not
> sure we should change json_encode to return false on whole structure if one of
> the fields contains invalid utf-8.

Of course we should make it possible to detect if there is broken data. 
I think our normal mantra is "be strict with creating, be lenient while 
parsing". Because we're generating data here, we should *abort* if we 
find an error, and not return half baked results. To detect if we have 
broken data, the option is to check a return value. As normally this 
function returns a string, false (or null) is a good candidate for an 
error-result.

regards,
Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-04 Thread Stanislav Malyshev
Like I mentioned before (I think), it should not return an empty string 
of course because programmatically it's not possible to check for this. 
As most of our functions return false in those cases, so should this 
function.


AFAIR false is not valid JSON, so it would break a lot of code. Also, I 
am not sure we should change json_encode to return false on whole 
structure if one of the fields contains invalid utf-8.


--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-02 Thread Derick Rethans
On Fri, 25 Jan 2008, Stanislav Malyshev wrote:

> Hi!
> 
> Right now, if json_encode sees wrong UTF-8 data, it just cuts the string in
> the middle, no error returned, no message produced. Example:
> 
> var_dump(json_encode("ab\xE0"));
> var_dump(json_encode("ab\xE0\""));
> 
> Both strings get cut at "ab". I think it's not a good idea to just silently
> cut the data. In fact, I think it is a bug caused by this code in
> ext/json/utf8_to_utf16.c:
> if (c < 0) {
> return UTF8_END ? the_index : UTF8_ERROR;
> }
> which inherited this bug from code published on json.org. It should be:
> if (c < 0) {
> return (c == UTF8_END) ? the_index : UTF8_ERROR;
> }
> Now this is an easy fix but would lead to bad strings silently converted to
> empty strings. The question is - should we have an error there? If so, which
> one - E_WARNING, E_NOTICE? I'm for E_WARNING.
> Also filed as bug #43941.
> Any comments?

Like I mentioned before (I think), it should not return an empty string 
of course because programmatically it's not possible to check for this. 
As most of our functions return false in those cases, so should this 
function.

Derick

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-01 Thread Pierre Joye
On Feb 1, 2008 9:06 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> > Stupid question, who actually checks for E_* in his code at runtime
> > after having called such functions? Not me and I would hate to. It
> > sounds to me like a perfect exception use case. As this function can
>
> General policy in PHP as far as I know is that non-OO functions do not
> do exceptions.
>
> > return nearly everything scalar we have (boolean, string, null,
> > integer,...) we can't use our normal "returns FALSE on failure".
>
> This function can return only a string, but I'm not sure returning false
> if somewhere in some value deep down your data is some bad utf-8 is
> warranted. It is doable, though, but I'm afraid most of current code
> never check for return of json_encode() so they are in for big surprises
>   when json_encode won't produce valid JSON.

Oh right, I thought about/read json_decode. But the idea is the same
yes. Not checking return values is a very bad habit, I can't count how
many times I saw codes dying miserably only because they don't check
returns values. Would it make sense to use exception for such cases?
Aka to change our policy? I'm not an excpeption fan, but these cases
are good examples of exception usages. We can invent a meta type
"error" but it would be rather confusing  ;)

-- 
Pierre
http://blog.thepimp.net | http://www.libgd.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-01 Thread Stanislav Malyshev

Stupid question, who actually checks for E_* in his code at runtime
after having called such functions? Not me and I would hate to. It
sounds to me like a perfect exception use case. As this function can


General policy in PHP as far as I know is that non-OO functions do not 
do exceptions.



return nearly everything scalar we have (boolean, string, null,
integer,...) we can't use our normal "returns FALSE on failure".


This function can return only a string, but I'm not sure returning false 
if somewhere in some value deep down your data is some bad utf-8 is 
warranted. It is doable, though, but I'm afraid most of current code 
never check for return of json_encode() so they are in for big surprises 
 when json_encode won't produce valid JSON.

--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-01 Thread Pierre Joye
On Feb 1, 2008 8:13 PM, Cristian Rodriguez <[EMAIL PROTECTED]> wrote:
> 2008/1/25, Stanislav Malyshev <[EMAIL PROTECTED]>:
>
> > Now this is an easy fix but would lead to bad strings silently converted
> > to empty strings. The question is - should we have an error there? If
> > so, which one - E_WARNING, E_NOTICE? I'm for E_WARNING.
>
> Yes , E_WARNING is the right thing to have IMHO.

Stupid question, who actually checks for E_* in his code at runtime
after having called such functions? Not me and I would hate to. It
sounds to me like a perfect exception use case. As this function can
return nearly everything scalar we have (boolean, string, null,
integer,...) we can't use our normal "returns FALSE on failure".

Cheers,
-- 
Pierre
http://blog.thepimp.net | http://www.libgd.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] json_encode() bug

2008-02-01 Thread Cristian Rodriguez
2008/1/25, Stanislav Malyshev <[EMAIL PROTECTED]>:

> Now this is an easy fix but would lead to bad strings silently converted
> to empty strings. The question is - should we have an error there? If
> so, which one - E_WARNING, E_NOTICE? I'm for E_WARNING.

Yes , E_WARNING is the right thing to have IMHO.

-- 
http://www.cristianrodriguez.net

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] json_encode() bug

2008-01-25 Thread Stanislav Malyshev

Hi!

Right now, if json_encode sees wrong UTF-8 data, it just cuts the string 
in the middle, no error returned, no message produced. Example:


var_dump(json_encode("ab\xE0"));
var_dump(json_encode("ab\xE0\""));

Both strings get cut at "ab". I think it's not a good idea to just 
silently cut the data. In fact, I think it is a bug caused by this code 
in ext/json/utf8_to_utf16.c:

if (c < 0) {
return UTF8_END ? the_index : UTF8_ERROR;
}
which inherited this bug from code published on json.org. It should be:
if (c < 0) {
return (c == UTF8_END) ? the_index : UTF8_ERROR;
}
Now this is an easy fix but would lead to bad strings silently converted 
to empty strings. The question is - should we have an error there? If 
so, which one - E_WARNING, E_NOTICE? I'm for E_WARNING.

Also filed as bug #43941.
Any comments?
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php