#45989 [Com]: json_decode() passes through certain invalid JSON strings

2008-12-23 Thread bruno dot p dot reis at gmail dot com
 ID:   45989
 Comment by:   bruno dot p dot reis at gmail dot com
 Reported By:  steven at acko dot net
 Status:   To be documented
 Bug Type: JSON related
 Operating System: Mac OS X
 PHP Version:  5.2.6
 New Comment:

I agree with kevin at metalaxe dot com, 

throwing an exception (may be even a InvalidArgumentException) on
malformed json would be a much more decent way to say the json is
invalid and would clarify a lot the behaviour. 

Other good thing would be another function just to check if a json is
valid or not.


Previous Comments:


[2008-12-12 23:21:57] scott...@php.net

Applied the patch provided by magicaltux



[2008-12-12 07:51:16] kevin at metalaxe dot com

The JSON spec states:
"
A JSON text is a sequence of tokens.  The set of tokens includes six
   structural characters, strings, numbers, and three literal names.

A JSON text is a serialized object or array.
"

So, in order to maintain compliance, PHP must also support
non-objects/arrays as input properly.

If I understand your patch correctly:

If the input is json_decode("null"); the output would be NULL (I saw no
test case for null input in the patch itself). We would have no way of
knowing a problem exists if one were to have an input of
json_decode('[');.

Can't this function throw an exception on failure? Failing that,could
we at least get a PHP warning? Otherwise it will be impossible to full
rely on this function in the case where null is the actual input...



[2008-12-03 22:29:58] magical...@php.net

And here are patches against PHP_5_3 and HEAD:

http://ookoo.org/svn/snip/php_5_3-json-returntype-final-fix.patch

http://ookoo.org/svn/snip/php_head-json-returntype-final-fix.patch

Some tests now work on json on HEAD (less failure than what's currently
displayed on gcov.php.net) but still two fails. As those failures are
not within the scope of this bug (and are specific to HEAD) they be
fixed in different patches.

I believe that once this is commited to the CVS, this bug should be
marked as "To be documented". I also believe till wants to submit some
additional tests for those this issue...



[2008-12-03 21:17:33] magical...@php.net

Just a note for documentation:

http://docs.php.net/json_decode

Right now the documentation says the function returns an object, OR an
array. This is not strictly true as it may return a string, a boolean,
an integer, a double... depending on the input.

Also, the fact json_decode() may return NULL on error isn't explicitly
documented either, instead some examples which happens to return NULL
with the current implementation are provided. I think it would be a good
idea to explicitly document this behavior, if the change I'm proposing
here is accepted.



[2008-12-03 21:10:50] magical...@php.net

Ok guys, I've had a look at the CVS history for json, and checked why
it was following this weird behaviour (returning what was passed in some
cases, and NULL in other cases).

The CVS commit log message for this relates to bug #38680, however it
seems that the behaviour in parsing strings not handled by json is doing
too much to try to "fix" things and find a way to provide parsed value.

Anyway here's a patch that changes this behaviour to make json_decode()
return NULL when we get invalid JSON data, while still keeping null,
true, false and integers parsing.

Some tests were fixed (the result depended on broken behaviour), and
the other tests still run fine.


The patch itself, against PHP_5_2:

http://ookoo.org/svn/snip/php_5_2-json-returntype-final-fix.patch


If nobody can find anything against this (being a bit more strict with
obviously wrong values) I'll add patchs against HEAD and PHP_5_3.



The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/45989

-- 
Edit this bug report at http://bugs.php.net/?id=45989&edit=1



#45989 [Com]: json_decode() passes through certain invalid JSON strings

2008-12-11 Thread kevin at metalaxe dot com
 ID:   45989
 Comment by:   kevin at metalaxe dot com
 Reported By:  steven at acko dot net
 Status:   Assigned
 Bug Type: JSON related
 Operating System: Mac OS X
 PHP Version:  5.2.6
 Assigned To:  magicaltux
 New Comment:

The JSON spec states:
"
A JSON text is a sequence of tokens.  The set of tokens includes six
   structural characters, strings, numbers, and three literal names.

A JSON text is a serialized object or array.
"

So, in order to maintain compliance, PHP must also support
non-objects/arrays as input properly.

If I understand your patch correctly:

If the input is json_decode("null"); the output would be NULL (I saw no
test case for null input in the patch itself). We would have no way of
knowing a problem exists if one were to have an input of
json_decode('[');.

Can't this function throw an exception on failure? Failing that,could
we at least get a PHP warning? Otherwise it will be impossible to full
rely on this function in the case where null is the actual input...


Previous Comments:


[2008-12-03 22:29:58] magical...@php.net

And here are patches against PHP_5_3 and HEAD:

http://ookoo.org/svn/snip/php_5_3-json-returntype-final-fix.patch

http://ookoo.org/svn/snip/php_head-json-returntype-final-fix.patch

Some tests now work on json on HEAD (less failure than what's currently
displayed on gcov.php.net) but still two fails. As those failures are
not within the scope of this bug (and are specific to HEAD) they be
fixed in different patches.

I believe that once this is commited to the CVS, this bug should be
marked as "To be documented". I also believe till wants to submit some
additional tests for those this issue...



[2008-12-03 21:17:33] magical...@php.net

Just a note for documentation:

http://docs.php.net/json_decode

Right now the documentation says the function returns an object, OR an
array. This is not strictly true as it may return a string, a boolean,
an integer, a double... depending on the input.

Also, the fact json_decode() may return NULL on error isn't explicitly
documented either, instead some examples which happens to return NULL
with the current implementation are provided. I think it would be a good
idea to explicitly document this behavior, if the change I'm proposing
here is accepted.



[2008-12-03 21:10:50] magical...@php.net

Ok guys, I've had a look at the CVS history for json, and checked why
it was following this weird behaviour (returning what was passed in some
cases, and NULL in other cases).

The CVS commit log message for this relates to bug #38680, however it
seems that the behaviour in parsing strings not handled by json is doing
too much to try to "fix" things and find a way to provide parsed value.

Anyway here's a patch that changes this behaviour to make json_decode()
return NULL when we get invalid JSON data, while still keeping null,
true, false and integers parsing.

Some tests were fixed (the result depended on broken behaviour), and
the other tests still run fine.


The patch itself, against PHP_5_2:

http://ookoo.org/svn/snip/php_5_2-json-returntype-final-fix.patch


If nobody can find anything against this (being a bit more strict with
obviously wrong values) I'll add patchs against HEAD and PHP_5_3.



[2008-12-02 18:52:36] steven at acko dot net

till said:
"but it's supposed to return the string as is -- in case it's a literal

type, but why does it in some cases return "null" then?"

What argument is there for having (some) unparseable sequences returned

as is? If json_decode() returns a string, then that should mean that
the 
input was a valid JSON encoding of that string, no?

The only literal types JSON allows are numbers and the pre-defined 
constants 'true' 'false' and 'null'. Strings must be quote-delimited.

The fact that you can switch between 'return NULL' and 'return the 
argument as-is' just by adding/removing a leading space is a pretty big

sign that something is wrong here. To be honest, it seems a bit silly 
that this is even an argument.



[2008-12-01 17:16:06] t...@php.net

Just to add to this:

I know that the function is not supposed to be a JSON validator, but
it's supposed to return the string as is -- in case it's a literal type,
but why does it in some cases return "null" then?

For example:
$bad_json = "{ 'bar': 'baz' }";
json_decode($bad_json); // null

I know this is "probably" an edge-case but $bad_json could be my own
/valid/ string -- not valid JSON. Because a string could look like
anything. Point well taken, I'm passing in a pretty /funky/ looking
string. But instead of "NULL", json_decode should retu