#45989 [Opn]: json_decode() passes through certain invalid JSON strings

2008-12-02 Thread steven at acko dot net
 ID:   45989
 User updated by:  steven at acko dot net
 Reported By:  steven at acko dot net
 Status:   Open
 Bug Type: JSON related
 Operating System: Mac OS X
 PHP Version:  5.2.6
 New Comment:

till said:
but it's supposed to return the string as is -- in case it's a literal
type, but why does it in some cases return null then?

What argument is there for having (some) unparseable sequences returned
as is? If json_decode() returns a string, then that should mean that the
input was a valid JSON encoding of that string, no?

The only literal types JSON allows are numbers and the pre-defined 
constants 'true' 'false' and 'null'. Strings must be quote-delimited.
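
As a rough illustration of that rule, the snippet below decodes a few
top-level scalar documents. It is only a sketch: the sample inputs are
made up for this note, and the behaviour described is that of PHP 7 and
later, not necessarily the 5.2.6 build this report is about. There, the
quoted "abc" decodes to a PHP string, while the unquoted abc is rejected
and yields NULL.

```php
<?php
// Illustrative inputs (not taken from the report): three JSON constants,
// a number, a quoted string, and an unquoted word that is not valid JSON.
$samples = array('true', 'false', 'null', '12.5', '"abc"', 'abc');

$decoded = array();
foreach ($samples as $json) {
    // Key each result by its input so the two are easy to compare.
    $decoded[$json] = json_decode($json);
}

var_dump($decoded);
```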

The fact that you can switch between 'return NULL' and 'return the
argument as-is' just by adding/removing a leading space is a pretty big
sign that something is wrong here. To be honest, it seems a bit silly
that this is even an argument.


Previous Comments:


[2008-12-01 17:16:06] [EMAIL PROTECTED]

Just to add to this:

I know that the function is not supposed to be a JSON validator, but
it's supposed to return the string as is -- in case it's a literal type,
but why does it in some cases return null then?

For example:
$bad_json = "{ 'bar': 'baz' }";
json_decode($bad_json); // null

I know this is probably an edge-case but $bad_json could be my own
/valid/ string -- not valid JSON. Because a string could look like
anything. Point well taken, I'm passing in a pretty /funky/ looking
string. But instead of NULL, json_decode should return the string
as-is.

That is, according to the documentation, a bug. ;-)

Lots of people also seemed to rely on json_decode as a json validator.
Which is -- once you understand the subtle differences -- not the case.

The case should be made for either one though.



[2008-11-17 15:23:35] [EMAIL PROTECTED]

@Iliaa:

Could this bug be re-evaluated, or could a more detailed explanation be
given as to why the docs sometimes note that NULL is returned on invalid
JSON, and why json_decode() sometimes returns the string instead?

If the function returns whatever then the docs should be updated to
tell the user to not rely on what is returned by json_decode at all.
;-)

I double-checked some of Steve's examples on jsonlint.com (which is in
most docs cited as the reference validator for json data) and they all
show up as invalid.

I also built the most recent 5.2.7 snapshot:
./configure --disable-all --enable-json

[EMAIL PROTECTED]:~/php5.2-200811171330$ ./sapi/cli/php test-45989.php 
string(14) "'invalid json'"
string(12) "invalid json"
string(2) " {"
string(2) " ["
[EMAIL PROTECTED]:~/php5.2-200811171330$ ./sapi/cli/php --ini
Configuration File (php.ini) Path: /usr/local/lib
Loaded Configuration File: (none)
Scan for additional .ini files in: (none)
Additional .ini files parsed:  (none)
[EMAIL PROTECTED]:~/php5.2-200811171330$ ./sapi/cli/php -m
[PHP Modules]
date
json
Reflection
standard

[Zend Modules]


I'm gonna write a test and send it to QA too.



[2008-09-10 01:14:23] steven at acko dot net

Please clarify the bogus classification.

The following each returns NULL, as expected:

var_dump(json_decode('[')); // unmatched bracket
var_dump(json_decode('{')); // unmatched brace
var_dump(json_decode('{}}'));   // unmatched brace
var_dump(json_decode('{error error}')); // invalid object key/value notation
var_dump(json_decode('["\"]')); // unclosed string
var_dump(json_decode('[" \x "]'));  // invalid escape code

Yet the following each returns the literal argument as a string:

var_dump(json_decode(' ['));
var_dump(json_decode(' {'));
var_dump(json_decode(' {}}'));
var_dump(json_decode(' {error error}')); 
var_dump(json_decode('"\"'));
var_dump(json_decode('" \x "'));
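
Several inputs in the second list are just the first list with a leading
space added, so the inconsistency can be checked mechanically. The
sketch below assumes a hypothetical helper, decodes_consistently(),
which is not part of PHP: it decodes a payload with and without a
leading space and compares the serialized results. JSON allows
whitespace around a value, so a correct parser should give the same
answer both ways.

```php
<?php
// Hypothetical helper (not part of PHP): true when a payload decodes to
// the same value with and without a leading space.
function decodes_consistently($payload)
{
    $plain  = json_decode($payload);
    $padded = json_decode(' ' . $payload);

    // serialize() gives a comparable form for NULL, scalars, arrays and
    // stdClass objects alike.
    return serialize($plain) === serialize($padded);
}

// Inputs taken from the first list in this comment.
foreach (array('[', '{', '{}}', '{error error}') as $payload) {
    var_dump(decodes_consistently($payload));
}
```

Going by the results listed in this comment, the affected 5.2.x builds
would be expected to report a mismatch for these inputs.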

Please examine the examples closely: they are all meaningless, invalid
JSON. Even under the most widely stretched definition of JSON, the above
is not JSON encoded data. Yet json_decode() arbitrarily returns /some of
it/ as a string... and in a way that looks suspiciously like a bad
parser implementation.

If this was merely a case of json_decode() returning /all/ invalid json
as is, then it could be classified as an implementation quirk. But
because of how inconsistent it is now, you can't say that it is by
design or following any kind of spec.

E.g. how would you currently see if json_decode() succeeded or not?
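
For reference, later PHP releases (5.3.0 and up, so not the 5.2.x builds
discussed here) added json_last_error(), which gives one way to answer
that question. A minimal sketch, assuming a hypothetical wrapper named
try_json_decode() that is not part of PHP:

```php
<?php
// Hypothetical wrapper (not part of PHP): returns true when $json parsed
// cleanly and stores the decoded value in $value.
function try_json_decode($json, &$value)
{
    $value = json_decode($json);

    // json_last_error() reports on the most recent json_decode() call.
    if (json_last_error() !== JSON_ERROR_NONE) {
        $value = null;
        return false;
    }
    return true;
}

var_dump(try_json_decode('null', $v)); // valid document whose value is NULL
var_dump(try_json_decode(' {', $v));   // one of the inputs from this report
```

Checking the error code rather than the return value matters because a
valid document such as 'null' legitimately decodes to NULL.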



[2008-09-10 00:38:09] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

.


#45989 [Opn]: json_decode() passes through certain invalid JSON strings

2008-12-01 Thread till
 ID:   45989
 Updated by:   [EMAIL PROTECTED]
 Reported By:  steven at acko dot net
 Status:   Open
 Bug Type: JSON related
 Operating System: Mac OS X
 PHP Version:  5.2.6
 New Comment:

Just to add to this:

I know that the function is not supposed to be a JSON validator, but
it's supposed to return the string as is -- in case it's a literal type,
but why does it in some cases return null then?

For example:
$bad_json = "{ 'bar': 'baz' }";
json_decode($bad_json); // null

I know this is probably an edge-case but $bad_json could be my own
/valid/ string -- not valid JSON. Because a string could look like
anything. Point well taken, I'm passing in a pretty /funky/ looking
string. But instead of NULL, json_decode should return the string
as-is.

That is, according to the documentation, a bug. ;-)

Lots of people also seemed to rely on json_decode as a json validator.
Which is -- once you understand the subtle differences -- not the case.

The case should be made for either one though.


Previous Comments:



[2008-09-04 00:32:20] steven at acko dot net

Description:

When json_decode() is given certain invalid JSON strings, it will return
the literal string as the result, rather than returning NULL.

Note: in #38680, the decision was made to allow json_decode() to accept
literal basic types (strings, ints, ...) even though this is not allowed
by RFC 4627 (which only allows objects/arrays). This bug report is
different because even under the PHP interpretation of JSON, these
strings cannot be considered valid, and trivial variations on them do
in fact throw an error as expected.
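
For callers who want the stricter RFC 4627 reading (top level must be an
object or array), one possible sketch is to decode in associative mode
and accept only arrays. The helper name decode_rfc4627() is hypothetical
and not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): accept only documents whose top
// level is a JSON object or array, as RFC 4627 requires.
function decode_rfc4627($json)
{
    // With the second argument set to true, JSON objects and arrays both
    // decode to PHP arrays; scalars and failed decodes do not.
    $value = json_decode($json, true);

    return is_array($value) ? $value : null;
}

var_dump(decode_rfc4627('{"bar": "baz"}')); // object at top level
var_dump(decode_rfc4627('"baz"'));          // bare string at top level
```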

(The non-standard behaviour introduced in #38680 is not documented at
all by the way, which is kind of ironic given the numerous issues that
have 'go read the spec' as the