Hi internals!
While browsing bugs.php.net I came across this SimpleXML issue with 252
votes: https://bugs.php.net/bug.php?id=54632
TLDR: when you have an XML document (slightly modified from the example in the
bug tracker):
<?xml version="1.0" encoding="UTF-8" ?>
<a><b id="foo">foo</b>bar</a>
And you load it into SimpleXML, the result of calling
json_encode($the_simplexml_object) on it is:
{"b":{"@attributes":{"id":"foo"}}}
There are two strange things here:
- Where is a?
- Where is the text for b (and a)?
What's going on here is that json_encode() gives the JSON representation of
what var_dump() gives you.
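To reproduce this, here is a minimal sketch. It assumes the simplexml and json extensions are loaded, and the variable names are only illustrative:

```php
<?php
// The example document from above.
$xml = '<?xml version="1.0" encoding="UTF-8" ?>' . "\n"
     . '<a><b id="foo">foo</b>bar</a>';

$element = simplexml_load_string($xml);

// json_encode() serialises the element's property table,
// which is closely related to what var_dump() displays.
$json = json_encode($element);
$decoded = json_decode($json, true);

echo $json, PHP_EOL;
var_dump($element);
```

The printed JSON has no key for the root element, and the trailing text "bar" does not appear anywhere in it.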
This behaviour is perceived as a bug, given the number of votes and the comment
section.
It's possible to change the JSON encoding without affecting var_dump() or the
way you access SimpleXML objects.
One comment suggests the following JSON representation for the above XML:
{"a":{"b":{"@attributes":{"id":"foo"},"@text":"foo"},"@text":"bar"}}
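One way to experiment with that shape in userland today, without touching var_dump() or property access, is a subclass that implements JsonSerializable. This is only a sketch: the class name TextAwareElement and its helper method are made up for illustration, and it collapses all direct text of an element into a single "@text" string.

```php
<?php
// Hypothetical subclass for illustration; not part of SimpleXML itself.
class TextAwareElement extends SimpleXMLElement implements JsonSerializable
{
    // json_encode() calls this instead of reading the property table.
    public function jsonSerialize(): mixed
    {
        return [$this->getName() => $this->toProposedShape()];
    }

    // Builds the "@attributes" / child tags / "@text" layout for one element.
    private function toProposedShape(): object
    {
        $shape = [];

        foreach ($this->attributes() as $name => $value) {
            $shape['@attributes'][$name] = (string) $value;
        }

        // Group children by tag name, like SimpleXML does today.
        $grouped = [];
        foreach ($this->children() as $name => $child) {
            $grouped[$name][] = $child->toProposedShape();
        }
        foreach ($grouped as $name => $items) {
            $shape[$name] = count($items) === 1 ? $items[0] : $items;
        }

        // The string cast yields the element's own text, not its descendants'.
        $text = trim((string) $this);
        if ($text !== '') {
            $shape['@text'] = $text;
        }

        // Cast to object so that an empty element encodes as {} rather than [].
        return (object) $shape;
    }
}

$element = simplexml_load_string(
    '<a><b id="foo">foo</b>bar</a>',
    TextAwareElement::class
);
$json = json_encode($element);
echo $json, PHP_EOL;
```

Because only json_encode() goes through jsonSerialize(), var_dump($element) and $element->b behave as before.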
This seems reasonable. Let's take a look at how multiple tags are handled right
now and how that would work for text nodes.
SimpleXML currently handles multiple tags with the same name by placing them in
an array:
Given: <?xml version="1.0" encoding="UTF-8" ?><a><b id="foo"/><x/><y/><x/></a>
You'll get: {"b":{"@attributes":{"id":"foo"}},"x":[{},{}],"y":{}}
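A quick way to check the grouping, again as a sketch with illustrative variable names:

```php
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?><a><b id="foo"/><x/><y/><x/></a>';

// Round-trip through JSON so the structure can be inspected as plain arrays.
$decoded = json_decode(json_encode(simplexml_load_string($xml)), true);

// The two <x/> elements end up in one list; <y/> stays a single value.
var_dump(count($decoded['x']));
var_dump($decoded['b']['@attributes']['id']);
var_dump(array_keys($decoded));
```

Note that the key order in the result reflects the first occurrence of each tag name, not the position of every individual element.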
We could do the same for text nodes. Given: <?xml version="1.0"
encoding="UTF-8" ?><a><b id="foo"/>foo<x/>bar<y/>baz<x/></a>
Could give: {"a":{"b":{"@attributes":{"id":"foo"}},"x":[{},{}],"y":{},
"@text":["foo","bar","baz"]}}
Now, this would still not allow reconstructing the document from the JSON,
however, as the ordering between tags and text is lost (just as the ordering
between different tags is lost now).
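The ordering problem can be made concrete with a small sketch. It assumes the dom extension is available next to simplexml, and direct_text_nodes() is a hypothetical helper, not an existing function:

```php
<?php
// Hypothetical helper: lists an element's direct, non-blank text nodes.
// It goes through DOM because SimpleXML does not expose individual text nodes.
function direct_text_nodes(SimpleXMLElement $element): array
{
    $texts = [];
    foreach (dom_import_simplexml($element)->childNodes as $child) {
        if ($child instanceof DOMText && trim($child->nodeValue) !== '') {
            $texts[] = $child->nodeValue;
        }
    }
    return $texts;
}

$mixed = simplexml_load_string('<a><b id="foo"/>foo<x/>bar<y/>baz<x/></a>');
$texts = direct_text_nodes($mixed);  // one entry per text node
$joined = (string) $mixed;           // the string cast concatenates them

// Two documents that differ only in where the text sits relative to <x/>.
$first  = direct_text_nodes(simplexml_load_string('<a><x/>foo</a>'));
$second = direct_text_nodes(simplexml_load_string('<a>foo<x/></a>'));

var_dump($texts, $joined, $first === $second);
```

Both small documents yield the same text list and the same single x tag, so a "@text" array on its own cannot tell them apart.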
I'm not sure what the community specifically wants here.
Are there opinions on how this should behave?
Kind regards
Niels
--
PHP Internals - PHP Runtime Development Mailing List