ID:               44648
 User updated by:  [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
 Status:           Open
 Bug Type:         DOM XML related
 Operating System: Windows Vista
 PHP Version:      5.2.5
 Assigned To:      rrichards
 New Comment:

One more: ]]> is not allowed in CDATA blocks.

I also suspect that the other XML extensions have bugs here.
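
For the CDATA case, a userland guard might look like the sketch below.
is_valid_cdata_text() is a hypothetical helper, not part of PHP's DOM
API; it only encodes the rule that "]]>" must not appear inside a CDATA
section.

```php
<?php
// Hypothetical helper (an assumption for illustration, not a DOM API).
// A CDATA section ends at the first "]]>", so that sequence cannot
// appear inside its content.
function is_valid_cdata_text($text)
{
    return strpos($text, ']]>') === false;
}
```

A caller would run this before handing the text to
$d->createCDATASection().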


Previous Comments:
------------------------------------------------------------------------

[2008-04-05 23:02:02] [EMAIL PROTECTED]

IIRC, DOM does not make any demands on names or things like that.
libxml2, which is known for its strictness, doesn't either. So, I'm
still hoping that you'll let the checks be turned off. :-)

Some things from my investigation:

- Double hyphens (--) are not allowed in comments
- Most of the text inputs don't check for UTF-8 well-formedness.
I haven't tested numeric character references either, but those look
suspicious
- Undeclared namespace prefixes in names
($d->appendChild($d->createElement('foo:bar')); results in invalid XML,
as the foo prefix was never declared)

All of these, and probably some more, result in $d->saveXML() producing
invalid XML.
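
For the double-hyphen item, a userland guard could look like the sketch
below. Both function names are hypothetical and not part of the DOM
extension; the check only encodes the XML 1.0 rule that comment text
must not contain "--" or end with "-".

```php
<?php
// Hypothetical helper (an assumption for illustration, not a DOM API).
// XML 1.0 forbids "--" inside a comment, and the text may not end in
// "-" because that would produce "--->" when serialized.
function is_valid_comment_text($text)
{
    return strpos($text, '--') === false && substr($text, -1) !== '-';
}

// Hypothetical wrapper: refuse to create a comment that could not be
// serialized as well-formed XML.
function create_comment_checked(DOMDocument $d, $text)
{
    if (!is_valid_comment_text($text)) {
        throw new InvalidArgumentException('Invalid comment text');
    }
    return $d->createComment($text);
}
```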

------------------------------------------------------------------------

[2008-04-05 22:54:04] [EMAIL PROTECTED]

assign to self.

The strictness is dependent upon the DOM specs and setAttribute should
be throwing an exception in that case. While I am going to go through
and check other methods, let me know if you come across any others that
are not validating names correctly.



------------------------------------------------------------------------

[2008-04-05 21:55:06] [EMAIL PROTECTED]

Description:
------------
libxml2 is fairly lenient when it comes to what it allows to go into
its nodes; you can set attributes and tags with illegal characters in
them and it won't complain. The burden is on the userland code to
perform an appropriate check with the xmlValidate*() functions.

PHP's DOM implementation is extremely spotty when it comes to these
checks, which allows for some broken XML to easily be generated. For
example,

$d = new DOMDocument();
$d->appendChild($n = $d->createElement('a'));
$n->setAttribute('"@', 'foo');
echo $d->saveXML();

outputs:

<?xml version="1.0"?>
<a "@="foo"/>

Which is clearly incorrect. However, if I attempt to

$d->createElement('a@');

DOM complains, because xmlValidateName was called on the element name.
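
Until the checks are consistent, userland code can validate names itself
before calling into DOM. The sketch below is only an approximation:
is_ascii_xml_name() and set_attribute_checked() are hypothetical helpers,
not part of PHP's DOM API, and the regex covers only the ASCII subset of
the XML 1.0 Name production, so it rejects some legal non-ASCII names.

```php
<?php
// Hypothetical helper (an assumption for illustration, not a DOM API).
// ASCII-only approximation of the XML 1.0 "Name" production: a letter,
// "_" or ":" first, then letters, digits, "_", ":", "." or "-".
// The D modifier stops "$" from matching before a trailing newline.
function is_ascii_xml_name($name)
{
    return preg_match('/^[A-Za-z_:][A-Za-z0-9_:.\-]*$/D', $name) === 1;
}

// Hypothetical wrapper: reject a bad attribute name before it reaches
// the tree, instead of relying on setAttribute() to check it.
function set_attribute_checked(DOMElement $element, $name, $value)
{
    if (!is_ascii_xml_name($name)) {
        throw new InvalidArgumentException('Invalid attribute name');
    }
    $element->setAttribute($name, $value);
}
```

With this in place, set_attribute_checked($n, '"@', 'foo') throws rather
than letting the bad name reach saveXML().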

Now, I actually don't mind the lack of checking; the DOM tree is useful
for things like HTML, where the rules are slightly different from XML's;
an HTML tree can contain an "a@" node, although it would not be valid
HTML. (You can try it out for yourself in Firefox by putting that in a
document and then inspecting the DOM.)

However, I want consistency, and I also want the ability to switch on
strict checking when I so desire (especially when I'm producing XML). So
I want all-or-nothing production checks in PHP DOM, adding another
property in DOMDocument (or maybe even a global libxml configuration
option) that specifies whether or not strict production checking should
be done.



------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=44648&edit=1
