ID: 50545
Updated by: [email protected]
-Summary: PHP dropping entities
Reported By: aclark at wayfm dot com
-Status: Open
+Status: Feedback
-Bug Type: *General Issues
+Bug Type: DOM XML related
Operating System: Gentoo Linux
PHP Version: 5.2.12
New Comment:
1. What libxml version is PHP compiled with? (Check the output of
phpinfo().)
2. Your reproduce script is incomplete: it's not possible to
copy'n'paste it and just run it.
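The libxml version can also be read without going through the full phpinfo() output. A minimal sketch, assuming the libxml extension is loaded; the helper name report_versions() is only illustrative:

```php
<?php
// Hypothetical helper: collects the version details asked for above.
function report_versions() {
    return array(
        'php'    => PHP_VERSION,            // version of the running PHP
        'libxml' => LIBXML_DOTTED_VERSION,  // libxml2 version PHP was built with
    );
}

foreach (report_versions() as $name => $version) {
    echo $name, ': ', $version, "\n";
}
```

Pasting those two lines of output into the report should be enough to answer the first question.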
Previous Comments:
------------------------------------------------------------------------
[2009-12-21 17:45:51] aclark at wayfm dot com
Description:
------------
After a recent PHP upgrade (to 5.2.11-r1), some existing code on a few
of my sites suddenly "broke."
In both instances, it's XML-related PHP code that silently and
completely drops HTML entities from XML code.
In one instance, it's an RSS feed. "<content:encoded>&lt;p&gt;Lorem..."
becomes "<content:encoded>pLorem..."
The (newly) offending code uses the xml_parse_into_struct()
function.
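A small self-contained check of the xml_parse_into_struct() path might look like the sketch below. The feed fragment and the helper name first_value() are made up for illustration and are not taken from the affected site. On an unaffected build the value should come back with its angle brackets intact.

```php
<?php
// Hypothetical helper: returns the character data of the first element
// as reported by xml_parse_into_struct(), or null if parsing fails.
function first_value($xml) {
    $parser = xml_parser_create();
    $values = array();
    $index  = array();
    $ok = xml_parse_into_struct($parser, $xml, $values, $index);
    if (!$ok || !isset($values[0]['value'])) {
        return null;
    }
    return $values[0]['value'];
}

// Made-up feed fragment with entity-escaped markup, as in the RSS case.
$xml = '<content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/">'
     . '&lt;p&gt;Lorem&lt;/p&gt;</content:encoded>';

var_dump(first_value($xml));
```

If the dumped string has lost its "<" and ">" characters, the entities are being dropped inside the parser rather than in the surrounding application code.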
In the other, it's a CDATA section of an XML-RPC ping. Same problem.
The entity-escaped tags are preserved, but without the surrounding lt
and gt entities, rendering the payload useless.
This code uses DOMDocument::loadXML() and DOMDocument::schemaValidate().
Searching a bit turned up the desiccated carcass of bug #35271, but
nothing recent that I could find.
Downgraded to PHP 5.2.9-r2. Same problem.
Reproduce code:
---------------
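<?php
// $params and xmlrpc_error() come from the surrounding XML-RPC server
// code, which is not shown here.
libxml_use_internal_errors(true);
$xdoc = new DOMDocument;
$xml = $params[1];
if (!$xml) {
    xmlrpc_error(10, "No payload detected.");
}
$xmlschema = 'payload2.xsd';
$xdoc->loadXML($xml);
if ($xdoc->schemaValidate($xmlschema)) {
    // ... code for a valid payload (omitted) ...
} else {
    // ... error handling (omitted) ...
}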
Expected result:
----------------
$xml (the payload from the incoming XML-RPC ping) is successfully
validated against the schema document, the schemaValidate if statement
is true, and the code inside it is executed.
Actual result:
--------------
Schema validation fails with "The document has no document element." A
dump of the payload reveals that lt and gt entities have been stripped
from the payload: tag attr="true"tag attr="10046"tag /tagtagTag
Contents/tagtag Tag Contents/tag/tag/tag //tag
The schemaValidate if statement is false, so the else code (omitted) is
executed, returning the aforementioned error to the RPC client.
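Because the script calls libxml_use_internal_errors(true), parse and validation errors are collected rather than printed. A sketch of how they could be dumped after a failed loadXML() or schemaValidate() call follows; the helper name collected_libxml_errors() and the sample input are illustrative only.

```php
<?php
// Hypothetical helper: formats the errors libxml has collected so far,
// then clears the internal error buffer.
function collected_libxml_errors() {
    $lines = array();
    foreach (libxml_get_errors() as $error) {
        $lines[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    return $lines;
}

libxml_use_internal_errors(true);
$doc = new DOMDocument();
// Deliberately broken input (no document element), for illustration only.
if (!$doc->loadXML('tag attr="true"')) {
    print_r(collected_libxml_errors());
}
```

Including these messages in the report would show whether the failure originates in loadXML() or only later in schemaValidate().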
------------------------------------------------------------------------