 ID:               50545
 Updated by:       rricha...@php.net
 Reported By:      aclark at wayfm dot com
-Status:           Open
+Status:           Feedback
 Bug Type:         DOM XML related
 Operating System: Gentoo Linux
 PHP Version:      5.2.12
 New Comment:
Try upgrading libxml. Versions 2.7.0 - 2.7.2 had broken some entity
handling, and this appears to be that.


Previous Comments:
------------------------------------------------------------------------

[2009-12-22 16:10:17] aclark at wayfm dot com

Compiled against libxml2-2.7.2-r2.

------------------------------------------------------------------------

[2009-12-22 09:11:11] j...@php.net

1. What libxml version is PHP compiled with? (check from phpinfo()..)
2. Your reproducing script has some problems since it's not possible to
copy'n'paste it and just run it..

------------------------------------------------------------------------

[2009-12-21 17:45:51] aclark at wayfm dot com

Description:
------------
After a recent PHP upgrade (to 5.2.11-r1), some existing code on a few
of my sites suddenly "broke." In both instances, it's XML-related PHP
code that silently and completely drops HTML entities from XML code.

In one instance, it's an RSS feed.
"<content:encoded>&lt;p&gt;Lorem..." becomes
"<content:encoded>pLorem..."
The (newly) offending code contains the xml_parse_into_struct function.

In the other, it's a CDATA section of an XML-RPC ping. Same problem. The
entity-escaped tags are preserved, but without the surrounding lt and gt
entities, rendering the payload useless. This code uses
DOMDocument::LoadXML and schemaValidate.

Searching a bit turned up the desiccated carcass of bug #35271, but
nothing recent that I could find.

Downgraded to PHP 5.2.9-r2. Same problem.

Reproduce code:
---------------
libxml_use_internal_errors(true);
$xdoc= new DomDocument;
$xml=$params[1];
if (!$xml) {
    xmlrpc_error(10, "No payload detected.");
}
$xmlschema='payload2.xsd';
$xdoc->LoadXML($xml);
if ($xdoc->schemaValidate($xmlschema)) {

Expected result:
----------------
$xml (payload from incoming XML-RPC ping) is successfully validated
against the schema doc, the schemaValidate if statement is true, and the
code inside is executed.

Actual result:
--------------
Schema validation fails with "The document has no document element."

A dump of the payload reveals that lt and gt entities have been stripped
from the payload:

tag attr="true"tag attr="10046"tag /tagtagTag Contents/tagtag Tag Contents/tag/tag/tag //tag

schemaValidate if statement is false, else code (omitted) is executed,
returning aforementioned error to RPC client.

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=50545&edit=1
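
For anyone checking whether their own build falls in the libxml range
named in the comment above, here is a minimal sketch. The helper names
(libxml_in_affected_range, document_text) and the sample payload are
hypothetical and not taken from the report. LIBXML_DOTTED_VERSION is the
version string PHP exposes for libxml and carries no distribution suffix
such as "-r2". Whether the round-trip check actually shows the entity
loss on an affected build is an assumption based on the description
above.

```php
<?php
// Hypothetical helper: true if $version lies in the 2.7.0 - 2.7.2 range
// that the comment above names as having broken entity handling.
function libxml_in_affected_range($version)
{
    return version_compare($version, '2.7.0', '>=')
        && version_compare($version, '2.7.2', '<=');
}

// Hypothetical helper: parse $xml with DOMDocument and return the text
// content of the document element, or null if parsing fails.
function document_text($xml)
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return null;
    }
    return $doc->documentElement->textContent;
}

echo 'libxml version: ', LIBXML_DOTTED_VERSION, "\n";
echo 'in affected range: ',
    libxml_in_affected_range(LIBXML_DOTTED_VERSION) ? 'yes' : 'no', "\n";

// Made-up sample payload with entity-escaped markup, similar in shape
// to the RSS example in the description.
$sample = '<item><content>&lt;p&gt;Lorem&lt;/p&gt;</content></item>';
$text = document_text($sample);

// With working entity handling the escaped tags come back as literal
// "<p>Lorem</p>"; "pLorem/p" would match the symptom described above.
echo $text === '<p>Lorem</p>'
    ? "entities preserved\n"
    : 'entities lost, got: ' . var_export($text, true) . "\n";
```

If the version check prints "yes", upgrading libxml2 and rebuilding PHP
against it is the step suggested in the comment above.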