ID:               50545
 Updated by:       rricha...@php.net
 Reported By:      aclark at wayfm dot com
-Status:           Open
+Status:           Feedback
 Bug Type:         DOM XML related
 Operating System: Gentoo Linux
 PHP Version:      5.2.12
 New Comment:

Try upgrading libxml2. Versions 2.7.0 through 2.7.2 broke some entity
handling, and this appears to be that problem.
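
A quick way to check whether a given build falls in that range is
sketched below. It assumes LIBXML_DOTTED_VERSION (the constant exposed by
PHP's libxml extension) reflects the libxml2 version PHP was built
against; the helper name is_affected_libxml() is made up for this
illustration.

```php
<?php
// Hypothetical helper: true when $version lies in the 2.7.0 - 2.7.2 range
// named in the comment above.
function is_affected_libxml($version)
{
    return version_compare($version, '2.7.0', '>=')
        && version_compare($version, '2.7.2', '<=');
}

// LIBXML_DOTTED_VERSION is assumed to be the build-time libxml2 version.
$current = LIBXML_DOTTED_VERSION;
echo 'libxml2 ', $current,
    is_affected_libxml($current) ? ' is inside' : ' is outside',
    " the 2.7.0 - 2.7.2 range\n";
```

Distribution revision suffixes such as Gentoo's "-r2" are not part of the
libxml2 version string and are not handled by this sketch.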


Previous Comments:
------------------------------------------------------------------------

[2009-12-22 16:10:17] aclark at wayfm dot com

Compiled against libxml2-2.7.2-r2.

------------------------------------------------------------------------

[2009-12-22 09:11:11] j...@php.net

1. Which libxml version is PHP compiled against? (Check phpinfo().)
2. Your reproducing script has some problems: it is not possible to
copy'n'paste it and just run it.

------------------------------------------------------------------------

[2009-12-21 17:45:51] aclark at wayfm dot com

Description:
------------
After a recent PHP upgrade (to 5.2.11-r1), some existing code on a few
of my sites suddenly "broke."

In both instances, it's XML-related PHP code that silently and
completely drops HTML entities from XML content.

In one instance, it's an RSS feed. "<content:encoded>&lt;p&gt;Lorem..."
becomes "<content:encoded>pLorem..."

The (newly) offending code calls the xml_parse_into_struct()
function.
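
A minimal, self-contained check for this symptom might look like the
sketch below. The sample feed fragment and the helper name
first_tag_text() are invented for illustration, and an unprefixed
<description> element stands in for <content:encoded> to keep namespace
handling out of the picture.

```php
<?php
// Hypothetical helper: return the text of the first <$tag> element as
// reported by xml_parse_into_struct(), or null if it is not found.
function first_tag_text($xml, $tag)
{
    $parser = xml_parser_create();
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    $values = array();
    $index = array();
    xml_parse_into_struct($parser, $xml, $values, $index);
    unset($parser);

    if (!isset($index[$tag][0])) {
        return null;
    }
    $entry = $values[$index[$tag][0]];
    return isset($entry['value']) ? $entry['value'] : null;
}

// Invented sample: entity-escaped markup inside an element.
$feed = '<item><description>&lt;p&gt;Lorem&lt;/p&gt;</description></item>';
$text = first_tag_text($feed, 'description');

// The decoded text should still contain the angle brackets; a result
// like "pLorem/p" would match the symptom described in this report.
echo $text, "\n";
```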


In the other, it's a CDATA section of an XML-RPC ping. Same problem.
The entity-escaped tags are preserved, but without the surrounding lt
and gt entities, rendering the payload useless.

This code uses DOMDocument::loadXML() and DOMDocument::schemaValidate().
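
To separate the DOM step from whatever decodes the XML-RPC request, a
standalone sketch along these lines could be run on its own. The schema
and payload are invented stand-ins (payload2.xsd is not shown in this
report), and schemaValidateSource() is used only so the schema can live
inline.

```php
<?php
libxml_use_internal_errors(true);

// Invented stand-in schema: a <payload> holding one string <body>.
$xsd = <<<'XSD'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="payload">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// Invented payload carrying entity-escaped markup as text.
$xml = '<payload><body>&lt;tag attr="true"&gt;Tag Contents&lt;/tag&gt;</body></payload>';

$doc = new DOMDocument();
$loaded = $doc->loadXML($xml);
$valid = $loaded && $doc->schemaValidateSource($xsd);
$body = $loaded ? $doc->getElementsByTagName('body')->item(0)->textContent : null;

echo 'valid: ', $valid ? 'yes' : 'no', "\n";
echo 'body:  ', $body, "\n";
```

If this validates and the body keeps its angle brackets, the entities are
more likely being lost before the payload reaches loadXML().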

Searching a bit turned up the desiccated carcass of bug #35271, but
nothing recent that I could find.

Downgraded to PHP 5.2.9-r2. Same problem.

Reproduce code:
---------------
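    // Collect libxml errors internally instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xdoc = new DOMDocument;
    $xml = $params[1];   // payload from the incoming XML-RPC ping
    if (!$xml) {
        xmlrpc_error(10, "No payload detected.");   // helper not shown in this report
    }

    $xmlschema = 'payload2.xsd';
    $xdoc->loadXML($xml);

    if ($xdoc->schemaValidate($xmlschema)) {
        // code for a valid payload (omitted)
    } else {
        // error handling (omitted)
    }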

Expected result:
----------------
$xml (the payload from the incoming XML-RPC ping) is successfully
validated against the schema document, the schemaValidate if statement is
true, and the code inside is executed.

Actual result:
--------------
Schema validation fails with "The document has no document element." A
dump of the payload reveals that lt and gt entities have been stripped
from the payload: tag attr="true"tag attr="10046"tag /tagtagTag
Contents/tagtag  Tag Contents/tag/tag/tag //tag

The schemaValidate if statement is false, so the else code (omitted) is
executed, returning the aforementioned error to the RPC client.
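
Since the reproduce code enables libxml_use_internal_errors(), the parse
errors from loadXML() can be listed to confirm that the stripped payload
no longer parses as XML at all. The sketch below assumes shortened sample
strings modelled on the dump above; xml_load_errors() is a made-up helper
name.

```php
<?php
// Hypothetical helper: load $xml with DOMDocument and return the libxml
// error messages collected during parsing.
function xml_load_errors($xml)
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

// Shortened samples: one with the angle brackets stripped, one intact.
$stripped = 'tag attr="true"Tag Contents/tag';
$intact = '<tag attr="true">Tag Contents</tag>';

foreach (xml_load_errors($stripped) as $message) {
    echo $message, "\n";
}
```

A failed loadXML() presumably leaves the document empty, which would be
consistent with the "no document element" message from schemaValidate.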


------------------------------------------------------------------------

