From:             phpbugs at lange dot nom dot fr
Operating system: GNU/Linux, Fedora Release 9
PHP version:      5.2.6
PHP Bug Type:     XML related
Bug description:  xml_parse / xml_set_character_data_handler doesn't resolve simple entities

Description:
------------
It may be a misunderstanding on my part, but xml_parse doesn't seem to resolve 'simple' entities like &lt;, &gt; or &amp;. According to the doc and one other bug report, I think it should, and other parsers (XMLReader, SimpleXML, DOM, ...) do. In numerical form (e.g. &#62;), it does work.

I tested with libxml2 versions 2.7.1 and 2.7.2, and expat 2.0.1.

I found this when doing SOAP calls: I couldn't pass the value "<" to a PHP SOAP server (which uses NuSOAP). I don't have a workaround at the moment.

Reproduce code:
---------------
// Handler called by xml_parse for each chunk of character data.
function character_data($parser, $data){
    print("character_data: $data\n");
}

// Parse the same string with xml_parse, then with XMLReader, for comparison.
function parsestring( $strg ) {
    $parser = xml_parser_create("UTF-8");
    xml_set_character_data_handler($parser,'character_data');
    xml_parse($parser,$strg,true);
    xml_parser_free($parser);

    $reader = new XMLReader();
    $reader->XML( $strg );
    while( $reader->read() ) {
        if( $reader->value )
            print( $reader->name . ": " . $reader->value . "\n");
    }
}

$xmlcdata   = "<element><![CDATA[A>B&C<D]]></element>";        // CDATA section
$xmlescaped = "<element>A&gt;B&amp;C&lt;D</element>";          // named entities
$xmlnumeric = "<element>A&#62;B&#38;C&#60;D</element>";        // numeric character references

print("parsing $xmlcdata \n");
parsestring( $xmlcdata );
print("\nparsing $xmlescaped \n");
parsestring( $xmlescaped );
print("\nparsing $xmlnumeric \n");
parsestring( $xmlnumeric );

Expected result:
----------------
parsing <element><![CDATA[A>B&C<D]]></element>
character_data: A>B&C<D
#cdata-section: A>B&C<D

parsing <element>A&gt;B&amp;C&lt;D</element>
character_data: A
character_data: >
character_data: B
character_data: &
character_data: C
character_data: <
character_data: D
#text: A>B&C<D

parsing <element>A&#62;B&#38;C&#60;D</element>
character_data: A
character_data: >
character_data: B
character_data: &
character_data: C
character_data: <
character_data: D
#text: A>B&C<D

Actual result:
--------------
parsing <element><![CDATA[A>B&C<D]]></element>
character_data: A>B&C<D
#cdata-section: A>B&C<D

parsing <element>A&gt;B&amp;C&lt;D</element>
character_data: A
character_data: B
character_data: C
character_data: D
#text: A>B&C<D

parsing <element>A&#62;B&#38;C&#60;D</element>
character_data: A
character_data: >
character_data: B
character_data: &
character_data: C
character_data: <
character_data: D
#text: A>B&C<D

--
Edit bug report at http://bugs.php.net/?id=46307&edit=1