Edit report at https://bugs.php.net/bug.php?id=54735&edit=1

 ID:                 54735
 Comment by:         peter dot e dot lind at gmail dot com
 Reported by:        phpnet at phoen dot de
 Summary:            xml_parse_into_struct gives uncomplete array
 Status:             Open
 Type:               Bug
 Package:            *XML functions
 Operating System:   Windows XPSP2
 PHP Version:        5.3.6
 Block user comment: N
 Private report:     N

 New Comment:

This doesn't seem to be a bug on Linux (PHP 5.3.3-7+squeeze8 with
Suhosin-Patch (cli)): when I save the test script as UTF-8, the output is
as expected. However, if I save it as ISO-8859-15, the XML parsing stops,
as one would expect, when it hits the illegal character. An ISO-8859-15 €
is illegal in UTF-8, and XML parsers must stop when they reach illegal
content.

The docs could be improved, though.
http://dk.php.net/manual/en/function.xml-parser-create.php specifies that
"The supported encodings are ISO-8859-1, UTF-8 and US-ASCII.", but in
context that appears to apply only to output. However, the behaviour
suggests that those three character sets also restrict input. Some
comments in the PHP code also suggest that this is the case.


Previous Comments:
------------------------------------------------------------------------
[2011-05-14 13:29:54] phpnet at phoen dot de

Description:
------------
Hi,

Configuration: Windows XP with XAMPP 1.7.3 (nothing special).

xml_parse_into_struct returns an incomplete array if the input contains an
unexpected character (maybe UTF-8). Everything after "€" is missing from
the array. No error or garbled character shows up. Setting parser options
doesn't help.

Surprisingly, $myxml=utf8_encode($myxml); and parsing afterwards fixes the
problem.

Thanks, H.

Test script:
---------------
<?php
$myxml="<XML><FELD1>blubb</FELD1><FELD2>dies hier sonderzeichen €</FELD2><FELD3>feld3</FELD3></XML>";
$p = xml_parser_create();
xml_parse_into_struct($p, $myxml, $vals, $index);
xml_parser_free($p);
print_r($vals);
?>

Expected result:
----------------
Something like this:

Array
(
    [0] => Array ( [tag] => XML [type] => open [level] => 1 [value] => )
    [1] => Array ( [tag] => FELD1 [type] => complete [level] => 2 [value] => blubb )
    [2] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
    [3] => Array ( [tag] => FELD2 [type] => complete [level] => 2 [value] => dies hier sonderzeichen € )
    [4] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
    [5] => Array ( [tag] => FELD3 [type] => complete [level] => 2 [value] => feld3 )
)

Actual result:
--------------
Array
(
    [0] => Array ( [tag] => XML [type] => open [level] => 1 [value] => )
    [1] => Array ( [tag] => FELD1 [type] => complete [level] => 2 [value] => blubb )
    [2] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
    [3] => Array ( [tag] => FELD2 [type] => open [level] => 2 [value] => dies hier sonderzeichen )
)

------------------------------------------------------------------------
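
For anyone hitting the same symptom, here is a minimal sketch of the workaround discussed above, under stated assumptions. It checks the return value of xml_parse_into_struct, which is how the otherwise silent failure can be detected, and converts the input to UTF-8 before parsing. The helper name parse_xml_struct and the default source encoding ISO-8859-15 are assumptions made for illustration; they are not part of the report or of PHP itself. iconv is used instead of utf8_encode because utf8_encode treats its input as ISO-8859-1, where byte 0xA4 is the generic currency sign rather than the euro sign.

```php
<?php
// Hypothetical helper, not from the report: normalise the input to UTF-8,
// parse it, and return the parser's error message when parsing fails.
function parse_xml_struct($xml, $assumedEncoding = 'ISO-8859-15')
{
    // preg_match() with the /u modifier returns 1 only for valid UTF-8.
    if (preg_match('//u', $xml) !== 1) {
        // Assumption: anything that is not valid UTF-8 is in $assumedEncoding.
        $converted = iconv($assumedEncoding, 'UTF-8', $xml);
        if ($converted === false) {
            return array('ok' => false, 'error' => 'iconv conversion failed',
                         'vals' => array(), 'index' => array());
        }
        $xml = $converted;
    }

    $parser = xml_parser_create('UTF-8');
    $vals = array();
    $index = array();
    // xml_parse_into_struct() returns 0 on failure and 1 on success.
    $ok = xml_parse_into_struct($parser, $xml, $vals, $index) === 1;
    $error = $ok ? null : xml_error_string(xml_get_error_code($parser));
    // The parser is released when $parser goes out of scope on return.

    return array('ok' => $ok, 'error' => $error, 'vals' => $vals, 'index' => $index);
}

// "\xA4" is the euro sign in ISO-8859-15 and is not valid UTF-8 on its own.
$raw = "<XML><FELD1>blubb</FELD1><FELD2>dies hier sonderzeichen \xA4</FELD2><FELD3>feld3</FELD3></XML>";
$result = parse_xml_struct($raw);
if ($result['ok']) {
    print_r($result['vals']);
} else {
    echo 'XML error: ', $result['error'], "\n";
}
```

If the real source encoding is something else (for example Windows-1252 on the reporter's machine, which is only a guess), pass it as the second argument; the sketch does not try to detect it.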