ID: 30692
Updated by: [EMAIL PROTECTED]
Reported By: chrivers at iversen-net dot dk
-Status: Open
+Status: Bogus
Bug Type: XML related
Operating System: Linux 2.6.5, Debian Sarge
PHP Version: 5.0.2
New Comment:
The behavior has changed, but it is not a bug: a SAX parser only fires
events, and it may deliver the character data of a single text node in
several chunks. This is normal behavior.
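Since the handler may be called more than once per text node, a common
workaround is to buffer the chunks and use the text only when the closing
tag arrives. The following is a minimal sketch under that assumption, not
an official recommendation. The handler names start_el, end_el and
char_data and the globals $text and $results are hypothetical, and the
single buffer only suits elements that contain text and no child elements.

```php
<?php
// Buffer for the character data of the element currently being parsed.
$text = '';
// Complete text of every element seen so far (illustration only).
$results = array();

function start_el($parser, $name, $attrs) {
    global $text;
    $text = '';              // reset the buffer at each opening tag
}

function end_el($parser, $name) {
    global $text, $results;
    $results[] = $text;      // the full text is only known at the closing tag
    print "[$text]\n";
}

function char_data($parser, $data) {
    global $text;
    $text .= $data;          // append each chunk; never assume a single call
}

// Same document as in the report, with the high-bit characters written
// as ISO-8859-1 byte escapes.
$buffer = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
        . "<global>ISO:\xE6\xF8\xE5:ISO</global>";

$parser = xml_parser_create();
xml_set_element_handler($parser, "start_el", "end_el");
xml_set_character_data_handler($parser, "char_data");
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, "iso-8859-1");

// The third argument marks this as the final chunk of input.
if (!xml_parse($parser, $buffer, true)) {
    die(xml_error_string(xml_get_error_code($parser)) . "\n");
}
```

With this pattern the script no longer depends on how the parser splits
the character data, so it should behave the same under PHP4 and PHP5.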
Previous Comments:
------------------------------------------------------------------------
[2004-11-05 15:24:57] chrivers at iversen-net dot dk
Description:
------------
When converting my pages to the PHP5 SAX XML parser, they
broke because of an apparent incompatibility. The
chardata handler is called in a different pattern than in
PHP4. Before, it seemed to be called once per character
block. Now the buffer seems to be flushed before each block
of high-bit characters. This is unexpected and
(seemingly?) impossible to change.
Reproduce code:
---------------
<?php
// Element handlers are intentionally empty (and not registered below).
function es() {}
function ee() {}
// Character data handler: prints every chunk it receives in brackets.
function cd($P, $D) {print "[$D]\n";}
# $str = "UTF:æøå:UTF"; $strenc = "utf-8";
// The high-bit characters æøå written as ISO-8859-1 byte escapes.
$str = "ISO:\xE6\xF8\xE5:ISO"; $strenc = "iso-8859-1";
$buffer = "<?xml version=\"1.0\" encoding=\"$strenc\"?><global>$str</global>";
$xml_parser = xml_parser_create();
# xml_set_element_handler($xml_parser, "es", "ee");
xml_set_character_data_handler($xml_parser, "cd");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "iso-8859-1");
if (xml_parse($xml_parser, $buffer) == false)
    die(sprintf("TV import error: %s at line %d col %d\n%s",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser),
        xml_get_current_column_number($xml_parser),
        $buffer));
xml_parser_free($xml_parser);
?>
Expected result:
----------------
expected: [ISO:æøå:ISO]
php4: [ISO:æøå:ISO]
Actual result:
--------------
[ISO:]
[æøå:ISO]
------------------------------------------------------------------------