ID:               37154
 User updated by:  troelskn at gmail dot com
 Reported By:      troelskn at gmail dot com
 Status:           Bogus
 Bug Type:         DOM XML related
 Operating System: *
 PHP Version:      5.1.2
 New Comment:

Not true.

// Candidate encodings for mb_detect_encoding(), tried in this order.
$mb_detect_charsets = "ASCII,UTF-8,ISO-8859-1";

// Case 1: document declared as UTF-8.
$dom = new DOMDocument("1.0", "UTF-8");
$doc = $dom->appendChild($dom->createElement("document"));
$doc->appendChild($dom->createTextNode(utf8_encode("Iñtërnâtiônàlizætiøn")));

echo mb_detect_encoding($dom->saveXML(), $mb_detect_charsets) . "<br>";

// Case 2: document declared as ISO-8859-1; the text node is still passed in as UTF-8.
$dom = new DOMDocument("1.0", "ISO-8859-1");
$doc = $dom->appendChild($dom->createElement("document"));
$doc->appendChild($dom->createTextNode(utf8_encode("Iñtërnâtiônàlizætiøn")));

echo mb_detect_encoding($dom->saveXML(), $mb_detect_charsets) . "<br>";

-------------------------------------------------------
outputs :
UTF-8
ISO-8859-1
-------------------------------------------------------
Removing utf8_encode crashes the second example.
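
A minimal sketch of the same point that does not depend on how the script
file itself is saved. The Latin-1 bytes are written as escapes, and
mb_convert_encoding() is swapped in for utf8_encode() (which is deprecated
as of PHP 8.2); both choices are mine, not part of the original test case.
It assumes the dom and mbstring extensions are loaded.

```php
<?php
// "Iñtër" as raw ISO-8859-1 bytes, so the file's own encoding is irrelevant.
$latin1 = "I\xF1t\xEBr";

// Convert to UTF-8 before handing the string to DOM.
// (Stand-in for utf8_encode(), which does the same Latin-1 -> UTF-8 step.)
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// The document is declared as ISO-8859-1, but its input must still be UTF-8.
$dom  = new DOMDocument('1.0', 'ISO-8859-1');
$root = $dom->appendChild($dom->createElement('document'));
$root->appendChild($dom->createTextNode($utf8));

// saveXML() serialises in the declared encoding, so the Latin-1 bytes reappear.
$xml = $dom->saveXML();
$containsLatin1 = strpos($xml, $latin1) !== false;
```

The declared encoding only affects what saveXML() writes out; the string
passed to createTextNode() is UTF-8 in both cases.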


Previous Comments:
------------------------------------------------------------------------

[2006-04-21 15:02:53] [EMAIL PROTECTED]

Wrong.
The *default* input encoding is UTF-8, but you can always declare another
one with <?xml version="1.0" encoding="<your encoding>"?>.
All the resulting data is in UTF-8 anyway; this is a libxml2 feature.
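
A short sketch of what this looks like from PHP, assuming the dom extension
is loaded. The sample document and variable names are made up for
illustration: the input declares ISO-8859-1 in its XML declaration, and the
values read back through the DOM API come out as UTF-8.

```php
<?php
// An ISO-8859-1 document: "café" with é as the single byte 0xE9.
$source = '<?xml version="1.0" encoding="ISO-8859-1"?>'
        . "<document>caf\xE9</document>";

$dom = new DOMDocument();
$dom->loadXML($source);

// Data read through the DOM API is UTF-8: é is now the two bytes 0xC3 0xA9.
$text = $dom->documentElement->textContent;

// The declared encoding is remembered and used again on output.
$declared = $dom->encoding;
$xml      = $dom->saveXML();
```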

------------------------------------------------------------------------

[2006-04-21 14:37:34] troelskn at gmail dot com

Description:
------------
After some digging around and experimentation, I have found out that
the DOM extension needs all input strings to be UTF-8 encoded. This
means that any code using the extension must be sprinkled with
utf8_encode.

The problem can probably not be fixed without breaking backward
compatibility, so the sanest choice may be to leave it as is, but at least
update the documentation to state this.
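
One way to keep the conversion in a single place rather than scattered
through the code is a small guard function. This is only a sketch:
ensure_utf8() is a hypothetical helper, not part of the DOM extension, the
ISO-8859-1 fallback is an assumption about where the data comes from, and
the type declarations need PHP 7 or later. It requires the mbstring
extension.

```php
<?php
// Hypothetical helper: pass valid UTF-8 through untouched, otherwise
// treat the bytes as $assumed (ISO-8859-1 by default) and convert them.
function ensure_utf8(string $s, string $assumed = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;
    }
    return mb_convert_encoding($s, 'UTF-8', $assumed);
}

$dom  = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('document'));

// Latin-1 input ("naïve" with ï as 0xEF) gets converted...
$root->appendChild($dom->createTextNode(ensure_utf8("na\xEFve")));
// ...while input that is already UTF-8 is left alone.
$root->appendChild($dom->createTextNode(ensure_utf8("na\xC3\xAFve")));

$result = $root->textContent;
```

The check is a heuristic: a Latin-1 string whose bytes happen to form valid
UTF-8 will be passed through unconverted, so knowing the real source
encoding is still the safer option.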



------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=37154&edit=1
