 ID:               37154
 User updated by:  troelskn at gmail dot com
 Reported By:      troelskn at gmail dot com
 Status:           Bogus
 Bug Type:         DOM XML related
 Operating System: *
 PHP Version:      5.1.2
 New Comment:

Not true.

$mb_detect_charsets = "ASCII,UTF-8,ISO-8859-1";

$dom = new DOMDocument("1.0", "UTF-8");
$doc = $dom->appendChild($dom->createElement("document"));
$doc->appendChild($dom->createTextNode(utf8_encode("Iñtërnâtiônàlizætiøn")));
echo mb_detect_encoding($dom->saveXML(), $mb_detect_charsets) . "<br>";

$dom = new DOMDocument("1.0", "ISO-8859-1");
$doc = $dom->appendChild($dom->createElement("document"));
$doc->appendChild($dom->createTextNode(utf8_encode("Iñtërnâtiônàlizætiøn")));
echo mb_detect_encoding($dom->saveXML(), $mb_detect_charsets) . "<br>";

-------------------------------------------------------
outputs :
UTF-8
ISO-8859-1
-------------------------------------------------------

Removing utf8_encode crashes the second example.


Previous Comments:
------------------------------------------------------------------------

[2006-04-21 15:02:53] [EMAIL PROTECTED]

Wrong. The *default* input encoding is UTF-8. But you can always use
<?xml version="1.0" encoding="<your encoding>"?>.
All the result data is in UTF-8 anyway; this is a libxml2 feature.

------------------------------------------------------------------------

[2006-04-21 14:37:34] troelskn at gmail dot com

Description:
------------
After some digging around and experimentation, I have found out that
the DOM extension needs all input strings to be UTF-8 encoded. This means
that any code using the extension must be sprinkled with utf8_encode.
The problem can probably not be fixed without breaking backward
compatibility, so the sanest choice may be to leave it, but at least
update the documentation to state this.

------------------------------------------------------------------------
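
The behaviour argued over in this thread can be checked with a small sketch. It is not taken from the report: the helper name serialize_with_encoding is hypothetical, and the test string is written with byte escapes so the result does not depend on how the source file itself is saved. It assumes a libxml2-backed DOM extension, where strings passed to DOM methods are treated as UTF-8 and the encoding given to the DOMDocument constructor only affects how saveXML() serializes the document.

```php
<?php
// UTF-8 bytes for "Iñtërnâtiônàlizætiøn", spelled out with hex escapes
// so the example is independent of the source file's own encoding.
$utf8 = "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n";

// Hypothetical helper: build <document>text</document> and serialize it
// using the encoding declared in the DOMDocument constructor.
function serialize_with_encoding($text_utf8, $encoding)
{
    $dom  = new DOMDocument("1.0", $encoding);
    $root = $dom->appendChild($dom->createElement("document"));
    $root->appendChild($dom->createTextNode($text_utf8)); // input is UTF-8
    return $dom->saveXML();
}

$as_utf8   = serialize_with_encoding($utf8, "UTF-8");
$as_latin1 = serialize_with_encoding($utf8, "ISO-8859-1");

// Look for the two-byte UTF-8 sequence for "ñ" in the first output
// and for the single ISO-8859-1 byte for "ñ" in the second.
var_dump(strpos($as_utf8, "I\xC3\xB1t") !== false);
var_dump(strpos($as_latin1, "I\xF1t") !== false);
```

If the assumption above holds, both checks should report true: the same UTF-8 input goes in each time, and only the serialized output differs. This matches the UTF-8 / ISO-8859-1 result shown in the new comment.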