 ID:               50253
 User updated by:  kanea at free dot fr
 Reported By:      kanea at free dot fr
 Status:           Open
 Bug Type:         DOM XML related
 Operating System: linux
-PHP Version:      5.2.12RC1
+PHP Version:      5.2.6-1+lenny3
 New Comment:

I cannot test on another system.


Previous Comments:
------------------------------------------------------------------------

[2009-11-20 23:01:27] kanea at free dot fr

Description:
------------
I have the same problem with pages from Wikipedia. It seems that
loadHTML() treats the input as ISO-encoded characters internally.
This looks like the same bug as bug #32547.

Reproduce code:
---------------
This code works:

$url = "http://".$lang.".wikipedia.org/wiki/".$article;
$this->dom = new DomDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadXML($str);
$this->contenu = $this->dom->saveXml();

This code does not work:

$url = "http://".$lang.".wikipedia.org/wiki/".$article;
$this->dom = new DomDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadHtml($str);
$this->contenu = $this->dom->saveXml();

Expected result:
----------------
Code with UTF-8 encoded characters

Actual result:
--------------
Code with bad characters

------------------------------------------------------------------------
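
A smaller test case, independent of Wikipedia, may help narrow this down. The sketch below is not part of the original report. It assumes the problem comes from how loadHTML() picks the input encoding, and it compares a UTF-8 document with no charset declaration against the same document carrying a Content-Type meta tag. The helper name firstParagraphText and both HTML strings are made up for illustration, and the result may depend on the libxml2 version PHP is built against.

```php
<?php
// "été" written as raw UTF-8 bytes, so the test does not depend on
// the encoding of this source file.
$word = "\xC3\xA9t\xC3\xA9";

// Hypothetical helper: parse an HTML string with loadHTML() and
// return the text content of its first <p> element.
function firstParagraphText($html)
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->loadHTML($html);
    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

// Same UTF-8 bytes, without any charset declaration.
$bare = "<html><head><title>t</title></head>"
      . "<body><p>" . $word . "</p></body></html>";

// Same document with an explicit charset hint in the head.
$hinted = '<html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . "<title>t</title></head>"
        . "<body><p>" . $word . "</p></body></html>";

$bareText   = firstParagraphText($bare);
$hintedText = firstParagraphText($hinted);

// Report whether each variant round-trips the UTF-8 text unchanged.
var_dump($bareText === $word);
var_dump($hintedText === $word);
```

If only the second comparison holds, that would support the guess that loadHTML() falls back to a single-byte encoding when the document does not declare one. It would not by itself explain the failure on pages that already declare UTF-8.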