From:             kanea at free dot fr
Operating system: linux
PHP version:      5.2.12RC1
PHP Bug Type:     DOM XML related
Bug description:  internal encoding

Description:
------------
I have the same problem with pages from Wikipedia. It seems that loadHTML() works with ISO characters internally. This looks like the same bug as bug #32547.

Reproduce code:
---------------
This code works:

$url = "http://" . $lang . ".wikipedia.org/wiki/" . $article;
$this->dom = new DomDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadXML($str);
$this->contenu = $this->dom->saveXml();

This code does not work:

$url = "http://" . $lang . ".wikipedia.org/wiki/" . $article;
$this->dom = new DomDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadHtml($str);
$this->contenu = $this->dom->saveXml();

Expected result:
----------------
Code with UTF-8 encoded characters

Actual result:
--------------
Code with corrupted characters

-- 
Edit bug report at http://bugs.php.net/?id=50253&edit=1
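
A possible workaround for the loadHTML() behaviour described in this report, offered only as a sketch and not as the fix adopted for this bug: convert every non-ASCII character to a numeric character reference before parsing, so the parser only ever sees ASCII and its encoding guess no longer matters. The helper name loadUtf8Html is hypothetical, and the code assumes PHP 7 or later with the dom and mbstring extensions enabled and input that is valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the original report.
// Assumes $html is valid UTF-8 and that ext/dom and ext/mbstring are loaded.
function loadUtf8Html(string $html): DOMDocument
{
    // Rewrite every code point from U+0080 upward as a numeric character
    // reference (&#NNN;), leaving plain ASCII untouched.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    $ascii = mb_encode_numericentity($html, $map, 'UTF-8');

    $dom = new DOMDocument('1.0', 'UTF-8');

    // Collect parser warnings internally instead of emitting PHP warnings;
    // real-world pages are rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($ascii);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // loadHTML() replaces the document created by the constructor, so the
    // output encoding is set again here for later serialization.
    $dom->encoding = 'UTF-8';

    return $dom;
}
```

With this sketch, the failing case from the report would become $this->dom = loadUtf8Html($str); followed by the same saveXml() call. The trade-off is one extra pass over the input string, in exchange for not depending on how the HTML parser detects the encoding.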