ID: 50253
User updated by: kanea at free dot fr
Reported By: kanea at free dot fr
Status: Open
Bug Type: DOM XML related
Operating System: linux
-PHP Version: 5.2.12RC1
+PHP Version: 5.2.6-1+lenny3
New Comment:
I cannot test on another system.
Previous Comments:
------------------------------------------------------------------------
[2009-11-20 23:01:27] kanea at free dot fr
Description:
------------
I have the same problem with pages from Wikipedia.
It seems that loadHTML() works with ISO characters internally.
This looks like the same bug as bug #32547.
Reproduce code:
---------------
This code works:
$url = "http://" . $lang . ".wikipedia.org/wiki/" . $article;
$this->dom = new DOMDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadXML($str);
$this->contenu = $this->dom->saveXML();
This code does not work:
$url = "http://" . $lang . ".wikipedia.org/wiki/" . $article;
$this->dom = new DOMDocument('1.0', 'UTF-8');
$str = file_get_contents($url);
$this->dom->loadHTML($str);
$this->contenu = $this->dom->saveXML();
It seems that loadHTML() works with ISO characters internally.
Expected result:
----------------
Output with correctly encoded UTF-8 characters
Actual result:
--------------
Output with badly encoded (garbled) characters
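A smaller test case without the network fetch might look like the sketch below. It assumes, without confirmation in this report, that loadHTML() falls back to an ISO-8859-1 style encoding when the markup carries no charset declaration, and that a Content-Type meta tag makes it read the input as UTF-8. The sample string and the helper name paragraphText() are made up for illustration.

```php
<?php
// Hypothetical stand-in for the fetched page: UTF-8 bytes for "café",
// with no charset declaration anywhere in the markup.
$body = "<body><p>caf\xC3\xA9</p></body>";
$plain = "<html>" . $body . "</html>";

// Same markup, but with an explicit charset declaration in the head.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">';
$declared = "<html><head>" . $meta . "</head>" . $body . "</html>";

// Helper (illustrative name): parse with loadHTML() and return the
// text of the first <p> element.
function paragraphText($markup)
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    // Parser warnings are not the point here, so silence them.
    @$dom->loadHTML($markup);
    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

$withoutMeta = paragraphText($plain);
$withMeta = paragraphText($declared);

// Dump the raw bytes so the difference, if any, is visible on any terminal.
echo "without meta: " . bin2hex($withoutMeta) . "\n";
echo "with meta:    " . bin2hex($withMeta) . "\n";
```

If the two hex dumps differ, the problem is in how loadHTML() guesses the input encoding, not in saveXML() or in the constructor arguments.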
------------------------------------------------------------------------