ID:               49984
User updated by:  ppass at hotmail dot fr
Reported By:      ppass at hotmail dot fr
Status:           Bogus
Bug Type:         DOM XML related
Operating System: Linux ns1 2.6.28.4-rsbac
PHP Version:      5.2.11
New Comment:

Thank you for the details. I just filed a bug in their system.

Previous Comments:
------------------------------------------------------------------------

[2009-11-02 06:53:03] ras...@php.net

We didn't write the DOM implementation. We are simply using libxml2. Information on how to file a bug against libxml2 is here: http://xmlsoft.org/bugs.html

But I suspect they won't consider this a bug. Their relaxed HTML parser isn't a full HTML parser that knows about embedded script objects. This would only be a PHP bug if we were somehow calling libxml2 incorrectly and causing this, but that doesn't appear to be the case here.

------------------------------------------------------------------------

[2009-11-02 05:42:09] ppass at hotmail dot fr

Still no reaction to this bug. Maybe my previous title was too specific. More generally, it means that the DOM model in PHP is broken whenever a script tag contains other tags in its text. This is a serious bug that must be corrected as soon as possible; otherwise it is not possible to make reliable use of DOM.

------------------------------------------------------------------------

[2009-10-24 04:27:57] ppass at hotmail dot fr

Description:
------------
The script node's parent is a div. The script node has the text '</div>' inside its script. The DOM node returns only part of the script node's contents, as if the node had been mistakenly truncated on reaching the '</div>' text.

Reproduce code:
---------------
$html = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"><title>Title</title></head><body><div><script type="text/javascript" id="script1">function dummy { object.innerHTML="<div>text</div>"; } function dummy2 { alert("hello"); } </script> </div> </body> </html>';

$dom = new DOMDocument('1.0', 'UTF-8');
@$dom->loadHTML($html);
$script_node = $dom->getElementById('script1');
echo "<![CDATA[$script_node->nodeValue]]>";

Expected result:
----------------
function dummy { object.innerHTML="<div>text</div>"; } function dummy2 { alert("hello"); }

I expect to see the whole content of the script node.

Actual result:
--------------
function dummy { object.innerHTML="<div>text

The script node has been truncated.

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=49984&edit=1
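
A possible workaround, sketched here under assumptions rather than taken from the thread: the HTML 4.01 specification treats the first "</" inside a script element as the end of its content and suggests writing such sequences as "<\/" in JavaScript strings. The helper names below (escape_script_body, escape_script_end_tags) are hypothetical and not part of PHP or libxml2; the regular expression is a rough match for script elements, not a full HTML tokenizer; and the markup is adapted from the reproduce code above. Whether the unescaped input is actually truncated depends on the libxml2 version in use.

```php
<?php
// Hypothetical callback: escapes "</" inside one matched <script> body.
// $m[1] = opening tag, $m[2] = script body, $m[3] = closing tag.
function escape_script_body($m)
{
    // In a JavaScript string literal, "<\/div>" means the same as "</div>".
    return $m[1] . str_replace('</', '<\/', $m[2]) . $m[3];
}

// Hypothetical helper: applies the escape to every <script>...</script> pair.
function escape_script_end_tags($html)
{
    return preg_replace_callback(
        '~(<script\b[^>]*>)(.*?)(</script\s*>)~is',
        'escape_script_body',
        $html
    );
}

$html = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'
      . '<html><head><title>Title</title></head><body><div>'
      . '<script type="text/javascript" id="script1">'
      . 'function dummy() { object.innerHTML="<div>text</div>"; } '
      . 'function dummy2() { alert("hello"); }'
      . '</script></div></body></html>';

$dom = new DOMDocument('1.0', 'UTF-8');
@$dom->loadHTML(escape_script_end_tags($html));

$script_node = $dom->getElementById('script1');
if ($script_node !== null) {
    // The text now contains "<\/div>" where the source had "</div>".
    echo $script_node->nodeValue, "\n";
}
```

Note that the node value then carries the escaped form "<\/div>", which is equivalent as JavaScript but not byte-identical to the original script text.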