Edit report at http://bugs.php.net/bug.php?id=51484&edit=1

 ID:               51484
 Updated by:       il...@php.net
 Reported by:      ifland at gmail dot com
 Summary:          '--' incorrectly allowed inside comments
-Status:           Open
+Status:           Bogus
 Type:             Bug
 Package:          XML related
 Operating System: *
 PHP Version:      5.2.13

 New Comment:

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is handled by libxml2, not by PHP. Also, you are using loadHTML(),
which is designed to handle non-well-formed HTML.
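
If well-formed XML output is needed anyway, one possible workaround is to
rewrite the comment nodes before calling saveXML(). The following is only a
sketch: sanitizeComments() is a hypothetical helper name, not part of PHP or
libxml2, and replacing "--" with "- -" is just one convention (the one shown
in the expected result of the report).

```php
<?php
// Hypothetical helper (not a PHP or libxml2 API): rewrites every comment
// node so that its text contains no "--" and does not end in "-".
function sanitizeComments(DOMDocument $doc)
{
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//comment()') as $comment) {
        $text = $comment->data;
        // Loop because a run such as "---" needs more than one pass.
        while (strpos($text, '--') !== false) {
            $text = str_replace('--', '- -', $text);
        }
        // A trailing "-" would serialize as "--->", which XML also forbids.
        if (substr($text, -1) === '-') {
            $text .= ' ';
        }
        $comment->data = $text;
    }
}

libxml_use_internal_errors(true); // keep HTML parser warnings out of the output

$doc = new DOMDocument();
$doc->loadHTML("<html><body><!--comment <!--sketchy commented comment--></body>");
sanitizeComments($doc);

$xml = $doc->saveXML();
echo $xml;
```

The change is applied to the DOM tree rather than to the serialized string,
so markup outside comments is not touched.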


Previous Comments:
------------------------------------------------------------------------
[2010-04-05 23:48:17] ifland at gmail dot com

Description:
------------
According to the XML spec (see
http://www.w3.org/TR/2008/REC-xml-20081126/#sec-comments ), comments in
XML are not allowed to contain two hyphens in a row, which can
occasionally surface when processing poorly-formed HTML documents as
input.
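
The rule can be checked against the strict XML parser. This is a minimal
sketch, assuming a libxml2 build that enforces the comment rule; the element
name "root" and the comment text are made up for illustration.

```php
<?php
libxml_use_internal_errors(true); // collect parser errors instead of emitting warnings

$doc = new DOMDocument();
// The comment below contains "--" before its closing "-->".
$ok = $doc->loadXML('<root><!-- a -- b --></root>');

var_dump($ok);
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
```

Unlike loadHTML(), loadXML() is expected to refuse such input, so the problem
only shows up when HTML input is re-serialized as XML.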



The spec gives no suggestion for how to deal with this situation. We
can't turn the hyphens into entities (entity references aren't recognized
inside comments either), and Firefox and possibly other browsers will
fail to parse XML documents that contain the double hyphen.





Test script:
---------------
<?php
$doc = new DOMDocument();
$doc->loadHTML("<html><body><!--comment <!--sketchy commented comment--></body>");
header("Content-type: text/plain");
echo $doc->saveXML();
?>

Expected result:
----------------
Either a catchable error or something like this:



<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><!--comment <!- -commented comment--></body></html>



Actual result:
--------------
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><!--comment <!--commented comment--></body></html>




------------------------------------------------------------------------


