From:             bart at mediawave dot nl
Operating system: WinXP
PHP version:      5.0.0RC2
PHP Bug Type:     DOM XML related
Bug description:  PHP PI problem with dom->loadHTML

Description:
------------
When an HTML 4.01 string that passes the W3C validator is loaded with
DOMDocument->loadHTML(), the parser mishandles PHP processing
instructions (<?php ... ?>): it emits a warning and drops the instruction
from the resulting document.

The HTML string is valid HTML 4.01 Transitional:

http://validator.w3.org/check?uri=http%3A%2F%2Fwww.mediawave.nl%2Fhtmlfile.htm&charset=%28detect+automatically%29&doctype=%28detect+automatically%29&ss=1&verbose=1

Reproduce code:
---------------
<?php

$html = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body>
<p><?php echo "hello? world? are you there? Can you see me? :(" ?></p>
</body>
</html>';

// Parse the string with the HTML parser, then serialize it back out.
$dom = new DOMDocument();
$dom->loadHTML($html);
echo '<pre>', htmlspecialchars($dom->saveHTML()), '</pre>';

?>

Expected result:
----------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body><p><?php echo "hello? world? are you there? Can you see me? :(" ?></p></body>
</html>

Actual result:
--------------
Warning: DOMDocument::loadHTML() [function.loadHTML]: htmlParseStartTag: invalid element name in Entity, line: 9 in D:\Inetpub\wwwroot\test2.php on line 24

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body><p></p></body>
</html>
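
Possible workaround:
--------------------
A possible stopgap, not a fix for the parser itself: hide each PHP block
behind an HTML comment placeholder before calling loadHTML(), then put the
original text back after saveHTML(). The sketch below is only an
illustration under several assumptions. The helper names protectPhpBlocks()
and restorePhpBlocks() and the "php-block-N" placeholder format are
hypothetical, the code assumes PHP 7 or later rather than the 5.0.0RC2 build
reported here, it relies on the HTML parser keeping comments intact, and the
simple regex will cut a block short if the PHP code itself contains the
closing-tag sequence inside a string.

```php
<?php

// Hypothetical helper: replace every PHP block in $html with a numbered
// HTML comment and collect the original blocks in $blocks.
function protectPhpBlocks(string $html, array &$blocks)
{
    $blocks = [];
    return preg_replace_callback(
        '/<\?php.*?\?>/s',
        function ($m) use (&$blocks) {
            $blocks[] = $m[0];
            return '<!--php-block-' . (count($blocks) - 1) . '-->';
        },
        $html
    );
}

// Hypothetical helper: swap the numbered placeholder comments back for the
// original PHP blocks; unknown placeholders are left untouched.
function restorePhpBlocks(string $html, array $blocks)
{
    return preg_replace_callback(
        '/<!--php-block-(\d+)-->/',
        function ($m) use ($blocks) {
            return $blocks[(int) $m[1]] ?? $m[0];
        },
        $html
    );
}
```

With these helpers, the reproduce script would pass
protectPhpBlocks($html, $blocks) to loadHTML() and run the saveHTML() output
through restorePhpBlocks() before printing it.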

-- 
Edit bug report at http://bugs.php.net/?id=28628&edit=1