 ID:               41339
 User updated by:  rasch at raschnet dot com
 Reported By:      rasch at raschnet dot com
-Status:           Bogus
+Status:           Open
 Bug Type:         DOM XML related
 Operating System: Ubuntu Linux
 PHP Version:      5.2.2
 New Comment:

I've decided to have one more go at this bug submission. As a bit of evidence for this bug's validity, I offer that the HTML which causes the DOMDocument class to return no results does in fact validate in the W3C validator. Either way, DOMDocument->saveHTML() should not return an empty string.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8, text/html; charset=utf-8"><title>Foo</title></head>
<body><p>Hello</p></body>
</html>

Thanks!
David


Previous Comments:
------------------------------------------------------------------------

[2007-05-11 01:00:54] rasch at raschnet dot com

If you can, please take another look at this. I think parsing the HTML would be above and beyond the bug here. In fact, the parser _is_ parsing some of the HTML, in order to get the charset out of the content-type meta tag. Unfortunately, it seems that if the content-type isn't in the expected format, it returns nothing. It does not return the ill-formed HTML back; it returns nothing at all.

If one alters the content-type meta tag to include just one content-type value, it will happily return the HTML.

------------------------------------------------------------------------

[2007-05-09 23:40:56] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.php.net/manual/ and the instructions on how to report a bug at http://bugs.php.net/how-to-report.php

The parser provided by libXML is not an HTML tag validator; it only cares about the syntax of tags being valid.

------------------------------------------------------------------------

[2007-05-09 15:58:57] rasch at raschnet dot com

Description:
------------
While using symfony, our code was mistakenly producing a meta tag with two content types. From what I understand this is not invalid, but whichever way PHP comes down on that, the DOM parser should return an error.
The current behavior is that PHP returns an empty string when calling '$dom->saveHTML()' in the code sample below.

Reproduce code:
---------------
$dom = new DomDocument("1.0", "utf-8");
$val = $dom->loadHTML('
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8, text/html; charset=utf-8">
</head>
<body>Hello</body></html>');
var_dump($val);
print $dom->saveHTML();
print "\n^^^ empty string\n";

Expected result:
----------------
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8, text/html; charset=utf-8">
</head>
<body><p>Hello</p></body></html>

Actual result:
--------------
bool(true)
// ^^^ empty string

------------------------------------------------------------------------
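
Diagnostic sketch:
------------------
The report describes loadHTML() returning true while saveHTML() prints nothing. One way to see whether libxml raised anything along the way is to collect its diagnostics explicitly. The following is a minimal sketch, assuming the libxml error functions that ship with PHP 5.1 and later (libxml_use_internal_errors, libxml_get_errors, libxml_clear_errors). The helper name render_or_report is hypothetical; it is not part of PHP or of the reproduce code, and whether it reports anything for this markup depends on the PHP and libxml versions in use.

```php
<?php
// Hypothetical helper (not part of PHP or of the original report):
// loads HTML, collects libxml diagnostics, and flags empty output.
function render_or_report($html)
{
    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'utf-8');
    $loaded = $dom->loadHTML($html);
    $out = $dom->saveHTML();

    // Gather whatever libxml recorded during load and save.
    $errors = array();
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array(
        'loaded' => $loaded,
        'html'   => $out,
        'errors' => $errors,
        // saveHTML() may return false or an empty string on failure.
        'empty'  => ($out === false || trim($out) === ''),
    );
}

// Markup taken from the reproduce code in the report.
$result = render_or_report(
    '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8, text/html; charset=utf-8">'
    . '</head><body>Hello</body></html>'
);

if ($result['loaded'] && $result['empty']) {
    print "loadHTML() succeeded but saveHTML() produced no output\n";
}
foreach ($result['errors'] as $message) {
    print "libxml: $message\n";
}
```

If the errors list stays empty while the output is empty too, that supports the point of the report: the failure is silent, and the caller has no way to tell what went wrong.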