Hi,

I use libxml2 to process HTML: I parse the document with htmlParseDocument, then apply some simple transformations (such as rewriting URIs to correct relative paths) and save the result using xmlSaveDoc(). The output is an HTML file that is passed to the web browser.
The problem is that when the input document has no DOCTYPE declaration, libxml2 adds a default one:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">

Web browsers render pages differently depending on which quirks mode the DOCTYPE declaration turns on or off. To illustrate the difference, there is a test page that shows the same HTML/CSS code with various DOCTYPEs prepended:

http://dbaron.org/mozilla/tests/compat?doctype=
http://dbaron.org/mozilla/tests/compat?doctype=%3C!DOCTYPE+HTML+PUBLIC+%22-%2F%2FW3C%2F%2FDTD+HTML+4.01+Transitional%2F%2FEN%22+%22http%3A%2F%2Fwww.w3.org%2FTR%2Fhtml4%2Floose.dtd%22%3E
http://dbaron.org/mozilla/tests/compat?doctype=%3C!DOCTYPE+HTML+PUBLIC+%22-%2F%2FW3C%2F%2FDTD+HTML+4.01+Transitional%2F%2FEN%22%3E
http://dbaron.org/mozilla/tests/compat?doctype=%3C!DOCTYPE+HTML%3E

In the cases I've seen, a page with no DOCTYPE is rendered the same way as one with the DOCTYPE that libxml2 prepends. Still, I would be happy if there were a way to avoid adding the default DOCTYPE, or at least to find out that the original document had no DOCTYPE at all. Is there a way to do that?

-- 
Damian Pietras
http://www.linuxprogrammingblog.com