On Fri, Jul 27, 2012 at 03:42:27PM -0700, [email protected] wrote:
> From: Conrad Irwin <[email protected]>
>
> Hi Xml,
>
> In HTML email it's common to find arbitrary fragments of HTML, the one
> that triggered this change was of the form:
>
> <meta><font></font><div>...
>
> Before this change the <font> tag was part of the implicit <head> that
> gets created for the <meta> tag, after this change, it is part of the
> <body>, which more closely matches the behaviour of modern HTML
> implementations.
>
> Is there a good reason that these tags didn't close the <head> tag
> before?

Well, it's a bit hard to tell; it could simply be that nobody thought
about such a scenario ... That's the problem with real-life HTML
parsing: you will end up with a <DOCTYPE> in the middle of the <body>
and with a <p> within the <head> ... And it's a complete pain to know
which strategy is best to adopt when dealing with such an error, other
than looking at what the various browsers seem to do under the hood and
trying to mimic it :-\

> I'm also not sure about applet/embed/object, so I've left them
> out of the list for now.

Yeah, until someone screams to have them in I would rather keep it
as-is.

> It might be better to move towards a more-HTML-5-based approach where
> any non-head-supported tag causes the <head> to be closed. See Section
> 12.2.5.4.4 The "in head" insertion mode. [1] But I'm not sure what the
> current plans are for HTML-5 in libxml2?

Yeah, at least that's one of the good points of HTML5: if it ends up as
a process, it would give a clear indication of how to handle the usual
mistakes. We discussed this before: I'm not too tempted to embed
external parser code directly within libxml2, but I would love to see
the existing parser either improved along the lines of HTML5 error
handling, or a new specific mode added to the HTML parser indicating
that it should follow the HTML5 rules. The problem is not willingness
to do this but the time needed to do so, and clearly I won't have time
for such an effort myself, at least in the short term.

So for the good part, I applied your patch, thanks a lot!

  http://git.gnome.org/browse/libxml2/commit/?id=b60061a7a59d1305824896172b705c31316bc761

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
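
To illustrate the HTML5-style rule discussed above (any start tag that
is not allowed in <head> closes the implicit <head>), here is a minimal
Python sketch. It is not libxml2 code and does not reflect the list of
tags the applied patch uses. The element set is an assumption based on
the "in head" insertion mode of the HTML5 parsing rules, and the helper
names closes_implicit_head and split_head_body are hypothetical. It also
ignores the rest of the tree-construction algorithm (end tags, nesting,
the "after head" mode).

```python
# Elements that the HTML5 "in head" insertion mode keeps inside <head>.
# Assumption: taken from the HTML5 parsing rules, not from libxml2.
HEAD_ELEMENTS = frozenset({
    "base", "basefont", "bgsound", "link", "meta", "noframes",
    "noscript", "script", "style", "template", "title",
})


def closes_implicit_head(tag):
    """Return True if this start tag should end an open <head>.

    Hypothetical helper: any tag outside HEAD_ELEMENTS closes the head.
    """
    return tag.lower() not in HEAD_ELEMENTS


def split_head_body(tags):
    """Split a flat sequence of start-tag names into (head, body).

    Tags go to the head until the first one that is not allowed there;
    that tag and every later tag go to the body.
    """
    head, body = [], []
    in_head = True
    for tag in tags:
        if in_head and closes_implicit_head(tag):
            in_head = False
        (head if in_head else body).append(tag)
    return head, body


# The fragment from the original report: <meta><font></font><div>...
print(split_head_body(["meta", "font", "div"]))
# prints (['meta'], ['font', 'div'])
```

With this rule the <font> from the reported fragment lands in the body,
which is the behaviour described for the patched parser; applet, embed
and object would also close the head here, whereas the patch
deliberately leaves them out.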
