On 17/12/2011 21:06, Michel Fortin wrote:
On 2011-12-17 13:09:35 +0000, Stewart Gordon <smjg_1...@yahoo.com> said:
<snip>
No, because in order to determine whether it's well-formed, one must know whether it's meant to be in SGML-based HTML, HTML5 or XHTML.

Perhaps it matters for validation if you don't say which spec to validate against, but validating against a spec doesn't always reflect reality either. There is no SGML-based-HTML-compliant parser used by a browser out there. Browsers have two parsers: one for HTML and one for XML (and sometimes the HTML parser behaves slightly differently in quirks mode, but that's not part of any spec).

But there is a subset of HTML that is likely to be parsed correctly by browsers' HTML parsers, and this subset is all the HTML you're likely to need to use most of the time. On the other hand, the interpretation of tag soup is undefined and liable to vary from browser to browser. So validation certainly helps you out here.

And whether a browser uses the HTML or the XML parser has nothing to do with the doctype at the top of the file: it depends on the MIME type given in the Content-Type HTTP header, or on the file extension if it is a local file. HTML5 doesn't change that.
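
To make that rule concrete, here is a rough Python sketch of the decision. It is not taken from any browser's source: the function name choose_parser, the list of XML MIME types and the file extensions are my own assumptions, and it simply treats anything that isn't an XML type as HTML.

```python
import os

# Hypothetical helper: these names and lists are illustrative assumptions,
# not code from any real browser.
XML_MIME_TYPES = ("text/xml", "application/xml")
XML_EXTENSIONS = (".xhtml", ".xht", ".xml")


def choose_parser(content_type=None, filename=None):
    """Return "xml" or "html". Note that the doctype is never consulted."""
    if content_type is not None:
        # Drop parameters such as "; charset=utf-8" and normalise the case.
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime in XML_MIME_TYPES or mime.endswith("+xml"):
            return "xml"
        # Simplification: every other type is handed to the HTML parser.
        return "html"
    # No HTTP header (local file): fall back to the file extension.
    ext = os.path.splitext(filename or "")[1].lower()
    return "xml" if ext in XML_EXTENSIONS else "html"


print(choose_parser("text/html; charset=utf-8"))
print(choose_parser("application/xhtml+xml"))
print(choose_parser(filename="page.xhtml"))
```

The point is that a page with an XHTML doctype served as text/html takes exactly the same path as any other text/html page.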

Almost all web pages declared as XHTML out there are actually parsed using the HTML parser because they are served with the text/html content type and not application/xhtml+xml. A lot of them are not well-formed XML and wouldn't be viewable anyway if parsed according to their doctype.
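
A small sketch of that difference, using only the Python standard library. The sample page and the helper names are made up for illustration. Python's html.parser is merely a lenient tokenizer, not a browser's HTML parser, so it only stands in for "a parser that tolerates tag soup".

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented example: declares an XHTML doctype but leaves <br> and <p> unclosed.
PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p>First paragraph<br>
<p>Second paragraph
</body></html>"""


def is_well_formed_xml(text):
    """True if a strict XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def html_start_tags(text):
    """Collect the start tags a lenient HTML tokenizer sees, in order."""
    tags = []

    class Collector(HTMLParser):
        def handle_starttag(self, tag, attrs):
            tags.append(tag)

    collector = Collector()
    collector.feed(text)
    collector.close()
    return tags


print(is_well_formed_xml(PAGE))  # the XML parser gives up on the unclosed tags
print(html_start_tags(PAGE))     # the HTML tokenizer carries on regardless
```

Served as application/xhtml+xml, a page like this would hit the first code path and fail; served as text/html, it goes down the second and displays.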

But does any pre-HTML5 spec stipulate that HTML parsers accept tag soup in the first place? ISTM this is all down to a tendency of browser/engine authors to implement fallback for malformed HTML but not for malformed XML.

Stewart.
