Re: xkcd: LTR

Leif Halvard Silli Tue, 27 Nov 2012 14:09:21 -0800

Philippe Verdy, Tue, 27 Nov 2012 21:07:31 +0100:
> Ahhhh ! I see now the problem: the XHTML file is being served as HTML 
> instead of XHTML (but this is not invalid for XHTML 1).


Both SGML-based HTML4 and XML-based XHTML 1 operate with syntax rules 
that are not - and have never been - compatible with the way text/html 
operates. Thus, both HTML4 and XHTML1 permit syntaxes whose semantics 
are ignored when the document is parsed as HTML (as opposed to parsed 
as SGML or as XML).

If you are interested in creating XHTML syntax that is compatible 
with HTML, then you should look at Polyglot Markup: 
http://www.w3.org/TR/html-polyglot/

> But anyway you're also right that the XML prolog found is NOT valid 
> for HTML5 when the file is served as HTML instead of XHTML.

The fact that XHTML 1 permits the XML prolog regardless of how the 
document is served is just a shortcoming of the XHTML 1 specification.

> So these browsers must find 
> something else: given the XML prolog they should then use HTML5 in 
> its XHTML profile, not in its HTML profile

No, that is not how things work. The decision to parse the document as 
HTML is taken before the browser sees the XML prologue. So the prologue 
should not - and does not - change anything with regard to parsing as 
HTML or as XML.
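
To illustrate the order of decisions, here is a minimal sketch in 
Python. It is not any browser's actual code: the function name 
choose_parser and the XML_TYPES set are made up for this example. The 
only point is that the choice is made from the Content-Type, and the 
document body (prologue included) is never consulted.

```python
# Hypothetical illustration only; real browsers are far more involved.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def choose_parser(content_type: str, body: bytes) -> str:
    """Pick a parser from the Content-Type header alone.

    `body` is accepted only to make the point that it is ignored:
    an XML prologue at the start of the document changes nothing.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type in XML_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "other"


# An XHTML 1 document, complete with XML prologue and DTD declaration.
xhtml1_doc = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    b'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"></html>\n'
)

served_as_html = choose_parser("text/html", xhtml1_doc)
served_as_xhtml = choose_parser("application/xhtml+xml; charset=utf-8", xhtml1_doc)
```

The same bytes end up with the HTML parser in the first call and with 
the XML parser in the second, purely because of how they were served.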

> ; in this profile, they 
> MUST honor the XML prolog and notably its XML encoding declaration 
> (given that the encoding is not specified in the HTTP Content-type.

Again: Absolutely not. They must not, and will not, honour the XML 
prologue. (It is another matter that some user agents sometimes use 
the prologue to look for encoding information.)

> Now given the XML prolog and the DTD declaration, the file is clearly 
> not even HTML5 in XML/XHTML (i.e. XHTML 5), but is XHTML 1 (based on 
> a stable subset of HTML4, but working in strict mode without the 
> quirks modes). Once again, this excludes using the HTML5 rules again.

In a way the names and the numbers (HTML4, XHTML1, HTML5) are just 
confusing. There is just one way to parse HTML. When it comes to HTML 
(text/html), HTML5 differs from HTML4 and XHTML1 in that it is not 
based on *another* format than HTML itself. Because HTML4 and XHTML1 
are not based on how HTML actually works, and - in addition - do not 
take full account of that (or whatever the reason), they allow 
syntaxes, such as DTD declarations, which have no effect (except 
side-effects such as quirks-mode) in HTML.

> I'm still convinced that these are bugs in Firefox and IE, which 
> support only HTML5 in its basic HTML profile, but not HTML5 in its 
> XML/XHTML profile (which is also part of the HTML5 standard and where 
> processing the XML prolog is NOT an option but a requirement).

Just for the record: HTML5 defines the most up-to-date parsing 
mechanism for *all* HTML documents - HTML1,2,3,5 as well as any flavour 
of XHTML served as HTML. HTML5 does not allow authors to use the XML 
prologue. So while XHTML1 allows you to use the prologue, the best 
description of how to parse anything that purports to be HTML - HTML5 
- does not require user agents/browsers to pay any attention to the 
prologue. Thus the one to blame for the fact that it doesn't work in 
Firefox seems to be the author. (Though we could also blame the 
history of how HTML developed.)
-- 
leif halvard silli
