Philippe Verdy, Tue, 27 Nov 2012 21:07:31 +0100:

> Ahhhh ! I see now the problem: the XHTML file is being served as HTML
> instead of XHTML (but this is not invalid for XHTML 1).

Both SGML-based HTML4 and XML-based XHTML 1 operate with syntax rules that are not, and have never been, compatible with the way text/html operates. Thus, both HTML4 and XHTML1 permit syntaxes whose semantics are ignored when the document is parsed as HTML (as opposed to parsed as SGML or as XML). If you are interested in creating XHTML syntax that is compatible with HTML, then you should look at Polyglot Markup: http://www.w3.org/TR/html-polyglot/

> But anyway you're also right that the XML prolog found is NOT valid
> for HTML5 when the file is served as HTML instead of XHTML.

The fact that XHTML 1 permits the XML prolog regardless of how the document is served is just a shortcoming of the XHTML 1 specification.

> So these browsers must find
> something else: given the XML prolog they should then use HTML5 in
> its XHTML profile, not in its HTML profile

No, that is not how things work. The decision to parse the document as HTML is taken before the browser sees the XML prologue. So the prologue should not, and does not, change anything with regard to parsing as HTML or as XML.

> ; in this profile, they
> MUST honor the XML prolog and notably its XML encoding declaration
> (given that the encoding is not specified in the HTTP Content-type.

Again: Absolutely not. They must not, and will not, honour the XML prologue. (It is another matter that some user agents sometimes use the prologue to look for encoding information.)

> Now given the XML prolog and the DTD declaration, the file is clearly
> not even HTML5 in XML/XHTML (i.e. XHTML 5), but is XHTML 1 (based on
> a stable subset of HTML4, but working in strict mode without the
> quirks modes). Once again, this excludes using the HTML5 rules again.

In a way the names and the numbers (HTML4, XHTML1, HTML5) are just confusing. There is just one way to parse HTML. When it comes to HTML (text/html), HTML5 differs from HTML4 and XHTML1 in that it is not based on *another* format than HTML itself.

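As a rough illustration of the point that the choice between HTML and XML parsing is made from the Content-Type, before any markup is read, here is a minimal Python sketch. The names choose_parser and XML_MIME_TYPES are made up for this example, and real browsers do considerably more (sniffing, charset detection, and so on). It only shows that the document body, and therefore the XML prologue, never enters the decision.

```python
# Hypothetical sketch: pick a parser from the Content-Type header alone.
# The set below and the function name are assumptions for illustration.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def choose_parser(content_type: str) -> str:
    """Return "html", "xml" or "other" based only on the MIME type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "html"
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return "xml"
    return "other"


# The body starts with an XML prologue, but it is never consulted:
body = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<html>...</html>'
parser_for_html = choose_parser("text/html")
parser_for_xhtml = choose_parser("application/xhtml+xml; charset=utf-8")
```

With these inputs the same bytes would go to the HTML parser in the first case and to an XML parser in the second. Only the header differs.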
Because HTML4 and XHTML1 are not based on how HTML actually works and, in addition, do not take full account of that (or whatever the reason), they allow syntaxes, such as DTD declarations, which have no effect (except side-effects such as quirks-mode) in HTML.

> I'm still convinced that these are bugs in Firefox and IE, which
> support only HTML5 in its basic HTML profile, but not HTML5 in its
> XML/XHTML profile (which is also part of the HTML5 standard and where
> processing the XML prolog is NOT an option but a requirement).

Just for the record: HTML5 defines the most up-to-date parsing mechanism for *all* HTML documents: HTML 1, 2, 3, 5 as well as any flavour of XHTML served as HTML. HTML5 does not allow authors to use the XML prologue. So while XHTML1 allows you to use the prologue, the best description of how to parse anything that purports to be HTML, namely HTML5, does not require user agents/browsers to pay any attention to the prologue. Thus the correct one to blame for the fact that it doesn't work in Firefox seems to be the author. (Though we could also blame "the history of how HTML developed".)

--
leif halvard silli