Disclaimer: Still not a WG response.
On Feb 16, 2008, at 14:25, Frank Ellermann wrote:
> Henri Sivonen wrote:
>> HTML5 parsing has no such thing as a valid DTD subset.
>
> <sigh /> If it cannot parse valid XHTML 1 it's fine,
> just don't offer the option, or give up with a clear
> error message when you "see" a DTD subset or anything
> else that won't fit into your model, valid or not.

Like I said before, that was how Validator.nu used to work and a
change to the old behavior was requested. I cannot comply with
everyone's suggestions at the same time when mutually exclusive
behaviors are suggested. I have chosen not to comply with yours on
this point.

> But this doesn't affect you or
> other validators, what they should do is answer the
> simple question:
>
> Is document X valid HTML / HTML5 / XHTML ?
>
> For any given X, independent of how you get it, HTTP,
> upload, FTP, pigeon carrier, gopher, form input, ...

Validator.nu checks the combination of the protocol entity body and
the Content-Type header. Pretending that Content-Type didn't matter
wouldn't make sense when it does make a difference in terms of
processing in a browser.

> OTOH what you got as X, however you got it, *is* X,
> the valid or invalid input for validation. What HTTP
> servers claim is at best *optional* additional info
> for the task to validate X.

Content-Type is acted on by browsers when provided, so even if
supplying it were optional, looking at it once supplied isn't.

> If folks actually want to check X'' = X + HTTP header
> or X''' = X + charset or doctype overrides offer this
> as option. As you already do it for X''' but not X''.

I also provide the lax type option to override the MIME type (albeit
in a limited way to prevent Validator.nu loading images, movies,
etc.). Respecting Content-Type is the default, though.
The main reason for adding the character encoding override was
supporting the form-based file upload case, but I opted not to hide
the UI in other cases.
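
To make the default and the two overrides concrete, here is a minimal Python sketch. It is not Validator.nu's code. The function names, the `type_override` and `charset_override` parameters, and the two MIME type sets are all hypothetical, and restricting the type override to known document types is only my guess at what a "limited" override could look like.

```python
from email.message import Message
from typing import Optional, Tuple

# Hypothetical tables for illustration; not Validator.nu's actual lists.
HTML_TYPES = {"text/html"}
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parse_content_type(header: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header value into (mime_type, charset)."""
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_type(), msg.get_content_charset()


def choose_parser(
    content_type_header: str,
    type_override: Optional[str] = None,
    charset_override: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Pick a parser family and charset for a response.

    The Content-Type header is respected by default. The overrides
    are opt-in, and the type override is only honored for known
    document types (an assumption made for this sketch).
    """
    mime_type, charset = parse_content_type(content_type_header)

    if type_override is not None:
        if type_override not in HTML_TYPES | XML_TYPES:
            raise ValueError("override not allowed: " + type_override)
        mime_type = type_override

    if charset_override is not None:
        charset = charset_override

    if mime_type in HTML_TYPES:
        return "html", charset
    if mime_type in XML_TYPES or mime_type.endswith("+xml"):
        return "xml", charset
    # Images, movies, etc. are refused rather than loaded.
    raise ValueError("not a document type: " + mime_type)
```

With this sketch, `choose_parser("text/html; charset=utf-8")` selects the HTML parser, while the same bytes served as `application/xhtml+xml` go to the XML parser, which is the difference a browser would make as well.
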

>> Making the references to a misconfigured server is
>> under your control.
>
> Yeah, I could use form input or upload instead of a
> HTTP URL, or maybe set up a decent gopher server and
> let your validator tackle this.

What are you trying to achieve? Are you trying to check that your Web
content doesn't have obvious technical problems? If you are, surely it
would be less useful if the validator pretended that Content-Type
didn't matter to parser choice when it does matter in browsers. Or are
you just trying to game a tool to say that your page is valid while
insisting on doing stuff that is practically problematic? If so,
what's the point?

> If that is your idea of usability we are wasting time,
> as I can simply use validators doing what I want, i.e.
> check X, neither X' nor X'', and typically not X'''.

In order to assess whether doing what you want is a waste of time, I'd
like to know what objective you have in mind in the use case sense.
Why are you validating pages?
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/