James Kass wrote:
> In the event of a conflict between the HTTP header and the HTML meta
> tag, of course the browser should believe the HTML meta tag.  After
> all, who knows better than the author the encoding used to construct
> the file?

Who knows better the encoding used to *send* the file?  The last server to
touch it.
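
To make the precedence concrete: HTML 4 ranks the charset parameter of the HTTP Content-Type header ahead of any META declaration in the document. Below is a minimal Python sketch of that rule. The function names and the deliberately naive META pattern are assumptions made for illustration, not code from any real browser.

```python
import re
from email.message import Message

# Naive pattern: looks for charset=... inside a <meta ...> tag.
# Real-life HTML needs a far more tolerant parser than this.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(body):
    """Return the charset declared in a META tag near the top, or None."""
    match = META_CHARSET.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, body):
    """The HTTP header wins; the META tag is consulted only if the header is silent."""
    return header_charset(content_type) or meta_charset(body)
```

With a header of `text/html; charset=windows-1251` and a META tag claiming `koi8-r`, `effective_charset` returns the header's value; the META value is used only when the header carries no charset at all.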

It used to be common, the norm in fact, for Russian servers to store files
in various legacy encodings (KOI-8, 8859-5, DOS-something,...) and to serve
them in some other encoding, after transcoding on-the-fly based on the
User-Agent.  There were also transcoding proxies for Asian character sets
that one could use to overcome the limitations of browsers of that era.
These practices were still around when the HTML 4 spec was released in 1997
and no doubt contributed to the rules being what they are.
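
A rough Python sketch of that kind of on-the-fly transcoding follows. The User-Agent markers, the table of preferred encodings, and the `serve` function are all hypothetical; actual servers of the period used their own rules.

```python
# Assumption: the file sits on disk in KOI8-R.
STORED_ENCODING = "koi8-r"

# Hypothetical User-Agent substrings mapped to the encoding each client is
# assumed to handle best.
PREFERRED_BY_AGENT = {
    "Windows": "windows-1251",
    "X11": "iso-8859-5",
}


def serve(stored_bytes, user_agent):
    """Return (Content-Type value, body) after transcoding for this client."""
    target = STORED_ENCODING
    for marker, encoding in PREFERRED_BY_AGENT.items():
        if marker in user_agent:
            target = encoding
            break
    text = stored_bytes.decode(STORED_ENCODING)
    # Only the bytes change here. Any META tag inside the document still
    # names the stored encoding, so the header is the only reliable label.
    body = text.encode(target)
    return "text/html; charset=" + target, body
```

The server never looks inside the document: it relabels the bytes in the header and leaves whatever the author wrote in the META tag untouched, which is exactly where the two declarations come to disagree.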

>  Where the server has performed a character set conversion
> upon request from a browser, then, as a part of the character set 
> conversion process, the HTML meta tag needs to be re-written in case
> the page is archived by the visitor for later off-line viewing.

It takes large amounts of tricky code to reliably parse real-life HTML.  It
is unreasonable to expect servers, which have no business parsing HTML, to
contain this code.  Browsers have it and *they* should adjust the meta tag
when they do a "Save as..."
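
As a toy illustration of the browser-side fix, here is a Python sketch of a "Save as..." step that re-encodes the page and rewrites the META charset to match. The `save_as` name, the UTF-8 default, and the regex are assumptions for illustration; the regex handles only the simplest markup and is no substitute for the tolerant parser a browser already has.

```python
import re

# Naive pattern: captures everything up to the charset value in the first
# <meta ...> tag that has one, then the value itself.
META_CHARSET = re.compile(
    r"(<meta\b[^>]*\bcharset\s*=\s*[\"']?)([\w.:-]+)", re.IGNORECASE
)


def save_as(body, served_charset, save_charset="utf-8"):
    """Decode using the charset the page was served in, then write it out in
    save_charset with the META declaration rewritten to agree."""
    text = body.decode(served_charset)
    text = META_CHARSET.sub(lambda m: m.group(1) + save_charset, text, count=1)
    return text.encode(save_charset)
```

The browser already knows which charset it used to decode the page, so it is the one party that can make the saved file's label and bytes agree.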

> If this were the case, we wouldn't be having this thread.

If servers would just shut up when they don't know (as required by the HTML
spec)....

-- 
François Yergeau
