James Kass wrote:

> In the event of a conflict between the HTTP header and the HTML meta
> tag, of course the browser should believe the HTML meta tag. After
> all, who knows better than the author the encoding used to construct
> the file?

Who knows better the encoding used to *send* the file? The last server
to touch it. It used to be common, the norm in fact, for Russian
servers to store files in various legacy encodings (KOI-8, 8859-5,
DOS-something, ...) and to serve them in some other encoding, after
transcoding on-the-fly based on the User-Agent. There were also
transcoding proxies for Asian character sets that one could use to
overcome the limitations of browsers of that era. These practices were
still around when the HTML 4 spec was released in 1997 and no doubt
contributed to getting things as they are.

> Where the server has performed a character set conversion
> upon request from a browser, then, as a part of the character set
> conversion process, the HTML meta tag needs to be re-written in case
> the page is archived by the visitor for later off-line viewing.

It takes large amounts of tricky code to reliably parse real-life HTML.
It is unreasonable to expect servers, which have no business parsing
HTML, to contain this code. Browsers have it and *they* should adjust
the meta tag when they do a "Save as..."

> If this were the case, we wouldn't be having this thread.

If servers would just shut up when they don't know (as required by the
HTML spec)....

-- 
François Yergeau
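
For illustration only, the precedence argued for above (the HTTP header wins, the meta tag is consulted only when the server says nothing) can be sketched roughly as follows. The function names are hypothetical, the regular expressions are deliberately naive and are no substitute for a real HTML parser, and the sketch assumes the first 1024 bytes of the body are ASCII-compatible.

```python
import re

# Hypothetical helpers for illustration; not taken from any browser or server.
_CHARSET_PARAM = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_TAG = re.compile(rb'<meta[^>]*>', re.IGNORECASE)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    match = _CHARSET_PARAM.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(body):
    """Naively scan the start of the document for a meta charset declaration."""
    # Assumption: the declaration sits in the first 1024 bytes and is ASCII-compatible.
    for tag in _META_TAG.findall(body[:1024]):
        match = _CHARSET_PARAM.search(tag.decode("ascii", "replace"))
        if match:
            return match.group(1).lower()
    return None


def effective_charset(content_type, body):
    """The HTTP header takes precedence; the meta tag is only a fallback."""
    return charset_from_header(content_type) or charset_from_meta(body)
```

With a header of `text/html; charset=windows-1251` and a meta tag that still says `koi8-r` (the transcoding-server case described above), `effective_charset` returns `windows-1251`. When the header carries no charset parameter, the meta tag's value is used instead.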

