On Fri, Aug 30, 2013 at 7:33 PM, Joshua Cranmer 🐧 <pidgeo...@gmail.com> wrote:
> The problem I have with this approach is that it assumes that the page is
> authored by someone who definitively knows the charset, which is not a
> scenario which universally holds. Suppose you have a page that serves up the
> contents of a plain text file, so your source data has no indication of its
> charset. What charset should the page report? The choice is between guessing
> (presumably UTF-8) or saying nothing (which causes the browser to guess
> Windows-1252, generally).

Where did the text file come from? There's a source somewhere... And
these days that's hardly how people create content anyway.

And again, it has already been pointed out that we cannot scan the
entire byte stream, which would make the situation worse. (Since
text/plain uses the HTML parser, that applies there too, unless we make
an exception, I suppose, but what data supports that?)

-- 
http://annevankesteren.nl/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
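
To make the trade-off in this exchange concrete, here is a rough sketch of the "guess UTF-8 or say nothing" choice for a server that hands out a text file of unknown origin. It only inspects a bounded prefix, mirroring the point that the whole byte stream cannot be scanned. The names `PREFIX_LIMIT`, `guess_charset_label` and `content_type_for`, and the 1024-byte budget, are assumptions made for illustration; none of this is what any browser or server actually does.

```python
import codecs
from typing import Optional

# Hypothetical scan budget: an assumption for this sketch, not a value
# taken from the thread.
PREFIX_LIMIT = 1024


def guess_charset_label(data: bytes, limit: int = PREFIX_LIMIT) -> Optional[str]:
    """Return "utf-8" if the first `limit` bytes are valid UTF-8, else None."""
    prefix = data[:limit]
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # final=False tolerates a multi-byte sequence cut off by the prefix
        # boundary; final=True is used when the prefix is the whole input.
        decoder.decode(prefix, final=len(data) <= limit)
    except UnicodeDecodeError:
        return None  # say nothing; the client falls back to its own default
    return "utf-8"


def content_type_for(data: bytes) -> str:
    """Build a Content-Type value, adding a charset only when the guess holds."""
    label = guess_charset_label(data)
    if label is None:
        return "text/plain"
    return f"text/plain; charset={label}"
```

Note the weakness of any prefix-based guess: a file whose first `PREFIX_LIMIT` bytes are pure ASCII gets labelled UTF-8 even if a Windows-1252 byte shows up later, so the label can be wrong in exactly the cases where it matters.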