On Wed, Jun 27, 2012 at 4:17 PM, Andres Perera <andre...@zoho.com> wrote:
...
> that page is encoded iso 8859-1, doesn't state so anywhere, breaks
> with browsers configured to default to utf8 in the absence of encoding
> qualifiers

Those browsers are violating the HTTP/1.1 standard.  RFC 2616, section
3.7.1, paragraph 4:

   The "charset" parameter is used with some media types to define the
   character set (section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP. Data in character sets other than "ISO-8859-1" or
   its subsets MUST be labeled with an appropriate charset value. See
   section 3.4.1 for compatibility problems.


And then there's section 3.4.1:

3.4.1 Missing Charset

   Some HTTP/1.0 software has interpreted a Content-Type header without
   charset parameter incorrectly to mean "recipient should guess."
   Senders wishing to defeat this behavior MAY include a charset
   parameter even when the charset is ISO-8859-1 and SHOULD do so when
   it is known that it will not confuse the recipient.

   Unfortunately, some older HTTP/1.0 clients did not deal properly with
   an explicit charset parameter. HTTP/1.1 recipients MUST respect the
   charset label provided by the sender; and those user agents that have
   a provision to "guess" a charset MUST use the charset from the
   content-type field if they support that charset, rather than the
   recipient's preference, when initially displaying a document. See
   section 3.7.1.
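
As a rough sketch of the precedence rule in that second paragraph (label first, recipient preference second), assuming a hypothetical helper initial_charset and using codecs.lookup as a stand-in for "the user agent supports that charset":

```python
import codecs


def initial_charset(labeled, fallback):
    """Pick the charset for initially displaying a document.

    Hypothetical helper. `labeled` is the charset from the Content-Type
    field (or None); `fallback` is whatever the recipient would otherwise
    use. What to do when there is no label at all is outside this
    paragraph, so the sketch simply returns the caller's fallback.
    """
    if labeled is not None:
        try:
            # Supported charset: the sender's label must be used.
            codecs.lookup(labeled)
            return labeled
        except LookupError:
            # Unsupported charset: the label cannot be honored.
            pass
    return fallback
```

So a recipient preferring UTF-8 would still start with ISO-8859-1 when the sender labels the document that way.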


Wait, was that a warning that an explicit charset parameter broke some
older browsers?  Huh...


Philip Guenther
