On Wed, Jun 27, 2012 at 4:17 PM, Andres Perera <andre...@zoho.com> wrote:
...
> that page is encoded iso 8859-1, doesn't state so anywhere, breaks
> with browsers configured to default to utf8 in the absence of encoding
> qualifiers

Those browsers are violating the HTTP/1.1 standard.  RFC 2616, section
3.7.1, paragraph 4:

   The "charset" parameter is used with some media types to define the
   character set (section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP. Data in character sets other than "ISO-8859-1" or
   its subsets MUST be labeled with an appropriate charset value. See
   section 3.4.1 for compatibility problems.

And then there's section 3.4.1:

3.4.1 Missing Charset

   Some HTTP/1.0 software has interpreted a Content-Type header without
   charset parameter incorrectly to mean "recipient should guess."
   Senders wishing to defeat this behavior MAY include a charset
   parameter even when the charset is ISO-8859-1 and SHOULD do so when
   it is known that it will not confuse the recipient.

   Unfortunately, some older HTTP/1.0 clients did not deal properly with
   an explicit charset parameter. HTTP/1.1 recipients MUST respect the
   charset label provided by the sender; and those user agents that have
   a provision to "guess" a charset MUST use the charset from the
   content-type field if they support that charset, rather than the
   recipient's preference, when initially displaying a document. See
   section 3.7.1.

Wait, was that a warning that an explicit charset parameter broke some
older browsers?  Huh...


Philip Guenther
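
For illustration, the defaulting rule from section 3.7.1 quoted above can
be sketched in Python. This is only a rough sketch under stated
assumptions, not code from any browser or from the RFC: the function name
effective_charset is hypothetical, and the parsing is deliberately naive
(it does not handle quoted parameter values that contain semicolons).

```python
def effective_charset(content_type):
    """Return the charset a recipient should assume for a Content-Type value.

    Hypothetical helper: an explicit charset parameter always wins;
    otherwise text/* media types fall back to ISO-8859-1, as RFC 2616
    section 3.7.1 describes. Other media types get no default (None).
    """
    media_type, _, params = content_type.partition(";")

    # Look for an explicit charset parameter first.
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"').lower()

    # No explicit label: apply the RFC 2616 default for text subtypes.
    if media_type.strip().lower().startswith("text/"):
        return "iso-8859-1"
    return None


if __name__ == "__main__":
    for header in ("text/html", "text/html; charset=UTF-8", "image/png"):
        print(header, "->", effective_charset(header))
```

Under this reading, a page served as plain "text/html" would be treated
as ISO-8859-1 no matter what the user has configured as a preferred
default.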