This situation is closely analogous to the case where an HTTP response is sent with no 
charset parameter, either in the Content-Type header itself or in an HTML META 
element. RFC 2616 is explicit in section 3.7.1:

  "                                           When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP.  "

However, every browser that I have examined violates this and actually guesses the 
character set from other information available to it, such as the locale of the 
machine or an explicit user setting. To my mind the browser manufacturers are correct 
and the standard is wrong. 

One thing that the RFC does get right, in correcting some earlier deviant behavior of 
browsers, is in section 3.4.1:

"3.4.1 Missing Charset

   Some HTTP/1.0 software has interpreted a Content-Type header without
   charset parameter incorrectly to mean "recipient should guess."
   Senders wishing to defeat this behavior MAY include a charset
   parameter even when the charset is ISO-8859-1 and SHOULD do so when
   it is known that it will not confuse the recipient.

   Unfortunately, some older HTTP/1.0 clients did not deal properly with
   an explicit charset parameter. HTTP/1.1 recipients MUST respect the
   charset label provided by the sender; and those user agents that have
   a provision to "guess" a charset MUST use the charset from the
   content-type field if they support that charset, rather than the
   recipient's preference, when initially displaying a document. See
   section 3.7.1." 

In other words: if a charset parameter is there, do as it says. Here the standard is 
almost, but not quite, admitting that the previous RFC 2068 was wrong and the clients 
correct in the absence of a charset parameter. It is a pity that it did not correct 
the error rather than repeating it in section 3.7.1, but this is of little practical 
concern since that section is ignored in practice.
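
To make the difference concrete, here is a rough Python sketch of the two policies. 
The function names are my own invention for illustration, not taken from any browser 
or library, and the "guess" is simply passed in by the caller (a real client might 
derive it from the machine locale or a user setting). The only library calls used 
are the standard email.message.Message.get_content_charset() and codecs.lookup().

```python
import codecs
from email.message import Message

RFC2616_DEFAULT = "ISO-8859-1"  # the default mandated by section 3.7.1


def explicit_charset(content_type):
    """Return the charset parameter of a Content-Type value (lowercased), or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def strict_rfc2616_charset(content_type):
    """Section 3.7.1 taken literally: no charset parameter means ISO-8859-1."""
    return explicit_charset(content_type) or RFC2616_DEFAULT


def browser_style_charset(content_type, guess):
    """Hypothetical browser policy: honor a supported label, otherwise guess.

    'guess' stands in for whatever the client infers from the locale or an
    explicit user setting; it is an assumption of this sketch.
    """
    label = explicit_charset(content_type)
    if label is not None:
        try:
            codecs.lookup(label)  # section 3.4.1: use it "if they support that charset"
            return label
        except LookupError:
            pass  # unsupported label, fall through to the guess
    return guess
```

With a guess of "windows-1252", browser_style_charset("text/html", ...) yields the 
guess, whereas strict_rfc2616_charset("text/html") yields ISO-8859-1. When the 
header carries charset=UTF-8, both agree on the sender's label.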

- Tim
