Anne van Kesteren wrote:
On Mon, 07 Dec 2009 16:42:31 +0100, Julian Reschke
<julian.resc...@gmx.de> wrote:
I think XHR needs to elaborate on how non-ASCII characters in request
headers are put on the wire, and how non-ASCII characters in response
headers are transformed back to JavaScript characters.
Hmm yeah. I somehow assumed this was easy because everything was
restricted to the ASCII range. It appears octets higher than 7E can
occur as well per HTTP.
For request headers, I would assume that the character encoding is
ISO-8859-1, and if a character can't be encoded using ISO-8859-1, some
kind of error handling occurs (ignore the character/ignore the
header/throw?).
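
The three error-handling options named above can be sketched as follows. This is only an illustration in Python, not anything a browser or the XHR draft defines; the function name `encode_header_value` and the `on_error` values are made up for the example.

```python
from typing import Optional

def encode_header_value(value: str, on_error: str = "throw") -> Optional[bytes]:
    """Hypothetical helper: encode a header value as ISO-8859-1.

    on_error selects what happens to characters outside ISO-8859-1:
      "ignore-char"   drop just the offending characters
      "ignore-header" drop the whole header (returns None)
      "throw"         raise UnicodeEncodeError
    """
    if on_error == "ignore-char":
        return value.encode("iso-8859-1", errors="ignore")
    try:
        return value.encode("iso-8859-1")
    except UnicodeEncodeError:
        if on_error == "ignore-header":
            return None
        raise

# U+20AC (euro sign) has no ISO-8859-1 encoding.
print(encode_header_value("a\u20acb", "ignore-char"))
print(encode_header_value("a\u20acb", "ignore-header"))
```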
From my limited testing it seems Firefox, Chrome, and Internet Explorer
use UTF-8 octets. E.g. "\xFF" in ECMAScript gets transmitted as C3 BF
(in octets). Opera sends "\xFF" as FF.
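
To make the two observed behaviours concrete, here is a small Python sketch that prints the octets each encoding yields for U+00FF. Python is used purely for illustration; it only reproduces the byte sequences reported above, not any browser's code.

```python
# Illustration only: octets for the single character U+00FF
# (the same character as "\xFF" in ECMAScript).
value = "\xff"

utf8_octets = value.encode("utf-8")         # as reported for Firefox, Chrome, IE
latin1_octets = value.encode("iso-8859-1")  # as reported for Opera

print(utf8_octets.hex(" ").upper())    # C3 BF
print(latin1_octets.hex(" ").upper())  # FF
```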
For response headers, I'd expect that the octet sequence is decoded
using ISO-8859-1; so no specific error handling would be needed
(although the result may be funny when the intended encoding was
Firefox, Opera, and Internet Explorer indeed do this. Chrome decodes as
UTF-8 as far as I can tell.
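
The same kind of sketch for the response side, again in Python and only as an illustration: it decodes one octet sequence both ways, and shows why ISO-8859-1 decoding needs no error handling while UTF-8 decoding does.

```python
# Octets a UTF-8 sender would put on the wire for U+00FF.
wire = b"\xc3\xbf"

as_latin1 = wire.decode("iso-8859-1")  # two characters: U+00C3, U+00BF
as_utf8 = wire.decode("utf-8")         # one character: U+00FF

# Every octet maps to a code point in ISO-8859-1, so decoding cannot fail.
lone = b"\xff"
lone_latin1 = lone.decode("iso-8859-1")

# A lone FF octet is not valid UTF-8; a decoder has to reject or replace it.
lone_utf8 = lone.decode("utf-8", errors="replace")  # U+FFFD

print(len(as_latin1), len(as_utf8))
```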
I'd love some implementor feedback on the matter.
...
Thanks for doing the testing. The discrepancy between setting and
getting worries me a lot :-).
From HTTP's point of view, the header field value really is opaque. So
you can put anything there, as long as it fits into the header field ABNF.
Of course that only helps if senders and receivers agree on the
encoding. In my experience, server frameworks (servlet API, for
instance) assume ISO-8859-1 here (but that probably should be tested).
For XHR 1 I think the resolution should be to leave this
implementation-specific, and advise users not to rely on anything non-ASCII.
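
For callers who want to follow that advice, a check along these lines would be enough. It is a sketch in Python; the name `is_ascii_header_value` is hypothetical, and treating "safe" as printable ASCII plus space and tab is my assumption, not something the spec says.

```python
def is_ascii_header_value(value: str) -> bool:
    """Hypothetical check: True if value uses only printable ASCII,
    space, or horizontal tab (an assumed definition of 'safe')."""
    return all(ch == "\t" or 0x20 <= ord(ch) <= 0x7E for ch in value)

print(is_ascii_header_value("text/plain"))
print(is_ascii_header_value("caf\xe9"))
```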
Best regards, Julian