Maciej Stachowiak wrote:
On Jul 27, 2007, at 12:09 PM, Jonas Sicking wrote:
Anne van Kesteren wrote:
I've been looking at overrideMimeType implementations in Gecko and
WebKit and it seems like they differ a bit. In Gecko it has to be
invoked before send(), but in WebKit it would work if you invoke it
just before getting responseXML or responseText. Neither
implementation seems to do any input checks.
If you have any opinion on how it should be specified I suppose now
would be the time to air your thoughts.
Of course I prefer the Mozilla way :)
It does seem fairly complicated to allow it to be set after the
download is finished though. You do have the stream stored in
.responseBody, but at that point all encoding information has been
lost. For HTML parsing (which I hope the spec will support in the
future) there are a pile of rules used to guess the encoding, all of
which would be useful to use, but can't be used if all you have access
to is the unencoded responseBody.
Why would the encoding information be lost? The only sources of encoding
info are the responseText itself and HTTP headers, both of which
XMLHttpRequest needs to provide anyway.
responseText is not the raw byte stream gotten off the wire; it is
already decoded into UTF-16 using whatever algorithm we define for
determining the encoding. HTML decoding is a lot more complicated since
you have to first guess an encoding, then start to parse the document,
but if you find a
<meta http-equiv="Content-Type" content="text/html; charset=?">
where charset is different from what you guessed, you have to restart
from the beginning using the charset defined in the meta tag.
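
To make the restart step concrete, here is a rough sketch of guess, scan
for a meta charset, and re-decode from the start if it disagrees. The
function name decodeHtml, its return shape, and the regular expression
are made up for illustration; real HTML encoding detection has many more
rules than this, and the sketch assumes a runtime where TextDecoder
knows the labels being used.

```javascript
// Hypothetical helper, not taken from any browser implementation.
// Decodes HTML bytes with a guessed charset, then restarts from the
// beginning if a <meta ... charset=...> names a different encoding.
function decodeHtml(bytes, guessedCharset) {
  let decoder = new TextDecoder(guessedCharset);
  let text = decoder.decode(bytes);
  let restarted = false;

  // Simplified scan: a real parser would tokenize instead of using a regex.
  const match = /<meta[^>]*charset\s*=\s*["']?([\w-]+)/i.exec(text);
  if (match) {
    let declared = null;
    try {
      declared = new TextDecoder(match[1]);
    } catch (e) {
      // Unknown or unsupported label: keep the original guess.
    }
    // Compare normalized encoding names so aliases do not force a restart.
    if (declared !== null && declared.encoding !== decoder.encoding) {
      decoder = declared;
      text = decoder.decode(bytes); // restart from the first byte
      restarted = true;
    }
  }
  return { charset: decoder.encoding, text, restarted };
}
```

Note that the restart is only possible because the raw bytes are still
around; the text from the first pass is of no use for re-decoding.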
Yes, it would definitely be possible for the implementation to keep
around the raw byte stream and either lazily decode responseText, or
keep both the UTF-16 responseText and the raw byte stream around.
That is somewhat quirky behavior, though, since setting overrideMimeType
could then change the encoding and therefore both responseXML and
responseText.
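
A minimal sketch of the "keep the raw bytes and decode lazily" idea, to
show where the quirk comes from. LazyResponse and its fields are
hypothetical names, not part of XMLHttpRequest, and the overrideMimeType
here only looks at the charset parameter; responseXML is left out.

```javascript
// Hypothetical response holder, not a real XMLHttpRequest implementation.
class LazyResponse {
  constructor(rawBytes, headerCharset) {
    this.rawBytes = rawBytes;     // bytes as received off the wire
    this.charset = headerCharset; // charset taken from the HTTP headers
    this.cachedText = null;       // decoded text, filled in on first read
  }

  // Loosely modelled on overrideMimeType; only the charset is handled.
  overrideMimeType(mime) {
    const match = /charset\s*=\s*"?([\w-]+)/i.exec(mime);
    if (match) {
      this.charset = match[1];
      this.cachedText = null; // invalidate so the next read re-decodes
    }
  }

  // Decode on demand and cache the result until the charset changes.
  get responseText() {
    if (this.cachedText === null) {
      this.cachedText = new TextDecoder(this.charset).decode(this.rawBytes);
    }
    return this.cachedText;
  }
}
```

With this design, reading responseText, calling overrideMimeType with a
different charset, and reading responseText again can give two
different strings for the same response.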
/ Jonas