On Mar 31, 2010, at 22:11, Boris Zbarsky wrote:

> On 3/31/10 10:37 AM, Henri Sivonen wrote:
>> Gecko sets the document's character encoding to UTF-8 and uses UTF-8 to 
>> decode the external resource.
> 
> One more clarifying question.... Does Gecko use UTF-8, or the encoding of 
> whatever document open() got called on?

Gecko uses the encoding of the document that open() got called on.
http://hsivonen.iki.fi/test/moz/document-open-initial-charset.htm

>> WebKit uses the encoding of the opener. IE8 (both with compat view button 
>> pressed and not pressed) sets the document's character encoding to "unicode" 
>> and uses UTF-8 to decode the external resource. Opera uses Windows-1252 to 
>> decode the external resource.
> 
> Similar question for IE.

IE6 and IE8 set the encoding to "unicode" and use UTF-8 to decode the external 
resource even if the document that open() was called on had a different meta 
charset.

It seems that WebKit uses the encoding of the document that open() was called 
on *and* about:blank in an iframe inherits the encoding of its parent, which is 
why I previously thought WebKit used the encoding of the opener.

Furthermore, I was wrong when I thought Opera didn't support document.charset 
and document.characterSet. It does support them, but document.open()ed docs 
have the document's character encoding set to the empty string and the empty 
string means the user's fallback encoding (Windows-1252 by default) for the 
purpose of external resources.

From the evidence so far, assuming that IE is axiomatically sufficiently Web 
compatible here, it seems to me that making document.open() set the encoding to 
UTF-8 and ignoring meta charset in document.open()ed documents could work. I 
can also see why retaining the encoding of the document that open() was called 
on could be preferable, but so far I'm not persuaded that meta charset in 
document.open()ed documents should have an effect.
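
To keep the observations above straight, here is a rough TypeScript sketch of 
which encoding each engine appears to use for external resources in a 
document.open()ed document. It only restates the test results in this message; 
it is not how any engine is implemented. The names Engine, OpenContext and 
externalResourceEncoding are made up for illustration, and the encoding labels 
passed in are just example values.

```typescript
// Hypothetical model of the test results reported in this thread.
// It is a summary of observed behavior, not any browser's actual code.
type Engine = "Gecko" | "WebKit" | "IE" | "Opera";

interface OpenContext {
  // Encoding of the document that open() was called on.
  targetEncoding: string;
  // The user's fallback encoding (Windows-1252 by default in Opera).
  fallbackEncoding: string;
}

// Returns the encoding used to decode external resources (CSS, JS)
// referenced from a document.open()ed document, per the observations.
function externalResourceEncoding(engine: Engine, ctx: OpenContext): string {
  switch (engine) {
    case "Gecko":
    case "WebKit":
      // Both use the encoding of the document that open() was called on.
      return ctx.targetEncoding;
    case "IE":
      // IE6 and IE8 report "unicode" and decode as UTF-8, regardless of
      // the meta charset of the document that open() was called on.
      return "UTF-8";
    case "Opera": {
      // Opera sets the document's character encoding to the empty string,
      // and the empty string means the user's fallback encoding.
      const documentEncoding = "";
      return documentEncoding === "" ? ctx.fallbackEncoding : documentEncoding;
    }
  }
}

// Example values only: a windows-1251 document with Opera's default fallback.
const exampleContext: OpenContext = {
  targetEncoding: "windows-1251",
  fallbackEncoding: "windows-1252",
};
const engines: Engine[] = ["Gecko", "WebKit", "IE", "Opera"];
for (const engine of engines) {
  console.log(engine, externalResourceEncoding(engine, exampleContext));
}
```

Under the proposal above, every branch would return "UTF-8", which is what the 
IE branch already does.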

I verified that CSS and JS are treated the same way:
http://hsivonen.iki.fi/test/moz/document-open-external-charset-style.htm
http://hsivonen.iki.fi/test/moz/document-open-initial-charset-style.htm
http://hsivonen.iki.fi/test/moz/document-open-internal-charset-style.htm

On Apr 1, 2010, at 06:26, And Clover wrote:

> No browser will actually try to submit a form as UTF-16 for this reason, but 
> it still causes problems. eg. Firefox misleadingly sets the `_charset_` hack 
> field to 'UTF-16' even though the submission is UTF-8-encoded.


Is a bug on file? I didn't find a bug about this in Bugzilla.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/

