On Tue, 9 Jun 2009, Anne van Kesteren wrote:
> On Tue, 09 Jun 2009 01:42:57 +0200, Øistein E. Andersen <li...@coq.no> wrote:
> > On 5 June 2009, Anne van Kesteren wrote:
> >>
> >> Is the implication here that Shift_JIS and Shift-JIS are distinct
> >> [...]?
> >
> > No, Shift-JIS and Windows-932 are commonly used names/labels for the
> > encodings that are registered as Shift_JIS and Windows-31J
> > (respectively) in the IANA charset registry. Sorry for the confusion
> > caused.
>
> So should HTML5 mention that Windows-932 maps to Windows-31J? (It does
> not appear in the IANA registry.)

I've added this mapping too, just in case.


On Tue, 9 Jun 2009, Øistein E. Andersen wrote:
>
> That is an interesting question. My (apparently wrong) understanding was
> that the table was merely supposed to provide mappings between
> encodings, since such mappings are inappropriate in non-HTML contexts
> and cannot be added to the IANA registry. It might be useful to
> include a set of MIME charset strings which cannot be or have not yet
> been registered (e.g., x-x-big5, x-sjis, windows-932) as well as
> information on how CJK character sets are implemented in practice, both
> of which seem to be necessary for compatibility.
>
> Such information does not fit comfortably in the current table, though.

Added x-sjis. What are the other mappings that would be good?


On Tue, 9 Jun 2009, Øistein E. Andersen wrote:
> >
> > I believe you misunderstand the purpose of this table. The idea is to
> > give a mapping of _labels_ to encodings, not encodings to encodings.
> > I've clarified the text to this effect.
>
> You seem to have added "specified by a label" to the phrase which now
> reads "an encoding specified by a label given in the first column of the
> following table" without changing the column heading ("Input encoding")
> and without defining what a "label" actually is. The reference to
> "encoding aliasing" is also intact, which seems misleading if the table
> is not supposed to map between encodings.

I've split the table in two to avoid this issue.


Earlier, you wrote:
>
> GB2312 and GB_2312-80 technically refer to the *character set* GB
> 2312-80, [...]. GBK, on the other hand, is an encoding.

As far as I can tell, GB2312 and GB_2312-80 are two different encodings
according to IANA.


On Wed, 10 Jun 2009, Anne van Kesteren wrote:
>
> I would prefer them being added to the IANA registry.

I've noted that I should do that.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'