On Thu, 4 Apr 2002, Anton Tagunov wrote:

 Hi Anton,

 Thanks a lot.

> - changes status of KOI8-U on Jungshik's comment
>   (sorry, I have never tested that myself :-(

  I haven't tested it either :-), but both Mozilla/Netscape6 and MS IE
list it in the View|Encoding menu, which I interpret as having support
for it.


>    UTF-16 
> -  KOI8-U        (http://www.faqs.org/rfcs/rfc2319.html)
>  
> -are IANA-registered (C<UTF-16> even as a preferred MIME name)
> +=for comment
> +waiting for comments from Jungshik Shin to soften this - Anton
> +
> +is a IANA-registered preferred MIME name
>  but probably should be avoided as encoding for web pages due to 
> -the lack of browser supports.
> +the lack of browser support.

   The reason your test didn't work with MS IE was probably that
you didn't prepend a BOM (byte order mark) to your UTF-16 HTML document.
Note that the conventional way of informing web browsers of the MIME
charset, putting it in a <meta> tag, doesn't work for UTF-16/UTF-32.
Either you have to configure your web server to emit a Content-Type
header with 'charset=UTF-16(LE|BE)' or you have to put a BOM at the
beginning. When a BOM is present, MS IE 5/6, Mozilla/Netscape6 and
Netscape4 have no problem rendering UTF-16(LE|BE) encoded pages. I put
up a couple of test pages at

   http://jshin.net/i18n/utf16le_kr2.html
   http://jshin.net/i18n/utf16be_kr2.html
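
In case it helps anyone reproduce the test, here is a minimal sketch
(in Python, not Perl, purely for illustration) of how a page could be
written out as UTF-16 with a BOM in front. The helper name
encode_utf16_with_bom and the sample HTML string are my own
assumptions; they are not how the test pages above were produced.

```python
import codecs


def encode_utf16_with_bom(text: str, little_endian: bool = True) -> bytes:
    """Encode text as UTF-16 and put an explicit byte order mark in front.

    Hypothetical helper for illustration only.
    """
    if little_endian:
        # BOM for little endian is the byte pair FF FE.
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    # BOM for big endian is the byte pair FE FF.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


# Sample document (an assumption, not the content of the test pages).
html = "<html><head><title>test</title></head><body>한글</body></html>"

le_bytes = encode_utf16_with_bom(html, little_endian=True)
be_bytes = encode_utf16_with_bom(html, little_endian=False)

# The generic "utf-16" decoder reads the BOM to pick the byte order,
# which is the same hint a browser relies on when no charset is sent.
round_trip_le = le_bytes.decode("utf-16")
round_trip_be = be_bytes.decode("utf-16")
```

The bytes can then be written to a file opened in binary mode. Without
the BOM, the only other reliable hint is the charset parameter of the
Content-Type header.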

For more details on UTF-16 and HTML, you can refer to HTML4 spec. at
 
  http://www.w3.org/TR/html4/charset  (see section 5.2.1)

As I wrote before, I have no intention of encouraging the use of UTF-16
over UTF-8, although some people whose primary script has a more
'economical' (in terms of file size) representation in UTF-16 than in
UTF-8 may want to use it.
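
To make the file-size point concrete, here is a small sketch (Python
again, for illustration only; the function name encoded_sizes and the
sample strings are assumptions). A Hangul syllable takes 3 bytes in
UTF-8 but 2 in UTF-16, while ASCII text goes the other way.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Two Hangul syllables: 3 bytes each in UTF-8, 2 bytes each in UTF-16.
hangul_sizes = encoded_sizes("한글")

# Four ASCII letters: 1 byte each in UTF-8, 2 bytes each in UTF-16.
ascii_sizes = encoded_sizes("html")
```

For a real HTML page the markup itself is ASCII, so the saving depends
on how much of the file is actual text in the script in question.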


> +=head2 Microsoft-related naming mess
> +
> +Microsoft products misuse the following names:
> +
> +=over 2
> +
> +=item KS_C_5601-1987
> +
> +Microsoft extension to C<EUC-KR>.
> +
> +Proper name: C<CP949>.
> +
> +See
> +http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0033.html
> +for details.

 Wow, I didn't know that Martin wrote this. Thanks a lot for
digging this up. He 'rediscovered' what a lot of people in Korea had
complained about. One thing I don't agree with him on is what designation
to use for CP949. I think it'd better be 'windows-949' because that's
more in line with other MS code pages such as windows-125x (for European
scripts). By the same token, the MS version of Shift_JIS can be labeled
'windows-932'. At the moment, Mozilla uses 'x-windows-949' for CP949/UHC
because it's not yet registered with IANA. I should probably contact
Martin and discuss this issue.

> +Encode aliases C<KS_C_5601-1987> to C<cp949> to reflect
> +this common misusage. 

 If my patch is accepted, cp949 will have a couple more aliases,
'uhc' and '(x-)windows-949'. CP949 is commonly known as
'통합 완성형' (Unified Hangul Code) in Korea.


> +I<Raw> C<KS_C_5601-1987> encoding is available as C<kcs5601-raw>.

  ksc5601-raw had better be renamed ksx1001-raw, and ksc5601-raw
can be made an alias to ksx1001-raw. Please note that what's now called
ksc5601-raw has two new characters which were only added in Dec. 1998,
over a year after the name change (KS C 5601 -> KS X 1001).

> +=item GB2312
> +
> +Encode aliases C<GB2312> to C<euc-cn> in full agreement with
> +IANA registration. C<cp936> is supported separately.
> +I<Raw> C<GB_2312-80> encoding is available as C<kcs5601-raw>.

  Oops... You meant gb2312-raw, didn't you? :-)


> Jungshik, I would have certainly advocated linking not only to
> http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0033.html
> but also to your comments on the KS_C_5601-1987 in the list archive,
> but all your mails were on several subjects each.
> 
> Jungshik> ... refer to Ken Lunde's CJKV Information Processing
> Jungshik> about that 'epic war' between two camps. (see p.197 of
> Jungshik> the book and http://jshin.net/faq/qa8.html)
> Jungshik> We even set up a web page to prevent M$ from spreading that
> Jungshik> ill-defined name.
> 
> maybe we may link to this page? What is the address?

  The campaign web page has since disappeared. It was almost 5 years
ago :-). However, subject 8 of my Hangul FAQ deals with the issue
(http://jshin.net/faq/qa8.html), so you may add a link to it.
Be aware, though, that it's been untouched for a few years (if not
longer) and needs a complete overhaul.



