On Tue, 26 Mar 2002, Jungshik Shin wrote:

> > really means euc-cn and charset="ks_c_5601-1987" really means euc-kr.
> > Sadly this misconception is embedded in popular browsers.
> M$ OE, M$ Frontpage keep producing html docs. However,
> it also has to be noted that the encoding
> designated as 'ks_c_5601-1987' by M$ is NOT the same as
> EUC-KR BUT their proprietary extension of EUC-KR, namely
> CP949/UHC/(X-)Windows-949.

Therefore, I'd like to suggest (or rather do) the following for Korean
encodings:

- Add an X-Windows-949 converter
- Make 'ks_c_5601-1987', 'X-UHC', 'UHC', and 'CP949' aliases of
  'X-Windows-949'
- Add a JOHAB converter
- Remove the 'ksc5601' alias for 'euc-kr'

Since some existing data is in X-Windows-949 but mislabeled as EUC-KR,
it might be necessary to make the 'euc-kr' -> Unicode converter generous
and have it act as an 'X-Windows-949' -> Unicode converter (whether this
is desirable and necessary depends on what applications Encode may be
used for). In the other direction (Unicode -> euc-kr), however, it has
to be strictly compliant with the standard.

See <http://bugzilla.mozilla.org/show_bug.cgi?id=131388>

Jungshik Shin
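
P.S. To make the generous-decode / strict-encode idea concrete, here is a
rough sketch in Python rather than in Encode itself. It is only an
illustration under assumptions: Python's built-in 'cp949' codec stands in
for the proposed X-Windows-949 converter, and the table LABEL_TO_CODEC and
the helpers decode_korean() and encode_korean() are hypothetical names made
up for this example, not part of Encode or any other library.

```python
# Hypothetical label table following the proposal above. Labels that in
# practice denote Microsoft's extension are routed to Python's built-in
# 'cp949' codec, used here as a stand-in for X-Windows-949.
# 'ksc5601' is deliberately absent, mirroring the suggested alias removal.
LABEL_TO_CODEC = {
    "x-windows-949": "cp949",
    "ks_c_5601-1987": "cp949",
    "x-uhc": "cp949",
    "uhc": "cp949",
    "cp949": "cp949",
    "euc-kr": "euc_kr",
    "johab": "johab",
}


def decode_korean(data: bytes, label: str, lenient: bool = True) -> str:
    """Decode bytes to text; optionally treat 'euc-kr' input generously."""
    codec = LABEL_TO_CODEC[label.strip().lower()]
    if lenient and codec == "euc_kr":
        # Mislabeled X-Windows-949 data is common, so decode with the
        # superset codec when the caller asks for generous behaviour.
        codec = "cp949"
    return data.decode(codec)


def encode_korean(text: str, label: str) -> bytes:
    """Encode text to bytes; the extended codec is never substituted here."""
    codec = LABEL_TO_CODEC[label.strip().lower()]
    return text.encode(codec)


if __name__ == "__main__":
    # U+B620 is a Hangul syllable outside the 2350 syllables of KS X 1001,
    # so its X-Windows-949 bytes are not valid plain EUC-KR.
    mislabeled = "\ub620".encode("cp949")
    print(decode_korean(mislabeled, "euc-kr") == "\ub620")
    try:
        decode_korean(mislabeled, "euc-kr", lenient=False)
    except UnicodeDecodeError as err:
        print("strict decode rejected the input:", err.reason)
```

Keeping the leniency in the decoder only means that data written out under
the 'euc-kr' label never goes through the extended codec, while data read
in under that label is still accepted when it was produced by software
that mislabels X-Windows-949.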