On Saturday, April 20, 2002, at 04:53, Autrijus Tang wrote:
> I've been immersed in Big5-related issues in the past few days, and
> came back with these last-minute (err, week?) changes before 5.8-RC1.
>
> The Diff contains fixes to TW.pm, Alias.pm, and README.(tw|cn).

Excellent!

> (For Dan) big5-hkscs should be upgraded to the 2001 edition, as per
> the Hong Kong government's decree. It's available separately at:
>
>     http://egb.elixus.org/~autrijus/big5-hkscs.ucm.gz
>
> Also, please delete big5.ucm and replace it with big5-eten, at:
>
>     http://egb.elixus.org/~autrijus/big5-eten.ucm.gz

Thus updated.  I needed to update TW/Makefile.PL and
lib/Encode/Config.pm (so it loads on 'big5-eten' instead of just
'big5'), but that's not a big deal at all.

> I've fixed Alias.pm so big5 aliases to big5-eten. The reason is that
> the 'Big5' as originally defined isn't used anywhere on earth;
> non-Microsoft systems use 'big5' to mean 'big5-eten', and Microsoft
> uses 'big5' to mean 'cp950'.
>
> It is therefore unwise to have a canonical 'big5' encoding, much like
> there should not be a 'gb2312' encoding. Since gb2312 is now aliased
> to euc-cn and not cp936, I think big5 should alias to big5-eten and
> not cp950.

I agree.  AFAIK, Big5 is the only major CJK encoding not endorsed by the
government.  What's funny is that there seems to be less confusion
between encodings in Taiwan than in Japan or Korea.  Japan is the worst,
using Shift_JIS, EUC-JP, ISO-2022-JP(-[12])? and now Unicode.  IMHO,
however, the Japanese people should be proud of making multibyte
character encoding a reality, but I can't help wondering whether this
mess is way too high a price to pay :)
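
As a rough illustration of the aliasing agreed on above, here is a
minimal Python sketch.  The ALIASES table and the resolve() helper are
hypothetical names made up for this note; they are not what Alias.pm
actually contains.  Only the two mappings themselves (big5 to big5-eten,
gb2312 to euc-cn) come from this thread.

```python
# Hypothetical alias table mirroring the choices discussed in this thread.
# This is an illustration only, not the contents of Encode::Alias.
ALIASES = {
    "big5": "big5-eten",  # the ambiguous name maps to big5-eten, not cp950
    "gb2312": "euc-cn",   # the ambiguous name maps to euc-cn, not cp936
}


def resolve(name):
    """Return the canonical encoding name for a possibly aliased name.

    Names are compared case-insensitively; names with no alias entry
    are returned unchanged apart from lowercasing.
    """
    key = name.strip().lower()
    return ALIASES.get(key, key)


if __name__ == "__main__":
    for name in ("Big5", "GB2312", "cp950"):
        print(name, "->", resolve(name))
```

The point of the design is that the ambiguous names never denote an
encoding of their own: asking for 'big5' always lands on one explicitly
named variant, and anyone who wants the Microsoft flavour has to say
'cp950'.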

> Oh, I just noticed that Dan retained the 'gb2312.ucm' name, although
> the encoding is called 'gb2312-raw'. I admit that I don't fully
> understand the reason, but if that's to stand, then big5-eten could also
> be named 'big5.ucm', and still say '<code_set_name> "big5-eten"', for
> consistency's sake.

I renamed big5.ucm to big5-eten.ucm.  The "-raw" suffix is missing from
the *.ucm filenames simply because such names look too funny on 8.3
filesystems, nothing more :)

> Thanks,
> /Autrijus/

Xin     Ku      Le      !
\x{8f9b}\x{82e6}\x{4e86}

Xiao    Si       Dan
\x{5c0f}\x{98fc} \x{5f3e}\n
