On Saturday, March 30, 2002, at 03:24, Dan Kogai wrote:
> Okay.  I've checked
>
> http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/
>
> One more time and it seems that other missing encodings are available
> as well, such as korean.  I'll look into that.

I think I have found the reason why some of the encodings were missing
from Tcl's *.enc files, which later turned into *.ucm.  Apple makes
extensive use of Unicode compound characters (one legacy code mapped to
a sequence of Unicode code points), which doesn't go well with .ucm,
not to mention *.enc.

http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT
> # Apple additions - vertical forms
> 0xEB41  0x3001+0xF87E  # vertical form for IDEOGRAPHIC COMMA
  ^^^^^^ Mac Japanese, then Unicode Character

Encode/macJapan.ucm
> <UF8B5> \xEB\x41 |0 # Private Use

So they already conflict.  While MacJapanese doesn't have many such
mappings, MacKorean has lots of them.  No wonder it is not listed in
Tcl.

I wonder which one I should trust, but I have reason to believe Apple
still considers the maps at unicode.org canonical.  Take HFS+, for
example.  The word 'Hangul' consists of two syllables, two characters
in KSC5601 (han-gul).  But on HFS+, it is broken up into h-a-n-g-u-l.

Though it is possible to mangle enc2xs to make such mappings (it can
handle, in theory, any n-byte to n-byte conversion), the UCM format
does not seem to be designed that way.

Hmm....  Let me think about it for a while...

Well, it's only vendor mapping, and Encode support has already matched
that of major browsers.  So it is already practical enough, and I
believe the level of support is good enough for 5.8.0.  Maybe the
vendor mappings that are missing could be diverted to
Encode::Vendors::(Apple|MS) or something....

Dan the Encode Maintainer
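
To make the two cases above concrete, here is a minimal sketch in Python (not part of Encode or enc2xs). The helper name parse_apple_mapping is hypothetical, and it only handles the plain "0xNNNN 0xNNNN+0xNNNN" form quoted above. The Hangul part uses standard Unicode NFD as a stand-in for what HFS+ does; HFS+ applies its own decomposition rules, which are close to NFD but not identical.

```python
import unicodedata


def parse_apple_mapping(line):
    """Parse one data line of an Apple mapping table (hypothetical helper).

    Returns (legacy_bytes, code_points), or None for comment/blank lines.
    Assumes the plain hex form shown in the post; other notations that
    Apple tables may use (e.g. direction tags) are not handled.
    """
    data = line.split("#", 1)[0].split()
    if len(data) < 2:
        return None
    legacy = int(data[0], 16)
    # Use as many bytes as the legacy code needs (at least one).
    legacy_bytes = legacy.to_bytes((legacy.bit_length() + 7) // 8 or 1, "big")
    # "0x3001+0xF87E" means a sequence of code points, not a single one.
    code_points = [int(cp, 16) for cp in data[1].split("+")]
    return legacy_bytes, code_points


# The Mac Japanese line quoted above: one legacy code, two code points.
line = "0xEB41\t0x3001+0xF87E\t# vertical form for IDEOGRAPHIC COMMA"
legacy_bytes, code_points = parse_apple_mapping(line)
is_one_to_many = len(code_points) > 1

# 'Hangul' as two precomposed syllables (U+D55C, U+AE00) ...
hangul = "\ud55c\uae00"
# ... and as conjoining jamo after canonical decomposition (NFD).
decomposed = unicodedata.normalize("NFD", hangul)
jamo = [f"U+{ord(ch):04X}" for ch in decomposed]
```

A mapping with is_one_to_many set is the kind that macJapan.ucm replaces with a single private-use code point, and decomposed shows how two syllables turn into six jamo (h-a-n-g-u-l).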