Kaixo!

On Wed, Jan 09, 2002 at 05:20:11AM -0500, Glenn Maynard wrote:

> On Wed, Jan 09, 2002 at 03:18:36AM -0600, [EMAIL PROTECTED] wrote:
> > [The German Fraktur font of Latin being unreadable to English and many
> > modern German readers.]
>
> If modern German used this font, and I wanted to mix German and English,
> then I'd definitely want a way to make sure the right font could be used
> for both, if that's what the user wanted.
Even in a text-only, single-font appliance like the display of a VCR
controller, or a GSM phone display? Even on a road sign? Even when you
handwrite the text?

There are cases where the concept of using different fonts for different
portions of text (depending on language or any other criterion) applies,
and other cases where it doesn't apply at all.

> The important goal is to make sure users in other languages aren't so
> annoyed with some lack within the spec that they go miles out of it to
> get what they want.

If they can choose the font they want for display, why would they be
annoyed at all? People who say they are annoyed are in fact annoyed by
the Unicode unification itself more than by anything else; even if that
unification never has any visible consequence in their lives, they will
still be annoyed. If they had never heard of the Unicode unification,
they would never have noticed it.

> I'm still not certain of the exact cause of, for example, EUC-JP and
> Shift-JIS ending up in ID3v2 tags.

EUC-JP and Shift-JIS can encode *only* Japanese; so what is the
difference between encoding Japanese text in a Japanese-only encoding
and using a Japanese-only font, and encoding Japanese text in Unicode
and using a Japanese font? There is absolutely no visible difference.
That is why that proposal is nonsense.

It would be different if the proposal were for another multilingual
encoding similar to, but incompatible with, Unicode (incompatible in
that the character set is different, not just the ordering). But even
then, it would make more sense to add the missing characters to Unicode
than to wrestle with every application on earth to support a
non-standard encoding.

> I'm getting the impression that
> Japanese programmers who wanted Japanese-capable editors didn't use the
> library, and ignored the UTF-8 spec more for political reasons ("I don't
> like Unicode") than practical ones.
> (With Japanese encodings, you still
> can't embed some characters in a Chinese font.) Another possibility is
> that they did use the library, but the library didn't perform the
> appropriate conversions (and they couldn't be bothered to fix it to use
> an encoding they didn't like to begin with.)

The problem is that proper Unicode support needs much more than simply
Japanese support. You need to handle a complex multi-byte encoding, with
multi-width characters too, while Japanese-only encodings are quite
simple: only two kinds of characters, ASCII (1 byte, 1 column) and
Japanese (2 bytes, 2 columns). There are no non-spacing characters, no
combining characters, no characters encoded in 3, 4, 5 or 6 bytes...

On top of that, libraries to convert between the various Japanese
encodings have been around for years; they are mature, there is a lot of
sample code (including real applications) using them, and a lot of
programmers are experienced in using them. UTF-8 is just a new world,
and needs some time to mature to the same level, and to be understood
and used by programmers.

That isn't limited to Japan either; encodings like iso-8859-*, koi8-*,
etc. are still widely used, and still the preferred encodings for a lot
of people. All that will change, of course; but it is an evolution, not
a revolution. It needs time.

On the other hand, for a completely new development, it makes sense to
use Unicode (UTF-8 or another encoding) internally, and to use a good
iconv-like library to convert to the locale encoding if needed. So, as
the Ogg format is quite new, it makes sense to mandate UTF-8 as the
default and *only* encoding used for all embedded text. That also has
the extra advantage of avoiding all the problems of misinterpreting the
encoding. No mojibake.

> Either way, having a stable library that does the appropriate
> conversions will probably go a long way to keeping that from happening
> again with Ogg tags.
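As an aside, the two technical points made earlier — that the choice of
byte encoding is invisible to the reader, and that Unicode display needs
per-character width handling instead of the simple "1 byte = 1 column,
2 bytes = 2 columns" rule — can be sketched in a few lines of Python.
(This is my own illustration, not part of the original discussion, and
cell_width is a deliberate simplification of wcwidth(3).)

```python
import unicodedata

# 1. The same Japanese string in three encodings: the characters are
#    identical, only the byte representation differs, so nothing the
#    reader sees depends on which encoding was chosen.
text = "日本語"
assert text.encode("euc-jp").decode("euc-jp") == text
assert text.encode("shift_jis").decode("shift_jis") == text
assert text.encode("utf-8").decode("utf-8") == text
# EUC-JP and Shift-JIS use 2 bytes per character here; UTF-8 uses 3.
print([len(text.encode(e)) for e in ("euc-jp", "shift_jis", "utf-8")])

# 2. Display width: with Unicode, the column width is a property of each
#    character, not of its byte length (simplified wcwidth(3) logic).
def cell_width(ch: str) -> int:
    if unicodedata.combining(ch):
        return 0                        # combining marks take no cell
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2                        # CJK ideographs, fullwidth forms
    return 1

for s in ("abc", "日本語", "e\u0301"):  # ASCII, CJK, e + combining acute
    print(repr(s), sum(cell_width(c) for c in s))
```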
At least on Microsoft Windows systems and on systems using GNU libc,
there actually is such a library for converting between encodings.

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/		PGP Key available, key ID: 0x8F0E4975

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/