Around 16 o'clock on Jul 6, David Starner wrote:
> These aren't that useful, but
> vo (Volapük): a ä b c d e f g h i j k l m n o ö p r s t u ü v x y z.
>...
> chr (Cherokee):

I'd like a complete set of ISO 639-1 languages, and Volapük is a welcome addition. I'll also start adding the ISO 639-2 languages as I receive them, but I don't expect to get a complete set of those any time soon.

> Punctuation (not listed for Dutch?) is the same as German.

The goal is to list only the alphabet, abjad or logography needed to represent the complete language; punctuation has too many possible encodings and might accidentally mischaracterize some fonts. One outstanding question is whether we should include numerals; I'm willing to listen to arguments on both sides of that issue.

For logographic languages, I'm using standard encodings and stripping out the non-language-specific bits. So far, that's working pretty well, but I may want to reduce the sets somewhat to make sure I don't miss any fonts. Of course, the key is to include codepoints not generally included in fonts for other languages. That's been less successful: the Simplified Chinese font 'simsun' contains every Han codepoint in Big5.

Again, we're fortunate that more and more fonts are pre-tagged with OS/2 language tables, from which we can deduce the intended language targets far more accurately.

Keith Packard
XFree86 Core Team
HP Cambridge Research Lab

_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n
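
As an illustration of the coverage test discussed in the message, here is a minimal sketch in Python. It is not the actual implementation: the function names (`supports`, `distinctive`) and the `max_missing` tolerance are hypothetical, and the German set is supplied only for comparison. Only the Volapük letters come from the message itself. The idea is that a font is a candidate for a language when it covers the language's required characters, and that a language is easy to tell apart only when its set contains characters other languages do not need.

```python
# Sketch only: names and the tolerance parameter are assumptions,
# not taken from any real library.

# Letters listed for Volapük in the quoted message.
VOLAPUK = set("aäbcdefghijklmnoöprstuüvxyz")

# German lowercase alphabet, added here purely for comparison.
GERMAN = set("abcdefghijklmnopqrstuvwxyzäöüß")


def supports(font_coverage, required, max_missing=0):
    """Return True if the font lacks at most max_missing required characters."""
    missing = set(required) - set(font_coverage)
    return len(missing) <= max_missing


def distinctive(required, others):
    """Return the characters in `required` that no other listed language needs."""
    result = set(required)
    for other in others:
        result -= set(other)
    return result


if __name__ == "__main__":
    ascii_font = set("abcdefghijklmnopqrstuvwxyz")  # hypothetical ASCII-only font
    latin1_font = ascii_font | set("äöüß")          # hypothetical wider font

    print(supports(ascii_font, VOLAPUK))   # lacks ä, ö, ü
    print(supports(latin1_font, VOLAPUK))
    print(sorted(distinctive(VOLAPUK, [GERMAN])))
```

With these sets, every Volapük letter is also a German letter, so `distinctive` finds nothing that separates the two. This is the same difficulty the message describes for 'simsun' and Big5: a font built for one language may happen to cover everything another language requires.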