Many TrueType fonts include an OS/2 table which holds codePageRange bits.  
These bits indicate the old OS/2 code pages supported by the font, and 
hence indirectly indicate which languages the font is intended to support.

These tables, however, are quite primitive: with only 64 bits in total, 
they can indicate support for very few languages.
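
To make the limitation concrete, here is a rough sketch of decoding those
bits.  The bit assignments are taken from the published OS/2 table
description and the list is deliberately partial; the function name and the
two-argument form (the low and high 32-bit words of the field) are my own
choices for illustration, not anything fontconfig provides.

```python
# Partial map of OS/2 codePageRange bit positions to code pages.
# Only a subset of the assigned bits is listed here.
CODE_PAGE_BITS = {
    0: "1252 Latin 1",
    1: "1250 Latin 2 (Eastern Europe)",
    2: "1251 Cyrillic",
    3: "1253 Greek",
    4: "1254 Turkish",
    5: "1255 Hebrew",
    6: "1256 Arabic",
    7: "1257 Windows Baltic",
    8: "1258 Vietnamese",
    16: "874 Thai",
    17: "932 JIS/Japan",
    18: "936 Chinese: Simplified",
    19: "949 Korean Wansung",
    20: "950 Chinese: Traditional",
    21: "1361 Korean Johab",
}


def decode_code_page_ranges(range1, range2=0):
    """Return the names of the listed code pages whose bits are set.

    range1 holds bits 0-31 and range2 holds bits 32-63 (hypothetical
    parameter names for the two 32-bit words of the field).
    """
    bits = (range2 << 32) | range1
    return [
        name
        for bit, name in sorted(CODE_PAGE_BITS.items())
        if (bits >> bit) & 1
    ]


if __name__ == "__main__":
    # A font claiming Latin 1 and Cyrillic only.
    print(decode_code_page_ranges(0b101))
```

None of the bits listed above corresponds to an Indic script, which is the
gap the coverage tables are meant to fill.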

My question is whether I should take these TrueType fonts and test them 
against my new coverage tables, at least for languages which aren't 
covered by the codePageRange bits.
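
The test I have in mind is roughly the following.  This is only a sketch
under the assumption that both the font's character map and a language's
coverage table can be reduced to plain sets of Unicode code points; the
function names are hypothetical and this is not the fontconfig code itself.

```python
def missing_chars(font_chars, lang_chars):
    """Return the sorted code points a language needs that the font lacks."""
    return sorted(set(lang_chars) - set(font_chars))


def supports(font_chars, lang_chars):
    """True when the font covers every code point in the language table."""
    return not missing_chars(font_chars, lang_chars)


if __name__ == "__main__":
    # Toy data: a font with printable ASCII only.
    font = set(range(0x20, 0x7F))
    basic_latin = set(range(ord("a"), ord("z") + 1))
    with_e_acute = basic_latin | {0xE9}

    print(supports(font, basic_latin))
    print([hex(c) for c in missing_chars(font, with_e_acute)])
```

Reporting the missing code points, rather than just a yes/no answer, makes
it easier to see whether a font fails by one stray character or by a whole
script.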

I now have coverage information for 76 of the 139 ISO 639-1 language 
names; I used the Unicode code charts to mark coverage for the Indic 
languages and a few other scripts:

        Bengali (BN)
        Tibetan (BO)
        Gujarati (GU)
        Khmer (KM)
        Kannada (KN)
        Lao (LO)
        Malayalam (ML)
        Mongolian (MN)
        Oriya (OR)
        Sinhala (Sinhalese) (SI)
        Tamil (TA)
        Telugu (TE)
        Tagalog (TL)

Given that these languages have unique alphabets, this method seems 
relatively sound.  I'm still missing several Indic languages and
all of the non-Arabic African languages.
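
As a rough illustration of why a script of its own makes this easy, the
sketch below maps each language code above to the Unicode block of the same
name.  Treating a whole block as the coverage set is my simplification for
this example (the real tables may list a narrower set of assigned
characters), and the names used are hypothetical.

```python
# ISO 639-1 code -> (first, last) code point of the like-named Unicode block.
# Whole blocks are an approximation of the actual coverage tables.
UNICODE_BLOCKS = {
    "bn": (0x0980, 0x09FF),  # Bengali
    "bo": (0x0F00, 0x0FFF),  # Tibetan
    "gu": (0x0A80, 0x0AFF),  # Gujarati
    "km": (0x1780, 0x17FF),  # Khmer
    "kn": (0x0C80, 0x0CFF),  # Kannada
    "lo": (0x0E80, 0x0EFF),  # Lao
    "ml": (0x0D00, 0x0D7F),  # Malayalam
    "mn": (0x1800, 0x18AF),  # Mongolian
    "or": (0x0B00, 0x0B7F),  # Oriya
    "si": (0x0D80, 0x0DFF),  # Sinhala
    "ta": (0x0B80, 0x0BFF),  # Tamil
    "te": (0x0C00, 0x0C7F),  # Telugu
    "tl": (0x1700, 0x171F),  # Tagalog
}


def languages_for(codepoint):
    """Return the language codes whose block contains the code point."""
    return sorted(
        lang
        for lang, (first, last) in UNICODE_BLOCKS.items()
        if first <= codepoint <= last
    )


if __name__ == "__main__":
    print(languages_for(0x0C05))  # a code point in the Telugu block
    print(languages_for(0x0041))  # LATIN CAPITAL LETTER A, in no block above
```

Because these blocks do not overlap, a code point maps to at most one
language in this table, which is what makes the approach workable for these
scripts and much less so for languages sharing the Latin alphabet.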

I did remove the @ and ` marks from the Latin scripts; that should leave 
each of them containing only its alphabet.

I've also committed this whole mess to XFree86 CVS; the coverage 
files can be found in xc/lib/fontconfig/fc-lang/*.orth

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

