Kaixo!

On Sat, Jun 29, 2002 at 05:17:04PM -0700, Keith Packard wrote:

> > What are those glyphs? (I'm quite surprised, I would have expected the
> > opposite: fonts generally have more glyphs than the standard encodings of
> > the iso-8859 family for example)
>
> My definition of language tag is coloured by the OS/2 table codePageRange
> bits from which it was originally defined in fontconfig.  Those bits are
> defined to map to specific Windows code pages; the Latin-1 case doesn't
> map to ISO 8859-1, but rather to code page 1252 for which many fonts are
> missing a few random entries.

But what characters are those? Is it possible that they are the ones that
have been added to cp1252 and that didn't exist some years ago?

I think the matching should be done against the lowest common denominator
and be strict; or it should give different weights to missing *letters*
and to missing other symbols (it may be more or less acceptable to get
quotation marks from another font; bUt lEttErs frOm A dIffErEnt fOnt ArE
vErY UglY).

> > No, the tolerance for missing glyphs in CJK tests should be the same or
> > even smaller. The difference is that it isn't necessary to test all the
> > glyphs for CJK coverage; testing only a set of 256 chosen glyphs would be
> > enough (if they are correctly chosen, testing that 256 glyphs are present
> > in a font is enough to assure, with 99.99% confidence, that it covers a
> > given CJK language).
>
> I'm not confident enough of this approach; I fear that any set of 256
> glyphs that must appear in a simplified Chinese font may well appear in
> many traditional Chinese (or even Japanese) fonts.

Most do, of course, but there are a lot that don't. I have only dealt with
about 10-15 TTF CJK fonts, but I never had false positives using that method.

> > out there that doesn't encode all the characters of gb2312?
>
> It seems that this must be the case -- I set the '500' number so high
> because all of the fonts which I have that advertise support for
> simplified Chinese are missing over 200 glyphs from GB2312.  I got
> similar results for Japanese fonts, Korean Wansung fonts and traditional
> Chinese fonts.

But which characters are missing? Could it be that they are semi-graphic
ones, or scripts used by other languages (e.g. Cyrillic, Greek, Japanese
kana in a Chinese font, etc.)? Here too, different weights should be used:
it is not a big problem if a CJK font is missing Cyrillic, since a font
designed for Russian will be a much better choice to render Cyrillic anyway;
but it may be a big problem if some needed characters are missing.

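The weighting idea above (missing letters count more than missing symbols)
could be sketched roughly as follows. This is only an illustration in
Python, not how fontconfig does it: the function names, the weights (10 for
a letter, 1 for anything else) and the threshold are all made-up
assumptions, and "letter" here simply means a Unicode general category
starting with "L".

```python
import unicodedata

# Hypothetical weights, chosen only for illustration: a missing letter is
# penalised much more heavily than a missing punctuation mark or symbol.
LETTER_WEIGHT = 10.0
OTHER_WEIGHT = 1.0


def missing_penalty(required, covered,
                    letter_weight=LETTER_WEIGHT, other_weight=OTHER_WEIGHT):
    """Weighted penalty for characters in `required` absent from `covered`."""
    penalty = 0.0
    for ch in set(required) - set(covered):
        # General categories Lu, Ll, Lt, Lm and Lo all start with "L".
        if unicodedata.category(ch).startswith("L"):
            penalty += letter_weight
        else:
            penalty += other_weight
    return penalty


def supports(required, covered, max_penalty=5.0):
    """Accept the font if the weighted penalty stays under the threshold.

    `max_penalty` is an assumed value: with the weights above, a single
    missing letter is already enough to reject the font.
    """
    return missing_penalty(required, covered) <= max_penalty


# Required repertoire: two letters, two quotation marks and the euro sign.
required = "a\u00e9\u201c\u201d\u20ac"
print(supports(required, "a\u00e9"))              # only symbols are missing
print(supports(required, "a\u201c\u201d\u20ac"))  # the letter e-acute is missing
```

For the CJK case the same shape would apply, but the weight would have to
depend on the script of the missing character (ideographs versus Cyrillic or
Greek), which this sketch does not attempt.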
And I'm really surprised by such a high number as 200. Are you sure you
tested against gb2312 and not against the Microsoft code page based on it
(which surely adds several extra characters)?

> > But to handle such case, I think it would be better to choose a given
> > definition of "big5" (or several of them) and stick to it, rather than
> > allowing a so tremendously big hole as 500 possible missing chars.
>
> Missing 500 from a repertoire of nearly 20000 doesn't seem to render most
> of these fonts unusable.

It could; it depends on which glyphs are missing.

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.stben.be/pablo/ PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Italian or Portuguese]
_______________________________________________
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts