Re: [Fonts]fontconfig peculiarity(??)
On Fri, 18 Oct 2002, Keith Packard wrote:

> Around 7 o'clock on Oct 18, Jungshik Shin wrote:
>
> > For some unknown reason, 'New Gulim' is picked up by 'fontconfig'
> > or 'Xft' for certain characters when CODE2000 is explicitly
> > requested by applications like Mozilla and gedit (via Pango).
> > More specifically, those characters are U+115F (Hangul leading
> > consonant filler) and U+1160 (Hangul vowel filler).
>
> Fontconfig has a kludge to weed out fonts with broken encoding
> tables; such fonts often have encoding table entries pointing at
> blank glyphs which aren't supposed to be blank. It checks each glyph
> in the encoding and ignores those which are inappropriately blank.
> [...] which are expected to be blank; that list was derived from a
> similar table in Mozilla. Blank glyphs not in the table are assumed
> to represent broken [...]

Keith, we talked about this a month ago (Sep. 7th) on this very list :-)
You came up with a much more extensive list of characters than
Mozilla's blank glyph list. I also filed a bug for Mozilla-Windows
(http://bugzilla.mozilla.org/show_bug.cgi?id=167136). You must have
forgotten about it. :-) I added those two characters to the blank
glyph list in /etc/fonts/fonts.conf then.

In addition, both Ngulim and Code2000 have blank glyphs for both
characters. The only difference is that in Ngulim they're both
*spacing* (width > 0), while in Code2000 only U+115F is spacing and
U+1160 is non-spacing (width = 0). So, even if my blank glyph list
didn't have them, there's no reason I can think of why Ngulim would be
preferred over Code2000 for those characters. If they're equal on this
count, the explicit request ought to take precedence, shouldn't it?

One possible explanation is that Code2000 isn't marked as supporting
'ko' in font-cache for some reason while Ngulim is. However, both fonts
have more or less similar coverage of Korean characters (the full set
of precomposed syllables, Hangul Conjoining Jamos and the other symbols
in KS X 1001). So, this is a bit of a mystery, too.

> [...] weren't included in the table. This means that no font will
> ever be listed as supporting these glyphs, so Mozilla will pick the
> first font in the match list to draw them with, expecting that this
> will produce a missing glyph indication.

BTW, could it be possible to 'deceive' or 'force' fontconfig to believe
that a certain font covers a certain range of Unicode even if it
doesn't appear to? I guess it's not possible at the moment, but
wouldn't it be nice to add it? What I'm thinking of is something like
this:

  <match target="font">
    <test qual="any" name="family"><string>Gulim Old Hangul Jamo</string></test>
    <edit name="coverage" mode="assign" binding="strong"><coderanges>...</coderanges></edit>
  </match>

where coderanges is a comma-separated list of Unicode code points
(integers) or code ranges (something like [0x-0x]). I found in the font
cache file that the 'charset' property does exactly what I want to do
with 'coderanges'. If so, would it be possible to use 'charset' to
achieve what I described above? Well, I've got to figure out how
'charset' represents Unicode ranges.

Some fonts have a hack encoding (although advertised as Unicode), and
fontconfig cannot guess their real Unicode coverage from the Unicode
cmap. An application or library aware of the hack encoding can still
make use of such fonts. However, fontconfig does not appear to return
the requested font, even when it is explicitly specified, and instead
comes up with a fallback after an 'intelligent guess', because what it
thinks the hack-encoded font can cover does not match at all the range
of Unicode the application wants to draw with the font.

It'd also be nice to be able to do something similar with the 'lang'
tag. I thought the following would *make* fontconfig *believe*
(*ignoring* what it finds out from the OS lang tag and orthography map)
that 'Gulim Old Hangul Jamo' is suitable for Korean, but it doesn't
seem to work. Did I do anything wrong?

---
  <match target="font">
    <test qual="any" name="family"><string>Gulim Old Hangul Jamo</string></test>
    <edit name="lang" mode="assign" binding="strong"><string>ko</string></edit>
  </match>
---

Both of these certainly look like hacks, but some applications (perhaps
MathML, Indic script handling, Korean alphabet handling...) need them
until OTFs are widely available. Related problems are discussed at

  http://bugzilla.mozilla.org/show_bug.cgi?id=126919#c315

and the comments referenced therein, and at

  http://bugzilla.mozilla.org/show_bug.cgi?id=95708

Jungshik

___
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts
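The 'coderanges' value proposed in the message above is described only informally. As a rough illustration of how such a list might be read, here is a minimal Python sketch. Everything in it is an assumption: fontconfig has no 'coderanges' element, and the bracketed range form `[0xAAAA-0xBBBB]` and the function name `parse_coderanges` are made up for this example.

```python
import re

# Hypothetical syntax: comma-separated items, each either a single code
# point (decimal or 0x-prefixed hex) or a bracketed inclusive range "[lo-hi]".
_RANGE = re.compile(r"^\[\s*(\w+)\s*-\s*(\w+)\s*\]$")


def parse_coderanges(text):
    """Return the set of code points named by a 'coderanges' string."""
    points = set()
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        m = _RANGE.match(item)
        if m:
            # Base 0 lets int() accept both "4447" and "0x115F".
            lo, hi = int(m.group(1), 0), int(m.group(2), 0)
            if lo > hi:
                raise ValueError(f"empty range: {item}")
            points.update(range(lo, hi + 1))
        else:
            points.add(int(item, 0))
    return points


# The two filler characters from this thread plus a small range.
covered = parse_coderanges("0x115F, 0x1160, [0x1100-0x1102]")
```

A set of integers is used here only because it is the simplest container for membership tests; it says nothing about how fontconfig stores a charset internally.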
Re: [Fonts]fontconfig peculiarity(??)
Around 12 o'clock on Oct 18, Jungshik Shin wrote:

> Keith, we talked about this a month ago (Sep. 7th) on this very
> list :-)

Sorry, I didn't look at the email address from your previous message.

> One possible explanation is that Code2000 isn't marked as supporting
> 'ko' in font-cache for some reason while Ngulim is.

If your font specification includes language, this would cause Ngulim
to be preferred over Code2000 if both are added to the pattern in the
config file. If the application explicitly names 'Code2000' as a family
name, then the language shouldn't matter.

Code2000 isn't marked as supporting Korean as it is missing a large
number of Han glyphs, totalling some 3136 characters from the
KSC 5601-1992 encoding. Many Korean documents will not be completely
covered by this font. It also isn't marked as supporting Japanese or
any of the Chinese languages.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab
Re: [Fonts]fontconfig peculiarity(??)
On Fri, 18 Oct 2002, Keith Packard wrote:

> Around 12 o'clock on Oct 18, Jungshik Shin wrote:
>
> > One possible explanation is that Code2000 isn't marked as
> > supporting 'ko' in font-cache for some reason while Ngulim is.

This explanation only makes sense when those two chars are NOT included
in the blank glyph list, doesn't it? As I wrote, they've been in the
blank glyph list in my fonts.conf since early September.

Hmm, things are getting more interesting. After I removed Ngulim.ttf
from my font path and then put it back (I ran fc-cache before testing),
suddenly Mozilla picks up the U+1160 glyph from Code2000. The same is
true of 'gedit' when Code2000 is specified as the font to use. Is it at
the whim of electrons whirling around inside my computer :-) ?

> If your font specification includes language, this would cause
> Ngulim to be preferred over Code2000 if both are added to the
> pattern in the config file. If the application explicitly names
> 'Code2000' as a family name, then the language shouldn't matter.

The pages in question (http://jshin.net/i18n/korean/hunmin.html and
http://jshin.net/i18n/korean/hunmin_comp.html) specify font-family to
be CODE2000 explicitly with CSS. I assume this will make Mozilla with
Xft enabled ask fontconfig for that font explicitly. As for Pango
(gedit), I'm less certain because I don't know whether Pango specifies
a language when sending font requests down (or up) the road.
Therefore, my original mystery still remains a mystery :-)

> Code2000 isn't marked as supporting Korean as it is missing a large
> number of Han glyphs, totalling some 3136 characters from the
> KSC 5601-1992 encoding. Many Korean documents will not be completely
> covered by this font.

Sorry, I didn't check the Han glyphs; I only checked that it has the
full set of precomposed Hangul syllables (11,172 of them). As I
suggested before, a kind of multi-level orthography check may be
necessary to cope with situations like this.

Or, would it be possible for users to manually override what fontconfig
*detects* (both code range coverage and lang) in fonts.conf, as
suggested in my previous email?

Jungshik
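The "multi-level orthography check" suggested above is not spelled out in the thread. One way to picture it is a two-tier classification: does the font cover all precomposed Hangul syllables, and if so, does it also cover the Han characters a Korean document may need? The Python sketch below assumes exactly that reading. The function name, the tier labels and the `hanja_chars` parameter are hypothetical, and the sketch does not reflect how fontconfig actually derives its language tags.

```python
# Precomposed Hangul syllables occupy U+AC00..U+D7A3 (11,172 code points).
HANGUL_SYLLABLES = frozenset(range(0xAC00, 0xD7A3 + 1))


def korean_support_level(font_chars, hanja_chars):
    """Classify a font's Korean coverage in two tiers (illustrative only).

    font_chars:  set of code points the font can render
    hanja_chars: set of Han code points the caller treats as required,
                 for example those in KS X 1001 (not enumerated here)
    """
    if not HANGUL_SYLLABLES <= font_chars:
        return "none"          # missing at least one Hangul syllable
    if hanja_chars <= font_chars:
        return "full"          # Hangul syllables and the required Hanja
    return "hangul-only"       # usable for Hangul-only text


# Toy data: a font with every syllable but no Han characters at all.
syllables_only = set(HANGUL_SYLLABLES)
level = korean_support_level(syllables_only, {0x4E00})
```

With a check like this, a font in the position described for Code2000 (all syllables, many Han glyphs missing) would land in the middle tier rather than being treated the same as a font with no Korean coverage.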
Re: [Fonts]fontconfig peculiarity(??)
Around 16 o'clock on Oct 18, Jungshik Shin wrote:

> Hmm, things are getting more interesting. After I removed Ngulim.ttf
> from my font path and then put it back (I ran fc-cache before
> testing), suddenly Mozilla picks up U+1160 glyph from Code2000. The
> same is true of 'gedit' when Code2000 is specified as a font to use.
> Is it at the whim of electrons whirling around inside my
> computer :-) ?

fc-cache ignores directories which are older than the associated cache
file; you have to use the '-f' option to force it to rescan the files.
The cache holds the list of available characters in each font, so a
failure to update the cache could easily have been the source of this
problem.

Note that fc-cache doesn't rescan directories when the configuration
changes; the only configuration option which affects the resulting
cache file is the blank glyph list, which isn't expected to (ever)
change aside from bug fixes.

> The page in question (http://jshin.net/i18n/korean/hunmin.html and
> http://jshin.net/i18n/korean/hunmin_comp.html) specifies font-family
> to be CODE2000 explicitly with CSS. I assume this will make Mozilla
> with Xft enabled ask fontconfig for that font explicitly.

Yes it does.

> As for Pango(gedit), I'm less certain because I don't know whether
> Pango specifies language when sending fonts request down(or up) the
> road.

I don't know either, but fontconfig will pick up the current locale and
convert that to a language if Pango doesn't explicitly set one.

> As I suggested before, a kind of multi-level orthography check may
> be necessary to cope with situations like this. Or, would it be
> possible for users to override manually what fontconfig *detects*
> (both code range coverage and lang) in fonts.conf as suggested in my
> prev. email?

I believe Korean may be unique in this regard; I don't know of other
languages with multiple common character sets which are essentially
independently usable. Japanese has kana and kanji, but there are strong
conventions on which words are spelled in each set.

As per my comment above, I strongly prefer to make the contents of the
cache files independent of the configuration so that multiple
configurations can share the same cache files without difficulty.

Remember that the language name is just a shorthand notation for a
Unicode coverage table; if you want to identify fonts with Hangul
syllables, you can easily build a charset encompassing those and ask
for the font covering the greatest number.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab
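The closing suggestion above (build a charset of the characters you care about and ask for the font covering the greatest number) can be illustrated without fontconfig at all. The Python sketch below models charsets as plain sets of code points. In fontconfig itself this would go through the C API's charset objects and font matching; the helper name `best_covering_font` and the toy coverage data are assumptions made for the example, not real font data.

```python
def best_covering_font(wanted, fonts):
    """Return (family, count) for the font covering most of `wanted`.

    wanted: set of code points the caller cares about
    fonts:  dict mapping family name -> set of code points in that font
    Ties go to whichever family appears first in `fonts`.
    """
    best_name, best_count = None, -1
    for name, chars in fonts.items():
        count = len(wanted & chars)   # how many wanted characters it has
        if count > best_count:
            best_name, best_count = name, count
    return best_name, best_count


# Toy example: ask for the first four precomposed Hangul syllables.
wanted = set(range(0xAC00, 0xAC04))
fonts = {
    "FontA": {0xAC00, 0xAC01},                  # covers 2 of 4
    "FontB": {0xAC00, 0xAC01, 0xAC02, 0x4E00},  # covers 3 of 4
}
choice = best_covering_font(wanted, fonts)
```

Counting the intersection, rather than requiring full coverage, is what lets the caller pick the "best available" font even when no font has every requested character.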