[ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
Hello,

http://bugs.winehq.org/show_bug.cgi?id=9840 has a ttf font attached
to it which can be perfectly displayed in Windows, but Wine is not able
to actually show any character using this font; only 'c' is displayed.

That's because Freetype selects the first Unicode cmap table, which happens
to have platform id 0 (Apple Unicode), and that cmap table is incomplete
(or truncated). The cmap tables for the other platforms (1 and 3) are good,
and making Freetype ignore the cmap with platform id 0 makes the font
display properly in Wine's notepad.

Attached is the hack that makes Freetype ignore cmaps with platform id 0.

What do Freetype developers think about this problem?

--
Dmitry.

diff -uprN freetype2/src/sfnt/ttcmap.c freetype2/src/sfnt/ttcmap.c
--- freetype2/src/sfnt/ttcmap.c 2007-06-13 17:05:55.0 +0900
+++ freetype2/src/sfnt/ttcmap.c 2007-09-30 20:45:39.0 +0900
@@ -2284,6 +2284,20 @@
       charmap.encoding= FT_ENCODING_NONE; /* will be filled later */
       offset = TT_NEXT_ULONG( p );

+      FT_TRACE2(( "found cmap platform_id %u, encoding_id %u\n",
+                  charmap.platform_id, charmap.encoding_id ));
+
+      /* cmap tables with platform_id == Apple Unicode sometimes are
+       * incomplete in comparison to other tables.
+       * FIXME: perhaps fallback to this table if no other table exists.
+       */
+      if ( charmap.platform_id == 0 ) /* Apple Unicode */
+      {
+        FT_TRACE2(( "ignoring Apple Unicode encoding\n" ));
+        continue;
+      }
+
+
       if ( offset && offset <= face->cmap_size - 2 )
       {
         FT_Byte* volatile cmap = table + offset;

___
Freetype-devel mailing list
Freetype-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/freetype-devel
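For readers unfamiliar with the loop being patched: it walks the encoding records in the cmap table header, each of which holds a platform ID, an encoding ID, and a subtable offset. The following is a minimal standalone sketch of that walk, not FreeType's code. The names `CmapRecord`, `parse_cmap_records`, and `sample_cmap_header` are hypothetical, and the sample bytes are synthetic (they only mimic a font with platform IDs 0, 1, and 3; they are not taken from the font in the bug report).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One encoding record from the cmap table header (hypothetical type). */
typedef struct {
    uint16_t platform_id;
    uint16_t encoding_id;
    uint32_t offset;   /* subtable offset from the start of the cmap table */
} CmapRecord;

/* TrueType data is stored big-endian. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse up to max_out encoding records from a cmap table.
 * Returns the number of records written, or -1 if the table is too
 * short for its declared record count.
 * Hypothetical helper; not part of FreeType. */
int parse_cmap_records(const uint8_t *table, size_t size,
                       CmapRecord *out, int max_out)
{
    assert(table != NULL && out != NULL);

    if (size < 4)
        return -1;

    /* Bytes 0-1 hold the table version, bytes 2-3 the record count. */
    int num_tables = be16(table + 2);
    if (size < 4 + (size_t)num_tables * 8)
        return -1;

    int n = num_tables < max_out ? num_tables : max_out;
    for (int i = 0; i < n; i++) {
        const uint8_t *p = table + 4 + (size_t)i * 8;
        out[i].platform_id = be16(p);
        out[i].encoding_id = be16(p + 2);
        out[i].offset      = be32(p + 4);
    }
    return n;
}

/* Synthetic header: version 0, three records. Offsets are made up. */
const uint8_t sample_cmap_header[] = {
    0x00, 0x00,  0x00, 0x03,
    0x00, 0x00,  0x00, 0x00,  0x00, 0x00, 0x00, 0x1c,  /* platform 0 */
    0x00, 0x01,  0x00, 0x00,  0x00, 0x00, 0x00, 0x96,  /* platform 1 */
    0x00, 0x03,  0x00, 0x01,  0x00, 0x00, 0x01, 0x9c,  /* platform 3 */
};
```

Listing the records this way makes it easy to see which (platform, encoding) pairs a font offers before deciding which one to use.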
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
Hi,

On Sun, 30 Sep 2007 21:13:00 +0900
"Dmitry Timoshkov" <[EMAIL PROTECTED]> wrote:
>http://bugs.winehq.org/show_bug.cgi?id=9840 has a ttf font attached
>to it which can be perfectly displayed in Windows, but Wine is not able
>to actually show any character using this font; only 'c' is displayed.

Thank you for providing the sample font. It seems that the first cmap,
(platformID, platformSpecificID) = (0, 0) = (Unicode, Default Semantics),
is not broken, but designed to ignore most of the glyphs. Its contents
are as follows:

[cmap]format = 4, length=0x007a, languageID=0x(unknown)
startCode=0x0022, endCode=0x0022, idDelta=0x0036, idRangeOffset=0x
startCode=0x0026, endCode=0x0027, idDelta=0x, idRangeOffset=0x0012
startCode=0x002c, endCode=0x002c, idDelta=0x0025, idRangeOffset=0x
startCode=0x002e, endCode=0x002f, idDelta=0x, idRangeOffset=0x0012
startCode=0x003b, endCode=0x003c, idDelta=0x, idRangeOffset=0x0014
startCode=0x003e, endCode=0x003e, idDelta=0x0021, idRangeOffset=0x
startCode=0x005b, endCode=0x005d, idDelta=0x, idRangeOffset=0x0014
startCode=0x0060, endCode=0x0060, idDelta=0xfff5, idRangeOffset=0x
startCode=0x007b, endCode=0x007e, idDelta=0x, idRangeOffset=0x0016
startCode=0x, endCode=0x, idDelta=0x0001, idRangeOffset=0x

subtable0 cmap fmt4 : code 0x0022 -> gid 0x0058(88) glyfLength=22
subtable0 cmap fmt4 : code 0x0026 -> gid 0x0053(83 = 83 + 0) glyfLength=0
subtable0 cmap fmt4 : code 0x0027 -> gid 0x0052(82 = 82 + 0) glyfLength=18
subtable0 cmap fmt4 : code 0x002c -> gid 0x0051(81) glyfLength=20
subtable0 cmap fmt4 : code 0x002e -> gid 0x0054(84 = 84 + 0) glyfLength=14
subtable0 cmap fmt4 : code 0x002f -> gid 0x0056(86 = 86 + 0) glyfLength=16
subtable0 cmap fmt4 : code 0x003b -> gid 0x0061(97 = 97 + 0) glyfLength=24
subtable0 cmap fmt4 : code 0x003c -> gid 0x0060(96 = 96 + 0) glyfLength=22
subtable0 cmap fmt4 : code 0x003e -> gid 0x005f(95) glyfLength=22
subtable0 cmap fmt4 : code 0x005b -> gid 0x0059(89 = 89 + 0) glyfLength=20
subtable0 cmap fmt4 : code 0x005c -> gid 0x0057(87 = 87 + 0) glyfLength=16
subtable0 cmap fmt4 : code 0x005d -> gid 0x005a(90 = 90 + 0) glyfLength=20
subtable0 cmap fmt4 : code 0x0060 -> gid 0x0055(85) glyfLength=14
subtable0 cmap fmt4 : code 0x007b -> gid 0x005e(94 = 94 + 0) glyfLength=52
subtable0 cmap fmt4 : code 0x007c -> gid 0x005d(93 = 93 + 0) glyfLength=14
subtable0 cmap fmt4 : code 0x007d -> gid 0x005c(92 = 92 + 0) glyfLength=52
subtable0 cmap fmt4 : code 0x007e -> gid 0x005b(91 = 91 + 0) glyfLength=0
subtable0 cmap fmt4 : code 0x -> gid 0x(0) glyfLength=0

In HUDfont.ttf, most alphabetic glyphs have glyphIndex < 81, so this
cmap cannot reach them. One problem is that the cmap is NOT broken from
the viewpoint of the data structure, so a simple validator (one that
does not compare the set of accessible glyphs with the set of included
glyphs) cannot reject it.

Your patch assumes that the Apple Unicode cmap is often broken while
the others are more reliable, but I'm afraid this assumption does not
hold in general.

>That's because Freetype selects the first Unicode cmap table, which happens
>to have platform id 0 (Apple Unicode), and that cmap table is incomplete
>(or truncated). The cmap tables for the other platforms (1 and 3) are good,
>and making Freetype ignore the cmap with platform id 0 makes the font
>display properly in Wine's notepad.
>
>Attached is the hack that makes Freetype ignore cmaps with platform id 0.
>
>What do Freetype developers think about this problem?

Excuse me, but do you think selecting the best cmap is FreeType's
role? I think FreeType2 provides an API that lets users select a cmap
subtable by the pair of platformID and platform-specific ID.

http://www.freetype.org/freetype2/docs/reference/ft2-base_interface.html

I think it's better for Wine to keep its own priority list of
"preferred" cmaps and try to load them from best to worst.

For example, consider UCS-4 capable fonts (like SURSONG.TTF or
SIMSUN-EXTB.TTF). Such fonts have cmap subtables for both Microsoft
UCS2 and Microsoft UCS4. Usually the Microsoft UCS4 cmap subtable
appears after the MS UCS2 cmap subtable. So, if we let FreeType choose
the cmap subtable automatically, we cannot reach the Microsoft UCS4
cmap subtable, even if we ignore Apple Unicode. From the viewpoint of
compatibility with Microsoft products, that is not a good idea.

So, I think it's better for Wine to keep its own priority list of
"preferred" cmap subtables and choose the best cmap subtable itself.
What do you think?

Regards,
mpsuzuki
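To make the dump above easier to follow, here is a small standalone sketch of how a format 4 segment maps a character code to a glyph index when only idDelta is used: the glyph index is (code + idDelta) modulo 65536. It assumes the four segments whose idRangeOffset is truncated in the dump have an idRangeOffset of 0; the arithmetic agrees with the glyph IDs listed, but that remains an assumption. `DeltaSegment`, `fmt4_delta_lookup`, and `hudfont_delta_segments` are hypothetical names, and segments that go through the glyph ID array (non-zero idRangeOffset) are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A format 4 segment that maps codes by idDelta alone
 * (assumed idRangeOffset == 0).  Hypothetical type. */
typedef struct {
    uint16_t start_code;
    uint16_t end_code;
    uint16_t id_delta;
} DeltaSegment;

/* Return the glyph index for `code`, or 0 (the missing glyph) when no
 * segment covers it.  Only delta-only segments are handled here. */
uint16_t fmt4_delta_lookup(const DeltaSegment *segs, int count,
                           uint16_t code)
{
    assert(segs != NULL || count == 0);

    for (int i = 0; i < count; i++) {
        if (code >= segs[i].start_code && code <= segs[i].end_code) {
            /* The sum wraps modulo 65536, so idDelta 0xfff5 acts as -11. */
            return (uint16_t)(code + segs[i].id_delta);
        }
    }
    return 0;
}

/* The delta-only segments as they appear in the dump. */
const DeltaSegment hudfont_delta_segments[] = {
    { 0x0022, 0x0022, 0x0036 },
    { 0x002c, 0x002c, 0x0025 },
    { 0x003e, 0x003e, 0x0021 },
    { 0x0060, 0x0060, 0xfff5 },
};
```

With these segments, code 0x0060 yields 0x0060 + 0xfff5 = 0x10055, which wraps to glyph 0x0055, matching the dump. A letter such as 'A' (0x0041) falls in no segment and therefore maps to glyph 0, which is why this subtable cannot reach the alphabetic glyphs.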
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
<[EMAIL PROTECTED]> wrote:

> Your patch assumes that the Apple Unicode cmap is often broken while
> the others are more reliable, but I'm afraid this assumption does not
> hold in general.

That was really a hack to show that other cmap tables actually work better
for that font. Of course, other fonts can have "broken" cmap tables for
platforms other than 0.

> Excuse me, but do you think selecting the best cmap is FreeType's
> role? I think FreeType2 provides an API that lets users select a cmap
> subtable by the pair of platformID and platform-specific ID.
>
> http://www.freetype.org/freetype2/docs/reference/ft2-base_interface.html
>
> I think it's better for Wine to keep its own priority list of
> "preferred" cmaps and try to load them from best to worst.
>
> For example, consider UCS-4 capable fonts (like SURSONG.TTF or
> SIMSUN-EXTB.TTF). Such fonts have cmap subtables for both Microsoft
> UCS2 and Microsoft UCS4. Usually the Microsoft UCS4 cmap subtable
> appears after the MS UCS2 cmap subtable. So, if we let FreeType choose
> the cmap subtable automatically, we cannot reach the Microsoft UCS4
> cmap subtable, even if we ignore Apple Unicode. From the viewpoint of
> compatibility with Microsoft products, that is not a good idea.
>
> So, I think it's better for Wine to keep its own priority list of
> "preferred" cmap subtables and choose the best cmap subtable itself.
> What do you think?

Wine uses the FT_Select_Charmap API to select either FT_ENCODING_UNICODE,
FT_ENCODING_MS_SYMBOL or FT_ENCODING_APPLE_ROMAN, as appropriate.
So it is actually Freetype's responsibility to choose the best/correct/
working charmap table in that case.

Yes, Wine can arrange some kind of cmap priorities, but what if one of the
"preferred" cmap tables is broken? How can an application decide which cmap
table is better without digging into the internal cmap data? Shouldn't it be
Freetype's responsibility to ignore incomplete/broken cmaps, especially since
it already parses the cmap tables and can easily decide which one is better?

--
Dmitry.
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
Dear Sir,

On Mon, 1 Oct 2007 11:52:19 +0900
"Dmitry Timoshkov" <[EMAIL PROTECTED]> wrote:
><[EMAIL PROTECTED]> wrote:
>> So, I think it's better for Wine to keep its own priority list of
>> "preferred" cmap subtables and choose the best cmap subtable itself.
>> What do you think?
>
>Wine uses the FT_Select_Charmap API to select either FT_ENCODING_UNICODE,
>FT_ENCODING_MS_SYMBOL or FT_ENCODING_APPLE_ROMAN, as appropriate.
>So it is actually Freetype's responsibility to choose the best/correct/
>working charmap table in that case.

I see, I had misunderstood. Now I think your report says that
FT_ENCODING_UNICODE is too coarse to choose the best cmap subtable
for Unicode encoding. I guess that if you could explicitly specify
the Microsoft platform and the UCS2 or UCS4 encoding cmap subtable,
it would serve your purpose. Am I understanding correctly?
I think it's a reasonable request.

>Yes, Wine can arrange some kind of cmap priorities, but what if one of the
>"preferred" cmap tables is broken? How can an application decide which cmap
>table is better without digging into the internal cmap data? Shouldn't it be
>Freetype's responsibility to ignore incomplete/broken cmaps, especially since
>it already parses the cmap tables and can easily decide which one is better?

As I showed in my previous post, the Apple Unicode cmap in the
sample font is NOT broken from the viewpoint of data
structure, I think. To detect a "broken" cmap as you describe,
it's necessary to examine the contents of the cmap, loca, and glyf
tables semantically. Only after checking all of the cmap, loca,
and glyf tables could the best cmap subtable be chosen.
I'm not denying that such a feature would be convenient,
but I doubt whether it should be a built-in feature of
FreeType and enabled by default.

Regards,
mpsuzuki
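To illustrate what such a semantic check might involve, here is a rough standalone sketch that scores each cmap subtable by the number of character codes that reach a real glyph (a non-zero glyph index within the font's glyph count) and picks the highest score. This is only one possible heuristic and not anything FreeType does; whether a higher count actually makes a subtable "better" is exactly the point under discussion in this thread. `CodeMapping`, `Subtable`, `count_reachable_codes`, and `pick_widest_subtable` are hypothetical names, and the mappings are assumed to have been extracted from the font already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded cmap entry: character code -> glyph index. */
typedef struct {
    uint32_t code;
    uint16_t glyph_index;
} CodeMapping;

/* All decoded entries of one cmap subtable. */
typedef struct {
    const CodeMapping *entries;
    size_t             count;
} Subtable;

/* Count the codes that map to a usable glyph: the index must be
 * non-zero (0 is the missing glyph) and below the glyph count. */
size_t count_reachable_codes(const CodeMapping *map, size_t n,
                             uint16_t num_glyphs)
{
    assert(map != NULL || n == 0);

    size_t reachable = 0;
    for (size_t i = 0; i < n; i++) {
        if (map[i].glyph_index != 0 && map[i].glyph_index < num_glyphs)
            reachable++;
    }
    return reachable;
}

/* Return the index of the subtable with the most reachable codes.
 * Ties keep the earlier subtable; returns -1 if there are none. */
int pick_widest_subtable(const Subtable *tables, int num_tables,
                         uint16_t num_glyphs)
{
    assert(tables != NULL || num_tables == 0);

    int    best       = -1;
    size_t best_count = 0;

    for (int i = 0; i < num_tables; i++) {
        size_t c = count_reachable_codes(tables[i].entries,
                                         tables[i].count, num_glyphs);
        if (best < 0 || c > best_count) {
            best       = i;
            best_count = c;
        }
    }
    return best;
}
```

Even this simple version has to decode every subtable in full, which gives a sense of the cost of enabling such a check by default.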
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
On Sun, 2007-09-30 at 20:16, [EMAIL PROTECTED] wrote:
> As I showed in my previous post, the Apple Unicode cmap in the
> sample font is NOT broken from the viewpoint of data
> structure, I think. To detect a "broken" cmap as you describe,
> it's necessary to examine the contents of the cmap, loca, and glyf
> tables semantically. Only after checking all of the cmap, loca,
> and glyf tables could the best cmap subtable be chosen.
> I'm not denying that such a feature would be convenient,
> but I doubt whether it should be a built-in feature of
> FreeType and enabled by default.

I'm not sure how you could even do it. It is perfectly valid to have
unencoded glyphs, desirable even. How on earth could you determine that
one subtable was "better"?

In ttc files many glyphs won't be encoded for a given font, nor will
they be the result of GSUB transformations.

I can't think of an algorithm that could produce reasonable results.
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
"George Williams" <[EMAIL PROTECTED]> wrote:

> On Sun, 2007-09-30 at 20:16, [EMAIL PROTECTED] wrote:
>> As I showed in my previous post, the Apple Unicode cmap in the
>> sample font is NOT broken from the viewpoint of data
>> structure, I think. To detect a "broken" cmap as you describe,
>> it's necessary to examine the contents of the cmap, loca, and glyf
>> tables semantically. Only after checking all of the cmap, loca,
>> and glyf tables could the best cmap subtable be chosen.
>> I'm not denying that such a feature would be convenient,
>> but I doubt whether it should be a built-in feature of
>> FreeType and enabled by default.
>
> I'm not sure how you could even do it. It is perfectly valid to have
> unencoded glyphs, desirable even. How on earth could you determine that
> one subtable was "better"?

http://www.freetype.org/freetype2/docs/reference/ft2-base_interface.html#FT_Select_Charmap
says:

"Because many fonts contain more than a single cmap for Unicode encoding,
this function has some special code to select the one which covers Unicode
best."

So it looks like either the documentation should be changed so that it no
longer claims that FT_Select_Charmap makes the best choice, or
FT_Select_Charmap's behaviour should be changed to actually "select the one
which covers Unicode best".

--
Dmitry.
Re: [ft-devel] Incomplete cmap table for platform 0 (Apple Unicode)
Dear Dmitry,

The following is a function whose API is similar to FT_Select_Charmap()
but which ignores non-Microsoft cmap subtables. Does it serve your purpose?

#include <ft2build.h>
#include FT_FREETYPE_H
#include FT_TRUETYPE_IDS_H

/* Select the first cmap subtable for the Microsoft platform that
 * matches the specified encoding.  If encoding is FT_ENCODING_UNICODE,
 * UCS4 is preferred over UCS2.
 */
FT_Error
FT_Select_Charmap_Microsoft( FT_Face      face,
                             FT_Encoding  encoding )
{
  FT_Int  i, chosen_cmap_idx;


  chosen_cmap_idx = -1; /* -1 means not found */
  for ( i = 0; i < face->num_charmaps; i++ )
  {
    FT_CharMap  cmap = face->charmaps[i];


    if ( cmap->platform_id != TT_PLATFORM_MICROSOFT )
      continue;

    if ( encoding != FT_ENCODING_UNICODE )
    {
      /* non-Unicode request: take the first exact match only, */
      /* and never fall through to the Unicode subtables       */
      if ( cmap->encoding == encoding )
      {
        chosen_cmap_idx = i;
        break;
      }
    }
    else if ( cmap->encoding_id == TT_MS_ID_UCS_4 )
    {
      /* UCS4 wins immediately */
      chosen_cmap_idx = i;
      break;
    }
    else if ( cmap->encoding_id == TT_MS_ID_UNICODE_CS )
    {
      /* remember the first UCS2 subtable, but keep looking for UCS4 */
      if ( chosen_cmap_idx < 0 )
        chosen_cmap_idx = i;
    }
  }

  if ( chosen_cmap_idx < 0 )
    return FT_Err_Invalid_CharMap_Handle;

  return FT_Set_Charmap( face, face->charmaps[chosen_cmap_idx] );
}