[
https://issues.apache.org/jira/browse/PDFBOX-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117728#comment-15117728
]
John Hewson edited comment on PDFBOX-3092 at 1/26/16 6:51 PM:
--------------------------------------------------------------
A cmap table doesn't need to define mappings for all glyphs. Many glyphs have
no code points, e.g. composite glyphs (such as accents) and contextually
substituted glyphs (from GSUB). Arial Unicode is certainly not a broken font!
Microsoft are responsible for most of the TrueType / OTL spec and it's a
flagship font on Windows.
The cmap table in Arial Unicode contains 2496 entries corresponding to 38916
code point to glyph mappings, but FontBox returns only approx. 100 or so. We're
missing most of the cmap entries in FontBox, which is why PDFBox fails to
render glyphs that we know exist in the cmap table and the font. I suspect this
is because there are over thirty-eight thousand mappings, far more than FontBox
normally has to handle.
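For reference, here is a minimal sketch of how a Format 4 subtable expands its
segments into individual code point to glyph mappings, which is why a few
thousand entries can yield tens of thousands of mappings. This is not the
FontBox implementation: the class name Format4Sketch, the decode() and
sample() helpers, and the tiny hand-built table are all hypothetical, and the
layout is assumed to follow the OpenType cmap Format 4 description.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not FontBox's CmapSubtable code.
final class Format4Sketch {

    // Reads an unsigned 16-bit big-endian value at an absolute offset.
    static int u16(ByteBuffer buf, int pos) {
        return buf.getShort(pos) & 0xFFFF;
    }

    // Decodes a Format 4 subtable whose 'format' field is at buf.position().
    static Map<Integer, Integer> decode(ByteBuffer buf) {
        int base = buf.position();
        if (u16(buf, base) != 4) {
            throw new IllegalArgumentException("not a format 4 subtable");
        }
        int segCount = u16(buf, base + 6) / 2;           // segCountX2 / 2
        int endCodes = base + 14;                        // after the 14-byte header
        int startCodes = endCodes + 2 * segCount + 2;    // skip reservedPad
        int idDeltas = startCodes + 2 * segCount;
        int idRangeOffsets = idDeltas + 2 * segCount;

        Map<Integer, Integer> codeToGid = new LinkedHashMap<>();
        for (int i = 0; i < segCount; i++) {
            int start = u16(buf, startCodes + 2 * i);
            int end = u16(buf, endCodes + 2 * i);
            int delta = u16(buf, idDeltas + 2 * i);      // arithmetic is modulo 65536
            int offsetPos = idRangeOffsets + 2 * i;
            int rangeOffset = u16(buf, offsetPos);

            // Every code point in [start, end] is a separate mapping.
            for (int code = start; code <= end; code++) {
                int gid;
                if (rangeOffset == 0) {
                    gid = (code + delta) & 0xFFFF;
                } else {
                    // Offset is relative to the idRangeOffset entry itself.
                    int raw = u16(buf, offsetPos + rangeOffset + 2 * (code - start));
                    gid = raw == 0 ? 0 : (raw + delta) & 0xFFFF;
                }
                if (gid != 0) {                          // glyph 0 means "missing"
                    codeToGid.put(code, gid);
                }
            }
        }
        return codeToGid;
    }

    // Hand-built table: one idDelta segment, one glyphIdArray segment, and
    // the required 0xFFFF terminator segment.
    static ByteBuffer sample() {
        int[] words = {
            4, 44, 0, 6, 4, 1, 2,          // format, length, language, segCountX2, search fields
            0x43, 0x62, 0xFFFF,            // endCode
            0,                             // reservedPad
            0x41, 0x61, 0xFFFF,            // startCode
            0xFFC0, 0, 1,                  // idDelta (0xFFC0 is -64)
            0, 4, 0,                       // idRangeOffset
            10, 20                         // glyphIdArray
        };
        ByteBuffer buf = ByteBuffer.allocate(words.length * 2);
        for (int w : words) {
            buf.putShort((short) w);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = decode(sample());
        System.out.println(map.size() + " mappings from 3 segments");
    }
}
```

Counting the size of the returned map for a real font and comparing it with
the expected number of mappings would be one way to confirm how many entries
a parser is dropping.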
> Format 4 TTF cmap table is parsed incorrectly
> ---------------------------------------------
>
> Key: PDFBOX-3092
> URL: https://issues.apache.org/jira/browse/PDFBOX-3092
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: John Hewson
> Fix For: 2.1.0
>
>
> Certain large Format 4 cmap tables aren't being parsed correctly by
> CmapSubtable#processSubtype4(), for example in the font "ArialUnicodeMS".
> This results in missing glyphs when rendering the file from PDFBOX-2950, when
> "ArialUnicodeMS" is used as a substitute. You can force this to happen by
> changing the following line of PDCIDFontType2:
> {code}
> // find font or substitute
> CIDFontMapping mapping = FontMappers.instance()
>         .getCIDFont(getBaseFont(),
>                     getFontDescriptor(),
>                     getCIDSystemInfo());
> {code}
> Replace getBaseFont() with the string literal "ArialUnicodeMS".