[ 
https://issues.apache.org/jira/browse/PDFBOX-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117728#comment-15117728
 ] 

John Hewson edited comment on PDFBOX-3092 at 1/26/16 6:46 PM:
--------------------------------------------------------------

A cmap table doesn't need to define mappings for all glyphs. Many glyphs have 
no code points, e.g. composite glyphs (such as accents) and contextually 
substituted glyphs (from GSUB).

Arial Unicode is certainly not a broken font! Microsoft are responsible for 
most of the TrueType / OTL spec and it's a flagship font on Windows.

The cmap table in Arial Unicode contains 2496 entries, but FontBox returns 
only about 100, so we're missing about 95% of the cmap entries in PDFBox. That 
is why PDFBox fails to render glyphs that we know exist in both the cmap table 
and the font.
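
For reference, a Format 4 subtable is a list of segments, each covering a 
contiguous range of code points, and every code point in every segment should 
produce an entry. Below is a minimal sketch of that expansion as I understand 
it from the OpenType spec, assuming the arrays have already been read from the 
font. Format4Sketch and decode() are hypothetical names for illustration; this 
is not the FontBox implementation of processSubtype4().

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not FontBox code.
final class Format4Sketch {

    /**
     * Expands Format 4 segments into code point -> glyph ID pairs.
     * Assumes all arrays hold values already read from the font:
     * unsigned 16-bit everywhere except idDelta, which is signed.
     */
    static Map<Integer, Integer> decode(int[] startCode, int[] endCode, int[] idDelta,
                                        int[] idRangeOffset, int[] glyphIdArray) {
        int segCount = endCode.length;
        Map<Integer, Integer> codeToGid = new LinkedHashMap<>();
        for (int i = 0; i < segCount; i++) {
            // Skip the 0xFFFF terminator segment, which maps no real character.
            if (startCode[i] == 0xFFFF && endCode[i] == 0xFFFF) {
                continue;
            }
            for (int c = startCode[i]; c <= endCode[i]; c++) {
                int gid;
                if (idRangeOffset[i] == 0) {
                    // Direct mapping: glyph ID is the code plus delta, modulo 65536.
                    gid = (c + idDelta[i]) & 0xFFFF;
                } else {
                    // idRangeOffset is a byte offset counted from its own slot in
                    // the idRangeOffset array; convert it to a glyphIdArray index.
                    int index = idRangeOffset[i] / 2 + (c - startCode[i]) - (segCount - i);
                    if (index < 0 || index >= glyphIdArray.length) {
                        continue; // malformed segment, ignore this code point
                    }
                    gid = glyphIdArray[index];
                    if (gid != 0) {
                        gid = (gid + idDelta[i]) & 0xFFFF;
                    }
                }
                // Glyph ID 0 is .notdef, i.e. the code point is unmapped.
                if (gid != 0) {
                    codeToGid.put(c, gid);
                }
            }
        }
        return codeToGid;
    }

    public static void main(String[] args) {
        // Made-up segments: 'A'..'C' via idDelta, 'a'..'b' via glyphIdArray.
        Map<Integer, Integer> m = decode(
                new int[] {0x41, 0x61, 0xFFFF},
                new int[] {0x43, 0x62, 0xFFFF},
                new int[] {-55, 0, 1},
                new int[] {0, 4, 0},
                new int[] {20, 0});
        System.out.println(m); // prints {65=10, 66=11, 67=12, 97=20}
    }
}
```

Comparing the size of a map built this way against the number of entries 
FontBox reports for the same subtable is one way to confirm how many mappings 
are being dropped.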

> Format 4 TTF cmap table is parsed incorrectly
> ---------------------------------------------
>
>                 Key: PDFBOX-3092
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3092
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>             Fix For: 2.1.0
>
>
> Certain large Format 4 cmap tables aren't being parsed correctly by 
> CmapSubtable#processSubtype4(), for example in the font "ArialUnicodeMS".
> This results in missing glyphs when rendering the file from PDFBOX-2950, when 
> "ArialUnicodeMS" is used as a substitute. You can force this to happen by 
> changing the following line of PDCIDFontType2:
> {code}
> // find font or substitute
> CIDFontMapping mapping = FontMappers.instance()
>                                     .getCIDFont(getBaseFont(),
>                                                 getFontDescriptor(),
>                                                 getCIDSystemInfo());
> {code}
> Replace the getBaseFont() argument with the string literal "ArialUnicodeMS".


