[ 
https://issues.apache.org/jira/browse/PDFBOX-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122163#comment-15122163
 ] 

Andreas Lehmkühler commented on PDFBOX-2721:
--------------------------------------------

I guess there is a misunderstanding.

{quote}
PDF Reference says:
beginnotdefchar, endnotdefchar, beginnotdefrange, and endnotdefrange
define notdef mappings from character codes to CIDs. As described in the
section “Handling Undefined Characters” on page 355, a notdef mapping is
used if the normal mapping produces a CID for which no glyph is present in
the associated CIDFont.
{quote}
This section is about embedded CMaps, which are something different from 
ToUnicode CMaps.

These are the rules for ToUnicode CMaps:
{quote}
The CMap file shall contain begincodespacerange and endcodespacerange
operators that are consistent with the encoding that the font uses. In
particular, for a simple font, the codespace shall be one byte long.
It shall use the beginbfchar, endbfchar, beginbfrange, and endbfrange
operators to define the mapping from character codes to Unicode character
sequences expressed in UTF-16BE encoding.
{quote}
The given CMap satisfies the first rule (it has a codespace range), but 
instead of the bfchar/bfrange mappings required by the second rule it contains 
a CID mapping, which isn't used within ToUnicode CMaps.
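
To make the two rules concrete, here is a minimal Python sketch. It is not how PDFBox parses CMaps; the function names and both sample CMap bodies are made up for illustration, and the check is a simple text search rather than a real CMap parser. It tests for a codespace range and for at least one bfchar/bfrange block, and decodes bfchar destinations as UTF-16BE.

```python
import re

# Hypothetical CMap bodies, written for illustration only
# (not taken from the attached PDF).
VALID_TOUNICODE = """
1 begincodespacerange
<00> <FF>
endcodespacerange
2 beginbfchar
<41> <0041>
<42> <00E9>
endbfchar
"""

NOTDEF_ONLY = """
1 begincodespacerange
<00> <FF>
endcodespacerange
1 beginnotdefrange
<00> <FF> 0
endnotdefrange
"""


def has_operator(cmap_text, name):
    """True if the text contains a 'begin<name>' operator."""
    return re.search(r"\bbegin" + name + r"\b", cmap_text) is not None


def check_tounicode(cmap_text):
    """Return a list of rule violations; an empty list means both rules hold."""
    problems = []
    if not has_operator(cmap_text, "codespacerange"):
        problems.append("missing codespacerange")
    if not (has_operator(cmap_text, "bfchar") or has_operator(cmap_text, "bfrange")):
        problems.append("no bfchar/bfrange mappings")
    return problems


def decode_bfchar(cmap_text):
    """Map each source code to its Unicode string (destinations are UTF-16BE hex)."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping
```

With these samples, `check_tounicode(NOTDEF_ONLY)` reports the missing bfchar/bfrange mappings, which is the same kind of condition the warning in this issue points at, while `check_tounicode(VALID_TOUNICODE)` returns an empty list.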

The rendering works fine because the PDF consists of images. The text in 
question is invisible (text rendering mode 3), and even if it were visible, 
the ToUnicode CMap isn't used for rendering.

> Invalid ToUnicode CMap in font 
> -------------------------------
>
>                 Key: PDFBOX-2721
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2721
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.0
>            Reporter: Juraj Lonc
>         Attachments: cmap_beginnotdefrange.pdf
>
>
> Attached PDF file works fine in Adobe Reader, but PDFBox logs warnings:
> 2015-03-20 15:48:57,573 WARN  [org.apache.pdfbox.pdmodel.font.PDFont] 
> (http-0.0.0.0-8080-7) Invalid ToUnicode CMap in font HPDFAA+Thoth-Unicode
> It seems that you require "beginbfchar" or "beginbfrange" in CMap. But should 
> it be required?
> CMap definition contains "beginnotdefrange" and this is ignored in PDFBox.
> PDF Reference says:
> beginnotdefchar, endnotdefchar, beginnotdefrange, and endnotdefrange
> define notdef mappings from character codes to CIDs. As described in the
> section “Handling Undefined Characters” on page 355, a notdef mapping is
> used if the normal mapping produces a CID for which no glyph is present in
> the associated CIDFont.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
