I have a PDF document that uses (among others) the font CourierNewPS-BoldMT, and some of the text in this font contains a double quote.

When calling PDFont.encode, this results in a NullPointerException due to the following:

1. The font encoding is built using the PDF /Differences array, which
   overwrites the original "quotedbl" at index 34 with an "A". The
   entries for quotedblbase/left/right are left unchanged. As a result,
   the inverted encoding does not contain "quotedbl" as a key.
2. Within encode, the character code 34 gets assigned the name
   "quotedbl", which is then not found in the inverse encoding
   (PDTrueTypeFont.encode -> int code = inverted.get(name))
3. Right before the line that causes the NullPointerException, there
   is a check whether ttf.hasGlyph("quotedbl") (which in this case is
   false) and, if not, whether ttf.hasGlyph("uni0022") (which in this
   case is true). However, the result of this check has no effect on
   the code that follows, which then crashes because
   inverted.get("quotedbl") returns null and that null is assigned to
   an int.
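To make the failure mode concrete, here is a minimal sketch in plain Java that does not use PDFBox at all. The class InvertedLookupSketch, its methods, and the map contents are hypothetical stand-ins for the inverse encoding described above (the codes for the quotedblbase/left/right entries are only illustrative). It shows why assigning the result of a missing map key to an int throws, and what an explicit check might look like instead.

```java
import java.util.HashMap;
import java.util.Map;

public class InvertedLookupSketch {

    // Hypothetical stand-in for the inverse (glyph name -> code) encoding map.
    // Code 34 maps to "A" here, mimicking the /Differences override, so there
    // is no "quotedbl" key at all.
    static Map<String, Integer> buildInverted() {
        Map<String, Integer> inverted = new HashMap<>();
        inverted.put("A", 34);
        inverted.put("quotedblbase", 132);  // illustrative codes only
        inverted.put("quotedblleft", 147);
        inverted.put("quotedblright", 148);
        return inverted;
    }

    // Mirrors the pattern "int code = inverted.get(name)": a missing key
    // yields null, and auto-unboxing null throws NullPointerException.
    static int unsafeLookup(Map<String, Integer> inverted, String name) {
        int code = inverted.get(name);
        return code;
    }

    // One possible guard (an assumption, not the PDFBox implementation):
    // keep the result boxed, test for null, and fail with a clear message.
    static int checkedLookup(Map<String, Integer> inverted, String name) {
        Integer code = inverted.get(name);
        if (code == null) {
            throw new IllegalArgumentException(
                    "No code for glyph name '" + name + "' in this encoding");
        }
        return code;
    }

    public static void main(String[] args) {
        Map<String, Integer> inverted = buildInverted();
        try {
            unsafeLookup(inverted, "quotedbl");
        } catch (NullPointerException e) {
            System.out.println("unsafe lookup: NullPointerException");
        }
        try {
            checkedLookup(inverted, "quotedbl");
        } catch (IllegalArgumentException e) {
            System.out.println("checked lookup: " + e.getMessage());
        }
    }
}
```

The guard only turns the crash into a descriptive error; whether encode should instead map the character to a different code is the open question.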

I believe this is a bug in PDFBox, but I have no idea whether the handling within encode should be changed (maybe by adding an "else" branch for the case where ttf.hasGlyph("quotedbl") is false), whether code 34 should be assigned to quotedblbase in the first place, or whether something else entirely is needed.

In any case, I'd of course be eager to learn about ways to circumvent this situation as a PDFBox user.

Juergen
