Hi Juergen,

Thanks for letting us know about this. The NullPointerException certainly sounds like a PDFBox bug. Please open an issue on JIRA (https://issues.apache.org/jira/browse/PDFBOX/) and upload the problem PDF (via More > Attach Files).

Thanks,

-- John

> On 24 Aug 2015, at 11:11, Jürgen Uhl <[email protected]> wrote:
>
> I have a PDF document using (among others) the font CourierNewPS-BoldMT,
> and text in this font containing a double quote.
>
> Calling PDFont.encode results in a NullPointerException, for the
> following reason:
>
> The font encoding is built using the PDF /Differences array, which
> overwrites the original "quotedbl" at index 34 with an "A". The entries for
> quotedblbase/left/right are left unchanged. As a result, the inverted font
> encoding does not contain "quotedbl" as a key.
>
> Within encode, the character code 34 gets assigned the name "quotedbl", which
> is then not found in the inverse encoding (PDTrueTypeFont.encode -> int code
> = inverted.get(name)).
>
> Right before the line that causes the NullPointerException, there is a
> check whether ttf.hasGlyph("quotedbl") (which in this case is false) and, if
> not, whether ttf.hasGlyph("uni0022") (which in this case is true). However,
> this has no effect on the code that follows, which then crashes
> because inverted.get("quotedbl") is null (and is assigned to an int).
>
> I believe this is a bug in PDFBox, but I have no idea whether the handling
> within encode should be changed (maybe using the "else" part in case
> ttf.hasGlyph("quotedbl") is false), or whether code 34 should be assigned to
> quotedblbase in the first place, or even something else.
>
> In any case, I'd of course be eager to learn about ways to circumvent this
> situation as a PDFBox user.
>
> Juergen
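
The failure mode in the report above can be shown without PDFBox. The following is only a minimal sketch in plain Java, assuming the inverted encoding behaves like a Map<String, Integer> and that the /Differences override simply replaces the name stored at code 34. The class and method names (InvertedEncodingDemo, invert, unsafeLookup, safeLookup) are hypothetical and are not PDFBox API, and the null check shown is one possible guard, not the actual fix.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the encoding logic; not PDFBox code.
final class InvertedEncodingDemo {

    // Code -> glyph name table after a /Differences override (assumed shape).
    static Map<Integer, String> encodingWithDifferences() {
        Map<Integer, String> codeToName = new HashMap<>();
        codeToName.put(34, "quotedbl"); // base encoding entry
        codeToName.put(65, "A");
        codeToName.put(34, "A");        // /Differences replaces "quotedbl" at 34
        return codeToName;
    }

    // Build the name -> code map; "quotedbl" no longer appears as a key.
    static Map<String, Integer> invert(Map<Integer, String> codeToName) {
        Map<String, Integer> inverted = new HashMap<>();
        for (Map.Entry<Integer, String> e : codeToName.entrySet()) {
            inverted.put(e.getValue(), e.getKey());
        }
        return inverted;
    }

    // Mirrors "int code = inverted.get(name)": auto-unboxing null throws
    // NullPointerException when the name is missing.
    static int unsafeLookup(Map<String, Integer> inverted, String name) {
        int code = inverted.get(name);
        return code;
    }

    // One possible guard (an assumption): check for null before unboxing
    // and report the missing glyph name explicitly.
    static int safeLookup(Map<String, Integer> inverted, String name) {
        Integer code = inverted.get(name);
        if (code == null) {
            throw new IllegalArgumentException(
                "No code for glyph name '" + name + "' in this font's encoding");
        }
        return code;
    }

    public static void main(String[] args) {
        Map<String, Integer> inverted = invert(encodingWithDifferences());
        System.out.println("contains quotedbl: " + inverted.containsKey("quotedbl"));
        try {
            unsafeLookup(inverted, "quotedbl");
        } catch (NullPointerException e) {
            System.out.println("unsafe lookup threw NullPointerException");
        }
        try {
            safeLookup(inverted, "quotedbl");
        } catch (IllegalArgumentException e) {
            System.out.println("safe lookup: " + e.getMessage());
        }
    }
}
```

The guarded lookup only turns the crash into a descriptive error. Whether code 34 should instead fall back to another glyph name such as "uni0022" is the open question raised in the report.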

