[ https://issues.apache.org/jira/browse/PDFBOX-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-3004.
-----------------------------------
    Resolution: Not A Problem

Ok, I'm closing this issue for the reasons mentioned. Please ask any "how to"
questions (e.g. if you want to use the 2.0 version) on the user mailing list
and we'll try to help you.
https://mail-archives.apache.org/mod_mbox/pdfbox-users/

> PDF fulltext index fails.
> -------------------------
>
>                 Key: PDFBOX-3004
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3004
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Arkady Zalkowitsch
>         Attachments: Tika-Extract-Error.png, Tika-Meta.png,
> not_found-2.0.txt, not_found.pdf, tika-out.txt
>
>
> PDF fulltext index fails when the font dictionary contains one entry for
> the font Helvetica and an entry for Encoding whose value does not represent
> a font at all.
> The object in the PDF looks like this:
> {code}
> obj = {
>   "/Fields": [ 12 0 R ],
>   "/DA": "/Helvetica 0 Tf 0 g",
>   "/DR": {
>     "/Font": {
>       "/Helvetica": "11 0 R",
>       "/Encoding": {
>         "/PDFDocEncoding": "10 0 R"
>       }
>     }
>   },
>   "/NeedAppearances": true
> }
> {code}
> PDFBox tries to parse that "font" called Encoding and fails, but
> PDResources.getFonts() only logs the resulting exception:
> {code}
> try {
>     newFont = PDFontFactory.createFont( (COSDictionary)font );
> } catch (IOException exception) {
>     LOG.error("error while creating a font", exception);
> }
> {code}
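
For illustration only, here is a rough sketch of how a caller might skip entries in a /Font resource dictionary that are not fonts, such as the Encoding entry in the quoted report. It is not PDFBox code: the dictionary is modelled with plain java.util maps, the class FontEntryFilter and its methods are hypothetical names, and the contents assumed for object 11 0 R (/Type /Font, /Subtype /Type1, /BaseFont /Helvetica) are a guess, since the report does not show that object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of PDFBox: models resolved PDF dictionaries as maps. */
final class FontEntryFilter {

    /** Returns the names of entries whose value is a dictionary with /Type /Font. */
    public static List<String> fontNames(Map<String, Object> fontResources) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Object> entry : fontResources.entrySet()) {
            Object value = entry.getValue();
            // Anything that is not a dictionary, or lacks /Type /Font, is skipped.
            if (value instanceof Map && "Font".equals(((Map<?, ?>) value).get("Type"))) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    /** Builds a /Font dictionary shaped like the one in the report, with references resolved. */
    public static Map<String, Object> sampleResources() {
        // Assumed contents of "11 0 R"; the report does not show this object.
        Map<String, Object> helvetica = new LinkedHashMap<>();
        helvetica.put("Type", "Font");
        helvetica.put("Subtype", "Type1");
        helvetica.put("BaseFont", "Helvetica");

        // The stray entry: a dictionary with no /Type key at all.
        Map<String, Object> encoding = new LinkedHashMap<>();
        encoding.put("PDFDocEncoding", "10 0 R");

        Map<String, Object> fonts = new LinkedHashMap<>();
        fonts.put("Helvetica", helvetica);
        fonts.put("Encoding", encoding);
        return fonts;
    }

    public static void main(String[] args) {
        System.out.println(fontNames(sampleResources())); // prints [Helvetica]
    }
}
```

Checking /Type before attempting to build a font is only one possible policy. A stricter caller might instead treat the stray entry as an error rather than silently ignoring it.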