Add Tika parsers for PDF and TTF
--------------------------------
Key: PDFBOX-1132
URL: https://issues.apache.org/jira/browse/PDFBOX-1132
Project: PDFBox
Issue Type: New Feature
Components: FontBox, Parsing
Reporter: Jukka Zitting
The PDF and TTF parsers in Apache Tika rely more on improvements in PDFBox than
on those in Tika, so it would make more sense for that code to reside inside
Apache PDFBox.
Having the code inside PDFBox would allow for tighter integration with PDFBox
internals and avoid the need to wait for an official PDFBox release before new
features can be used inside the PDF and TTF parsers.
To do this, I'd migrate the PDF and TTF parser classes and the related test
cases and files from Tika to the PDFBox and FontBox components. We'd add an
optional dependency on tika-core to these components, so people who don't use
or need Tika wouldn't be affected.
I'll attach a patch with the proposed changes.