[ https://issues.apache.org/jira/browse/PDFBOX-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Nichols resolved PDFBOX-802.
---------------------------------
Resolution: Fixed
Patch committed in revision 988694
> Better handle corrupt/missing %%EOF flags at the end of a file
> --------------------------------------------------------------
>
> Key: PDFBOX-802
> URL: https://issues.apache.org/jira/browse/PDFBOX-802
> Project: PDFBox
> Issue Type: Improvement
> Reporter: Adam Nichols
> Assignee: Adam Nichols
> Fix For: 1.3.0
>
>
> Currently, when the %%EOF flag at the end of the file is missing, an
> IOException is thrown which produces a stacktrace something like this:
> java.io.IOException: Error: Expected to read '%%EOF' instead started reading '%%E^@'
>     at org.apache.pdfbox.pdfparser.BaseParser.readExpectedString(BaseParser.java:1090)
>     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:463)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:859)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:826)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:751)
> While these PDFs are non-conforming, it'd be an improvement to allow them to
> be read and processed since we're only a few bytes from the end of file
> anyway.
> There's existing code which checks whether what was read was %%EOF and
> throws an exception if %%EOF wasn't read and we're not at the end of the
> file. However, that check is never reached because readExpectedString()
> throws its own exception first. To fix this, I changed readExpectedString()
> to readString() and left the manual check to see if the proper %%EOF flag
> was found. If not, it'll output a warning. If we're also not at the end of
> the file, we'll still throw an exception. I've seen corrupted and missing
> %%EOF flags at the end of a file, but never in the middle. Since a bad flag
> in the middle of a file doesn't seem to happen in practice, would mean the
> PDF is clearly out of spec, and would be much harder to deal with, throwing
> an exception in that case still seems like a reasonable thing to do.
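
The behaviour described above can be sketched roughly as follows. This is a
minimal stand-alone illustration, not the code committed in revision 988694:
the class LenientEofCheck and the method checkEofMarker() are hypothetical
names, and the sketch works on a plain java.io.InputStream rather than on
PDFBox's parser classes. It reads the marker without failing on a short or
corrupt read, warns if the marker is bad but the stream ends there, and
throws only if more data follows.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not PDFBox's actual PDFParser code.
class LenientEofCheck {
    static final String EOF_MARKER = "%%EOF";

    /**
     * Returns true if the marker is intact, false if it is corrupt or
     * missing but the stream ends there (a warning is printed). Throws
     * IOException if the marker is wrong and more data follows.
     */
    static boolean checkEofMarker(InputStream in) throws IOException {
        byte[] buf = new byte[EOF_MARKER.length()];
        int total = 0;
        // Read up to five bytes; a short read is tolerated here instead
        // of being treated as an immediate error.
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        String found = new String(buf, 0, total, StandardCharsets.ISO_8859_1);
        if (EOF_MARKER.equals(found)) {
            return true;
        }
        // Bad marker: acceptable only if nothing else follows it.
        if (in.read() == -1) {
            System.err.println("Warning: expected '" + EOF_MARKER
                    + "' but found '" + found + "' at end of file");
            return false;
        }
        throw new IOException("Expected '" + EOF_MARKER
                + "' but found '" + found + "' before end of file");
    }

    public static void main(String[] args) throws IOException {
        // Mimics the truncated marker from the stack trace above.
        byte[] truncated = "%%E\0".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(checkEofMarker(new ByteArrayInputStream(truncated)));
    }
}
```

Returning a boolean instead of throwing lets the caller decide how loudly to
report a damaged marker; the exception is kept for the case where data
follows the bad marker, which matches the reasoning in the description.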