Improved handling of erroneous data between endstream and endobj lines
----------------------------------------------------------------------
Key: PDFBOX-803
URL: https://issues.apache.org/jira/browse/PDFBOX-803
Project: PDFBox
Issue Type: Improvement
Reporter: Adam Nichols
Assignee: Adam Nichols
Fix For: 1.3.0
I found that a PDF created by Exstream Dialogue Version 5.0.039 had ">> "
between the endstream and endobj keywords. When PDFBox encountered this, it
threw an exception. This patch ignores junk characters between the two keywords
so that such files can be processed, and writes a log message warning the user
about the violation of the spec. For reference, here is the object I found in
the file (excluding the stream data):
27 0 obj
<<
/Filter [/A85 /Fl]
/Length 322
>>
stream
(data from stream omitted)
endstream
>> endobj
%PDF Font (F315)
As a side note, Exstream seems to have sold its Dialogue software to HP, and
the current version is 7. The bug is therefore likely fixed in the latest
version, but older PDFs are still in circulation, and PDFBox should be able to
handle them without throwing an exception.
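
To illustrate the lenient handling described above, here is a minimal sketch in
Java. It is not the actual patch: the class name LenientEndObjSketch and the
method junkBeforeEndObj are hypothetical, and it assumes the text following the
endstream keyword is already available as a String, whereas a real parser works
on a byte stream. It only shows the idea of skipping stray characters up to
endobj and logging a warning instead of throwing.

```java
import java.util.logging.Logger;

// Hypothetical illustration only; not part of PDFBox.
final class LenientEndObjSketch {
    private static final Logger LOG =
            Logger.getLogger(LenientEndObjSketch.class.getName());

    private static final String ENDOBJ = "endobj";

    private LenientEndObjSketch() {
    }

    /**
     * Inspects the text that immediately follows the "endstream" keyword.
     * Anything other than whitespace before "endobj" is treated as junk:
     * it is logged as a warning and returned, rather than causing a failure.
     *
     * @param afterEndStream text starting right after "endstream" (assumed input shape)
     * @return the junk found before "endobj", trimmed; empty if there was none
     * @throws IllegalArgumentException if "endobj" does not occur at all
     */
    static String junkBeforeEndObj(String afterEndStream) {
        int endObjIndex = afterEndStream.indexOf(ENDOBJ);
        if (endObjIndex < 0) {
            // A missing endobj is a different problem; this sketch does not recover from it.
            throw new IllegalArgumentException("no endobj keyword found");
        }
        String junk = afterEndStream.substring(0, endObjIndex).trim();
        if (!junk.isEmpty()) {
            LOG.warning("Ignoring unexpected data between endstream and endobj: '"
                    + junk + "'");
        }
        return junk;
    }
}
```

For the object shown earlier, calling
LenientEndObjSketch.junkBeforeEndObj("\n>> endobj\n%PDF Font (F315)") would log
one warning and return ">>", while a well-formed "\nendobj\n" returns an empty
string and logs nothing.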