Tilman Hausherr created PDFBOX-2772:
---------------------------------------
Summary: EI token lost for rewrite
Key: PDFBOX-2772
URL: https://issues.apache.org/jira/browse/PDFBOX-2772
Project: PDFBox
Issue Type: Bug
Components: Parsing, Writing
Affects Versions: 1.8.9, 1.8.10, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
Fix For: 1.8.10, 2.0.0
From Lukas S. on the dev mailing list:
{quote}
A co-worker and I are currently developing a service for searching and
replacing content in PDF documents based on pdfbox. We started our project with
version 1.8.2 of pdfbox and recently tried to migrate to 1.8.8.
After changing to version 1.8.8 we ran into trouble with PDF content
containing inline images. Studying the code differences between those
versions of pdfbox led us to the handling of the EI operator as the cause of
our trouble.
In version 1.8.2 the method parseNextToken() of the
org.apache.pdfbox.pdfparser.PDFStreamParser does an unread of the EI token on
inline images. In newer versions this unread of the EI token doesn't exist
anymore with the following comment "// the EI operator isn't unread, as it
won't be processed anyway".
As a consequence the token sets of a document containing an inline image
delivered by the PDFStreamParser can't be used to (re)render a valid pdf
document by the ContentStreamWriter.
The reason is the missing token for the EI operator. The EI token may not
trigger any further processing, but it is still needed to represent
the delimiter in the token sequence.
On the other hand, if an inline image should be part of a PDF page and is
inserted manually as a token set, the EI token must also be present in the
token set so that the ContentStreamWriter is able to create a correct PDF
document.
From our point of view there are two simple approaches to get a more
consistent internal representation of PDF documents with pdfbox concerning
inline images. Either represent the EI operator explicitly as a token
(reverting to the handling in version 1.8.2), or extend the writeObject method
in the ContentStreamWriter to append the EI operator implicitly.
{quote}
THAT is what I call an excellent bug report :-) I think that the 2nd solution
you suggested is the better one.
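
To make the 2nd solution concrete, here is a minimal Java sketch of a writer
that appends EI itself after the inline image data. It is only a model of the
idea, not the PDFBox code: the Operator class, its fields and the write()
method are hypothetical stand-ins, and the output layout is simplified. The
sketch also assumes that an explicit EI token directly following an inline
image should be skipped, so that token lists from older parsers or
hand-built lists do not end up with the operator twice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: these types are hypothetical, not PDFBox classes.
class InlineImageWriterSketch {

    /** An operator token; a "BI" token also carries image parameters and raw data. */
    static final class Operator {
        final String name;
        final Map<String, String> imageParameters = new LinkedHashMap<>();
        byte[] imageData = new byte[0];

        Operator(String name) {
            this.name = name;
        }
    }

    /** Serializes the tokens, emitting "EI" implicitly after each inline image. */
    static byte[] write(List<Object> tokens) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean afterInlineImage = false;
        for (Object token : tokens) {
            if (token instanceof Operator) {
                Operator op = (Operator) token;
                if ("BI".equals(op.name)) {
                    writeInlineImage(op, out);
                    afterInlineImage = true;
                    continue;
                }
                if ("EI".equals(op.name) && afterInlineImage) {
                    // Assumption: EI was already written implicitly, so skip it.
                    afterInlineImage = false;
                    continue;
                }
                emit(out, op.name + "\n");
            } else {
                // Operands are written via toString(), separated by a space.
                emit(out, token + " ");
            }
            afterInlineImage = false;
        }
        return out.toByteArray();
    }

    private static void writeInlineImage(Operator bi, ByteArrayOutputStream out) {
        emit(out, "BI\n");
        for (Map.Entry<String, String> e : bi.imageParameters.entrySet()) {
            emit(out, "/" + e.getKey() + " " + e.getValue() + "\n");
        }
        emit(out, "ID\n");
        out.write(bi.imageData, 0, bi.imageData.length);
        // The delimiter is appended here, whether or not the token list has it.
        emit(out, "\nEI\n");
    }

    private static void emit(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        Operator bi = new Operator("BI");
        bi.imageParameters.put("W", "1");
        bi.imageParameters.put("H", "1");
        bi.imageData = new byte[] { 'A' };

        List<Object> tokens = new ArrayList<>();
        tokens.add(new Operator("q"));
        tokens.add(bi);
        tokens.add(new Operator("Q"));

        System.out.print(new String(write(tokens), StandardCharsets.ISO_8859_1));
    }
}
```

With this approach the caller never has to remember the delimiter: a token
list without an EI token and one with an explicit EI after the image produce
the same bytes.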