[
https://issues.apache.org/jira/browse/PDFBOX-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr resolved PDFBOX-2762.
-------------------------------------
Resolution: Fixed
> remove parseCOSStream() call from PDFStreamParser
> -------------------------------------------------
>
> Key: PDFBOX-2762
> URL: https://issues.apache.org/jira/browse/PDFBOX-2762
> Project: PDFBox
> Issue Type: Task
> Components: Parsing
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
>
> This code is found in PDFStreamParser
> {code}
> if (c == '<')
> {
>     COSDictionary pod = parseCOSDictionary();
>     skipSpaces();
>     if ((char)pdfSource.peek() == 's')
>     {
>         retval = parseCOSStream( pod );
>     }
>     else
>     {
>         retval = pod;
>     }
> }
> {code}
> This is incorrect. PDFStreamParser is for content streams, and there are no
> streams in content streams: the spec requires that "All streams shall be
> indirect objects". An "indirect object" is something between obj and endobj,
> but indirect objects are not allowed in content streams: "Indirect objects
> and object references shall not be permitted at all". So parseCOSStream()
> will never be called. Thus the new code will be
> {code}
> if (c == '<')
> {
>     retval = parseCOSDictionary();
> }
> {code}
> To be sure, I tested my own test set and the digitalcorpora set (250000
> files) to see whether parseCOSStream is ever called in PDFStreamParser. No,
> it isn't. How did this incorrect code end up there? I don't know, but it has
> been there since 2002.
> http://pdfbox.cvs.sourceforge.net/viewvc/pdfbox/pdfbox/src/org/pdfbox/pdfparser/PDFStreamParser.java?revision=1.1&view=markup
> Why do I care about this? It is related to a posting in a mailing list by
> Andrea Vacondio who mentioned that there are several versions of
> parseCOSStream(), so I'm trying to clean up.
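
For illustration only, the kind of check described above could be sketched as follows. This is a minimal, hypothetical sketch and not PDFBox code: the class StreamKeywordCheck and its methods are made-up names, it works on a plain String rather than on PDFBox's parser input, and it does not skip strings, hex strings or comments. It counts how often a dictionary ("<< ... >>") is directly followed by the keyword "stream", which for a valid content stream should be never.

```java
/**
 * Hypothetical sketch, not PDFBox code: counts dictionaries that are
 * immediately followed by the keyword "stream".
 * Simplification: strings, hex strings and comments are not skipped.
 */
final class StreamKeywordCheck
{
    private StreamKeywordCheck()
    {
    }

    /** Counts top-level "<< ... >>" dictionaries directly followed by "stream". */
    static int countDictionariesFollowedByStream(String content)
    {
        int count = 0;
        int i = 0;
        while (i < content.length() - 1)
        {
            if (content.startsWith("<<", i))
            {
                int end = skipDictionary(content, i);
                int next = skipWhitespace(content, end);
                if (content.startsWith("stream", next))
                {
                    count++;
                }
                // continue after the dictionary; end is always > i
                i = end;
            }
            else
            {
                i++;
            }
        }
        return count;
    }

    /** Returns the index just past the ">>" that closes the "<<" at start. */
    static int skipDictionary(String s, int start)
    {
        int depth = 0;
        int i = start;
        while (i < s.length() - 1)
        {
            if (s.startsWith("<<", i))
            {
                depth++;
                i += 2;
            }
            else if (s.startsWith(">>", i))
            {
                depth--;
                i += 2;
                if (depth == 0)
                {
                    return i;
                }
            }
            else
            {
                i++;
            }
        }
        // unterminated dictionary: treat the rest of the input as consumed
        return s.length();
    }

    /** Returns the index of the first non-whitespace character at or after i. */
    static int skipWhitespace(String s, int i)
    {
        while (i < s.length() && Character.isWhitespace(s.charAt(i)))
        {
            i++;
        }
        return i;
    }

    public static void main(String[] args)
    {
        // A dictionary as a marked-content operand, as it may occur in a content stream
        String markedContent = "/Span << /MCID 0 >> BDC\nBT /F1 12 Tf (Hi) Tj ET\nEMC";
        // A stream object body, which belongs in an indirect object, not in a content stream
        String streamObject = "<< /Length 3 >>\nstream\nabc\nendstream";
        System.out.println(countDictionariesFollowedByStream(markedContent));
        System.out.println(countDictionariesFollowedByStream(streamObject));
    }
}
```

With these two inputs the first count is 0 (the dictionary is just an operand of BDC) and the second is 1 (the dictionary is followed by "stream", as in an indirect stream object).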