[ https://issues.apache.org/jira/browse/PDFBOX-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500167#comment-14500167 ]
ASF subversion and git services commented on PDFBOX-2762:
---------------------------------------------------------

Commit 1674353 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1674353 ]

PDFBOX-2762: remove parseCOSStream() call from PDFStreamParser, because there are no streams in content streams

> remove parseCOSStream() call from PDFStreamParser
> -------------------------------------------------
>
>                 Key: PDFBOX-2762
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2762
>             Project: PDFBox
>          Issue Type: Task
>          Components: Parsing
>    Affects Versions: 2.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.0
>
>
> This code is found in PDFStreamParser:
> {code}
> if (c == '<')
> {
>     COSDictionary pod = parseCOSDictionary();
>     skipSpaces();
>     if ((char)pdfSource.peek() == 's')
>     {
>         retval = parseCOSStream( pod );
>     }
>     else
>     {
>         retval = pod;
>     }
> }
> {code}
> This is incorrect. PDFStreamParser is for content streams. There are no
> streams in content streams; the spec requires "All streams shall be indirect
> objects". An "indirect object" is something between obj and endobj. But
> indirect objects are not allowed in content streams: "Indirect objects and
> object references shall not be permitted at all". So parseCOSStream() will
> never be called. Thus the new code will be:
> {code}
> if (c == '<')
> {
>     retval = parseCOSDictionary();
> }
> {code}
> To be sure, I tested my own test set and the digitalcorpora set (250000 files)
> to see whether parseCOSStream is ever called in PDFStreamParser. No, it isn't.
> How did this incorrect code end up there? I don't know, but it has been there
> since 2002.
> http://pdfbox.cvs.sourceforge.net/viewvc/pdfbox/pdfbox/src/org/pdfbox/pdfparser/PDFStreamParser.java?revision=1.1&view=markup
> Why do I care about this? It is related to a posting on a mailing list by
> Andrea Vacondio, who mentioned that there are several versions of
> parseCOSStream(), so I'm trying to clean up.
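
For illustration only, the simplified branch can be modelled with a small standalone sketch. This is not PDFBox code: the class ContentStreamDictSketch and its methods are hypothetical, and it assumes a toy grammar with flat, whitespace-separated dictionary entries. It shows a tokenizer that returns an inline dictionary directly, with no lookahead for a following "stream" keyword.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a content stream tokenizer; hypothetical, not PDFBox code. */
public class ContentStreamDictSketch {
    private final String src;
    private int pos = 0;

    public ContentStreamDictSketch(String src) {
        this.src = src;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    /** Reads the next whitespace-delimited word, or "" at end of input. */
    private String readWord() {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && !Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
        return src.substring(start, pos);
    }

    /** Parses "<< /Key value ... >>" (assumption: flat entries, no nesting). */
    public Map<String, String> parseDictionary() {
        if (!"<<".equals(readWord())) {
            throw new IllegalStateException("expected <<");
        }
        Map<String, String> dict = new LinkedHashMap<>();
        while (true) {
            String key = readWord();
            if (">>".equals(key)) {
                return dict;
            }
            if (!key.startsWith("/")) {
                throw new IllegalStateException("expected /Name or >>, got '" + key + "'");
            }
            dict.put(key.substring(1), readWord());
        }
    }

    /** Returns a Map for a dictionary, a String for any other word, null at end. */
    public Object parseNextToken() {
        skipSpaces();
        if (src.startsWith("<<", pos)) {
            // A dictionary in a content stream is never followed by stream data,
            // so it is returned as-is without peeking at what comes next.
            return parseDictionary();
        }
        String word = readWord();
        return word.isEmpty() ? null : word;
    }

    public static void main(String[] args) {
        // Marked-content operator with an inline property dictionary.
        ContentStreamDictSketch p = new ContentStreamDictSketch("/Span << /MCID 0 >> BDC");
        for (Object t = p.parseNextToken(); t != null; t = p.parseNextToken()) {
            System.out.println(t);
        }
    }
}
```

In this sketch the word after the closing ">>" (here the BDC operator) is simply handed back as the next token, which mirrors the point of the change: nothing after the dictionary needs special handling.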

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org