[ https://issues.apache.org/jira/browse/PDFBOX-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249741#comment-15249741 ]
Petras commented on PDFBOX-3321:
--------------------------------

Not always. It may also be called by _BaseParser.parseCOSStream(:490)_ when the value of the */Length* entry is indirect. When dealing with a visual signature, a temporary COSDocument is created in memory by PDVisibleSigProperties#buildSignature (via PDFTemplateCreator#buildPDF) and is later parsed. See, for example, the stack trace and the data that _RandomAccessFileOutputStream.write_ receives, starting from _SignatureOptions_:

{code}
written 28 bytes (total=28): 7120312030203020312030203020636D202F6E3020446F20510A0D0A
    at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:114)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:141)
    at org.apache.pdfbox.pdfparser.EndstreamOutputStream.flush(EndstreamOutputStream.java:137)
    at org.apache.pdfbox.pdfparser.BaseParser.readUntilEndStream(BaseParser.java:732)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:490)
    at org.apache.pdfbox.pdfparser.VisualSignatureParser.parseObject(VisualSignatureParser.java:234)
    at org.apache.pdfbox.pdfparser.VisualSignatureParser.parse(VisualSignatureParser.java:73)
    at org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions.setVisualSignature(SignatureOptions.java:66)
    at org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions.setVisualSignature(SignatureOptions.java:81)
    ...
{code}

There _BaseParser_ reads the appearance streams created by _PDVisibleSigBuilder#injectAppearanceStreams_:

{code}
String holderFormComment = "q 1 0 0 1 0 0 cm /" + innerFormName + " Do Q \n";
...
appendRawCommands(pdfStructure.getHolderFormStream().createOutputStream(), holderFormComment);
...
// void appendRawCommands(OutputStream os, String commands)
os.write(commands.getBytes("UTF-8"));
os.close();
{code}

> ASCII stream data size is increased when written
> ------------------------------------------------
>
>                 Key: PDFBOX-3321
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3321
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.11
>            Reporter: Petras
>            Priority: Critical
>              Labels: signature, streams
>
> This bug is quite complicated and was discovered when visual signatures were used along with parsing of the document with Preflight before signing.
> I dug a bit to investigate the nature of this bug, as it does not appear regularly. It appears to manifest itself under the following conditions:
> # The document is parsed when opened (e.g. by Preflight) and an entry with a numeric value is detected, which is marked as direct by _BaseParser.parseCOSDictionary(BaseParser.java:381)_;
> # A stream with an ASCII filter is created or is present in the document, having the same length as the number found in step 1 (e.g. when a visual signature is created by calling _SignatureOptions#setVisualSignature()_);
> # When writing, _COSWriter_ checks the stream length by its _direct_ property. If */Length* is present and is flagged as direct, it is not recalculated when written.
> As a result, when the document is written, the stream length is changed: the written stream is increased by 2 bytes, while the */Length* entry still indicates the original length. That violates the PDF requirements for the */Length* entry:
> bq. The number of bytes from the beginning of the line following the keyword *stream* to the last byte just before the keyword *endstream*. (There may be an additional EOL marker, preceding *endstream*, that is not included in the count and is not logically part of the stream data.)
> These bugs contribute to this effect:
> * PDFBOX-3320 & PDFBOX-2685, as the number used for the stream length is marked as direct;
> * _BaseParser.parseCOSStream(BaseParser.java:490)_ parses the ASCII stream using the _EndstreamOutputStream_ class, which always includes all characters up to the *endstream* keyword, even though the CRLF preceding *endstream* is not part of the stream data;
> * _COSWriter_ checks the stream length by its _direct_ property, even though it could be set as indirect via _COSObject_. As it is flagged as direct due to the mutability of the cached COSNumber, the stream length is not recalculated.
> As _COSWriter_ always adds CRLF at the end of the stream, the final stream data is increased by 2 bytes.
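
The 2-byte difference can be checked against the hex dump from the stack trace in the comment. The following is only a minimal sketch in plain Java with no PDFBox calls: the class _StreamLengthSketch_ and both helper methods are hypothetical names made up for illustration, and it assumes that the final CRLF in the dump is the EOL marker preceding *endstream* rather than stream data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of PDFBox.
class StreamLengthSketch {

    /** Decodes an even-length hex string (as printed in the trace) into bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Drops one trailing EOL marker (CRLF, LF or CR), which /Length must not count. */
    static byte[] stripTrailingEol(byte[] data) {
        int end = data.length;
        if (end >= 2 && data[end - 2] == '\r' && data[end - 1] == '\n') {
            end -= 2;
        } else if (end >= 1 && (data[end - 1] == '\n' || data[end - 1] == '\r')) {
            end -= 1;
        }
        return Arrays.copyOf(data, end);
    }

    public static void main(String[] args) {
        // Bytes handed to RandomAccessFileOutputStream.write in the trace.
        byte[] parsed = hexToBytes(
                "7120312030203020312030203020636D202F6E3020446F20510A0D0A");
        // Assumption: the last CRLF is the EOL before "endstream".
        byte[] data = stripTrailingEol(parsed);

        System.out.println("parsed length = " + parsed.length); // parsed length = 28
        System.out.println("data length   = " + data.length);   // data length   = 26
        System.out.println(new String(data, StandardCharsets.US_ASCII).trim());
    }
}
```

Under that assumption the parser keeps 28 bytes while the stream data proper is 26 bytes, which matches the 2-byte growth described in the issue once a direct */Length* is left unrecalculated.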