[jira] [Commented] (PDFBOX-5366) Unhandled IOException thrown from BaseParser creates issue in PDFStreamEngine.processStreamOperators

Alban Fonrouge (Jira) Tue, 18 Jan 2022 23:47:12 -0800


    [ https://issues.apache.org/jira/browse/PDFBOX-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478376#comment-17478376 ]


Alban Fonrouge commented on PDFBOX-5366:
----------------------------------------

Here is the stream that's causing the issue. The array preceding the TJ 
operator is malformed: it starts with two opening brackets, [[, but has only 
one closing bracket:
{code:java}
0.400000  0.400000  0.400000 RG
0.400000  0.400000  0.400000 rg
BT
0.766272 0.642516 -0.642516 0.766272 311.82 767.81  Tm
0 Tc
/F1 12.00  Tf
0 Tr
1 w
[[(Z9C:A&"B0)]TJ
1 0 0 1 0 0 Tm
ET
0 0 0 rg
0 0 0 RG{code}
Here are the warnings, including the one logged by the catch I added:
{code:java}
 Warning  [BaseParser] Skipped unexpected dir object = 'TJ' at offset 174 (start offset: 172)
 Warning  [BaseParser] Corrupt array element at offset 174, start offset: 158
 Warning  [BaseParser] Skipped unexpected dir object = 'Tm' at offset 190 (start offset: 188)
 Warning  [BaseParser] Corrupt array element at offset 190, start offset: 158
 Warning  [BaseParser] Skipped unexpected dir object = 'ET' at offset 194 (start offset: 192)
 Warning  [BaseParser] Corrupt array element at offset 194, start offset: 158
 Warning  [BaseParser] Skipped unexpected dir object = 'rg' at offset 204 (start offset: 202)
 Warning  [BaseParser] Corrupt array element at offset 204, start offset: 158
 Warning  [PDFStreamEngine] Error while parsing next token, cannot continue processing stream
 java.io.IOException: object reference 0 0 R at offset 213 in content stream
    at org.apache.pdfbox.pdfparser.BaseParser.getObjectFromPool(BaseParser.java:186)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:642)
    at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:167)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:486)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:461)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.showForm(PDFStreamEngine.java:174)
    at org.apache.pdfbox.rendering.PageDrawer.showForm(PageDrawer.java:1558)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:85)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:853)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:480)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:461)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:147)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:282)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:355)
    at org.apache.pdfbox.debugger.pagepane.PagePane$RenderWorker.doInBackground(PagePane.java:449)
    at org.apache.pdfbox.debugger.pagepane.PagePane$RenderWorker.doInBackground(PagePane.java:431)
    at javax.swing.SwingWorker$1.call(SwingWorker.java:295)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at javax.swing.SwingWorker.run(SwingWorker.java:334)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748){code}
Before the code change, the exception was only caught at 
{{PDFStreamEngine.processOperator}} (frame 9 of the stack trace), so it 
unwound through the inner {{PDFStreamEngine.processStream}} call, and the 
{{popResources}} and other state-restore operations there were skipped.

It might be better to wrap every call to 
{{PDFStreamEngine.processStreamOperators}} in a try/finally block at the 
places where resources are changed.
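
To illustrate the try/finally idea, here is a minimal, self-contained sketch. 
It is not the PDFBox code: {{SketchEngine}}, its resource stack and the 
{{corrupt}} flag are hypothetical stand-ins, and only the method names 
{{processStream}} and {{processStreamOperators}} are borrowed from the stack 
trace above. It only shows that the restore step runs even when parsing throws:

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a content stream engine; NOT the PDFBox class.
class SketchEngine {
    // Stack of active resource scopes; the page scope is always at the bottom.
    private final Deque<String> resources = new ArrayDeque<>();

    SketchEngine() {
        resources.push("page");
    }

    String currentResources() {
        return resources.peek();
    }

    int depth() {
        return resources.size();
    }

    // Simulates operator parsing: a corrupt stream raises an IOException.
    private void processStreamOperators(boolean corrupt) throws IOException {
        if (corrupt) {
            throw new IOException("object reference 0 0 R in content stream");
        }
    }

    // Pushes the stream's resources, parses, and always pops them again.
    void processStream(String streamResources, boolean corrupt) throws IOException {
        resources.push(streamResources);
        try {
            processStreamOperators(corrupt);
        } finally {
            // Runs on normal return and on exception, so the caller's
            // resources are restored before the exception propagates.
            resources.pop();
        }
    }
}
```

With this shape, a caller further up the stack that catches the 
{{IOException}} continues with the page-level resources instead of the 
resources of the form that failed.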

> Unhandled IOException thrown from BaseParser creates issue in 
> PDFStreamEngine.processStreamOperators
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5366
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5366
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.25
>            Reporter: Alban Fonrouge
>            Priority: Major
>
> We had a document where the fonts were listed in PDFDebugger under Page: 1 / 
> Resources / Fonts, but not found during rendering.
> The issue is in SetFontAndSize, which should also check the resources at the 
> page level. I have attached a patch, but cannot provide a test file as this 
> was a confidential file.
>  



