[
https://issues.apache.org/jira/browse/PDFBOX-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18078360#comment-18078360
]
Andreas Lehmkühler edited comment on PDFBOX-6194 at 5/5/26 6:24 AM:
--------------------------------------------------------------------
I dug deeper into this and the behaviour is strange. Everything works fine if
the sample PDFs are processed separately or in reverse order.
However, I've found the commit which introduced the issue in 3.0:
[r1929914|https://svn.apache.org/r1929914]
I have no idea yet what went wrong; still investigating.
Update:
Disabling compression during writing fixes the issue. That is no surprise, as
the code in question deals with compressed object streams.
Reverting the static instance of the font used to a method-local instance fixes
the issue as well. Somehow the objects of both PDFs are "interconnected", and in
some cases this may lead to the described issue.
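To illustrate how a static instance shared between two documents could
"interconnect" them, here is a toy model in plain Java. It is not PDFBox code
and not a confirmed diagnosis: the class Node, the field SHARED_FONT and the
methods save and saveTwoDocuments are all hypothetical, and the assumption that
a shared object keeps an object number from an earlier save is only a guess at
one possible mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SharedInstanceSketch {

    /** Toy stand-in for a PDF object; NOT a PDFBox class. */
    static final class Node {
        final String label;
        Integer number; // object number, null until a parser or writer assigns one

        Node(String label, Integer number) {
            this.label = label;
            this.number = number;
        }
    }

    /** Hypothetical shared instance, analogous to a static font field. */
    static final Node SHARED_FONT = new Node("Courier font", null);

    /** Toy writer: keeps existing numbers, numbers the rest, returns number -> label. */
    static Map<Integer, String> save(List<Node> objects) {
        int next = 1;
        for (Node n : objects) {
            if (n.number != null) {
                next = Math.max(next, n.number + 1);
            }
        }
        Map<Integer, String> written = new TreeMap<>();
        for (Node n : objects) {
            if (n.number == null) {
                n.number = next++;
            }
            // Assumption: a number carried over from another save replaces the entry.
            written.put(n.number, n.label);
        }
        return written;
    }

    /** Saves a first document, then a second one that already uses numbers 4 and 5. */
    static Map<Integer, String> saveTwoDocuments(boolean useSharedFont) {
        // First document: four new objects plus the font.
        List<Node> first = new ArrayList<>();
        for (String label : new String[] {"catalog", "pages", "page", "contents"}) {
            first.add(new Node(label, null));
        }
        first.add(useSharedFont ? SHARED_FONT : new Node("Courier font", null));
        save(first); // a not-yet-numbered font receives number 5 here

        // Second document: numbers 4 and 5 were assigned when the file was parsed.
        List<Node> second = new ArrayList<>();
        second.add(new Node("image XObject", 4));
        second.add(new Node("ImageMask stream", 5));
        second.add(useSharedFont ? SHARED_FONT : new Node("Courier font", null));
        return save(second);
    }

    public static void main(String[] args) {
        System.out.println("static font:       " + saveTwoDocuments(true));
        System.out.println("method-local font: " + saveTwoDocuments(false));
    }
}
```

In this toy model the shared font still carries number 5 from the first save
and takes the slot of the ImageMask stream in the second document, while a
method-local font simply receives the next free number. Whether PDFBox behaves
like this is exactly what is still under investigation.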
> COSStream becomes COSDictionary after save — shared XObject reference
> replaced by Font
> --------------------------------------------------------------------------------------
>
> Key: PDFBOX-6194
> URL: https://issues.apache.org/jira/browse/PDFBOX-6194
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 3.0.7 PDFBox
> Environment: Windows Server 2016, Java 21, PDFBox 3.0.7
> Reporter: HABA
> Priority: Major
> Attachments: 000012.pdf, 000016.pdf, 000025.pdf, bad-000016.pdf,
> bad-000025.pdf, image-2026-04-20-12-33-11-057.png,
> image-2026-04-20-13-52-20-247.png, image-2026-04-20-13-52-44-302.png,
> image-2026-05-01-19-07-19-330.png, screenshot-1.png
>
>
> Hi,
> `document.save()` corrupts an `/XObject` on page 3 of a 3-page PDF.
> Before save:
> - `Obj5` = `COSStream` (ImageMask)
> After save:
> - `Obj5` = `COSDictionary` (Courier font)
> Pages 1–2 are unaffected. All pages share the same indirect XObject refs
> (`Obj4`, `Obj5`).
> Flow:
> - load PDF
> - render pages via `PDFRenderer.renderImageWithDPI()`
> - append invisible OCR text using `PDPageContentStream` (AppendMode.APPEND,
> Courier)
> - save document → corruption occurs
> Result:
> java.io.IOException: Unexpected object type: COSDictionary
>
> Reproduced consistently on:
> * Windows Server 2016, Java 21, PDFBox 3.0.7
> Not reproducible on:
> * Windows 11, Java 21 (same code + input)
> Likely related to shared indirect XObject being overwritten during save.
> Cannot share original PDF (confidential), but can test with synthetic
> reproducer if needed.