[ https://issues.apache.org/jira/browse/PDFBOX-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851137#comment-17851137 ]
Andreas Lehmkühler edited comment on PDFBOX-5832 at 5/31/24 3:32 PM:
---------------------------------------------------------------------

Yes, right after line 187
{code}base = ((COSObject) object).getObject();{code}

The objects 2240, 2241 and 2242 are unknown to the pdf, as the last known object is 2239, so the indirect references to those objects within the compressed object stream end up in the middle of nowhere. It shouldn't make any difference whether those references are part of a compressed stream or not. Once the writer tries to write the object containing such a reference, it stumbles upon the broken reference and fails.

May I add the change?

Without having that specific pdf at hand it is just a theory, but I'm quite sure I'm right. The pdf, or rather its creator, is to blame, not pdfbox.

was (Author: lehmi):

Yes, right after line 187
{code}base = ((COSObject) object).getObject();{code}

The objects 2240, 2241 and 2242 are unknown to the pdf, as the last known object is 2239, so the indirect references to those objects within the compressed object stream end up in the middle of nowhere. It shouldn't make any difference whether those references are part of a compressed stream or not. Once the writer tries to write the object containing such a reference, it stumbles upon the broken reference and fails.

Without having that specific pdf at hand it is just a theory, but I'm quite sure I'm right.
The pdf, or rather its creator, is to blame, not pdfbox.

> Error when writing a document with OutlineItems containing null SE objects
> --------------------------------------------------------------------------
>
>                 Key: PDFBOX-5832
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5832
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.2 PDFBox, 3.0.3 PDFBox
>            Reporter: Arthur Renard
>            Priority: Major
>         Attachments: Screenshot from 2024-05-30 15-49-52.png,
> image-2024-05-30-11-58-21-024.png, image-2024-05-30-12-00-33-290.png,
> image-2024-05-30-12-01-30-708.png, image-2024-05-30-14-14-49-237.png,
> screenshot-1.png, screenshot-2.png
>
>
> Hello,
> I'm reaching out to you because we encountered some errors when loading
> documents after updating PDFBox to v3.0.2.
> I cloned the project in a local environment and tried with v3.0.3-SNAPSHOT,
> but the same error appeared.
> When trying to save my document using the PDDocument save() method, the
> following exception occurs:
>
> {code:java}
> java.io.IOException: Error: Unknown type in object stream:COSObject{2240, 0}
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:238)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObjectsToStream(COSWriterObjectStream.java:119)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBodyCompressed(COSWriter.java:499)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1307)
> {code}
> I can't share the document used for testing because it contains sensitive
> information, but after debugging a bit I found that it contains OutlineItems
> with null SE objects, and that is apparently what's causing the error:
> !image-2024-05-30-11-58-21-024.png!
> !image-2024-05-30-12-00-33-290.png!
>
> The document was produced using Adobe Acrobat Pro 2020 20.5 30636
> !image-2024-05-30-12-01-30-708.png!
> Unfortunately I don't have access to this software and I couldn't recreate a
> similar document to reproduce the issue.
>
> I found a user with a similar issue in your mailing lists:
> [https://www.mail-archive.com/users@pdfbox.apache.org/msg13258.html]
>
> Let me know if you need more details regarding this problem.
>
> Also, if you are able to create a test document that would reproduce the
> issue, would you mind sharing it? It would be of great help.
> Or if you have a way to anonymize a document without altering its structure.
>
> Many thanks in advance!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
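
The theory in the comment, indirect references pointing at object numbers the file never defines, can be illustrated with a small toy model. This is only a sketch and does not use PDFBox: the class `DanglingReferenceCheck`, the method `findDangling` and the map layout are hypothetical, and the object numbers 2238 and 2239 holding the references are made up for illustration. The issue itself only states that 2239 is the last known object and that 2240, 2241 and 2242 are referenced.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model (not PDFBox): detects references to object numbers that are never defined. */
public class DanglingReferenceCheck {

    /**
     * @param objects map from a defined object number to the object numbers it references
     * @return sorted set of referenced object numbers that have no definition in the map
     */
    public static Set<Integer> findDangling(Map<Integer, List<Integer>> objects) {
        Set<Integer> dangling = new TreeSet<>();
        for (List<Integer> references : objects.values()) {
            for (Integer target : references) {
                // A reference is dangling when its target was never defined.
                if (!objects.containsKey(target)) {
                    dangling.add(target);
                }
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        // Hypothetical layout: 2239 is the last defined object,
        // yet it refers to 2240, 2241 and 2242.
        Map<Integer, List<Integer>> objects = new TreeMap<>();
        objects.put(2238, List.of(2239));
        objects.put(2239, List.of(2240, 2241, 2242));

        System.out.println(findDangling(objects)); // prints [2240, 2241, 2242]
    }
}
```

A check of this kind, run before saving, would point at the broken references up front instead of letting the writer fail part-way through the object stream. Whether and where such a check belongs in PDFBox is left open here.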