[ https://issues.apache.org/jira/browse/PDFBOX-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851113#comment-17851113 ]
Andreas Lehmkühler commented on PDFBOX-5832:
--------------------------------------------

I agree with Tilman's theory. The indirect object isn't a known object and can't be dereferenced, so one gets null when trying to get the object from the COSObject instance.

Unlike Tilman, I like his proposal to handle the null similarly to COSNull ;-) But instead of extending the last if that checks for COSNull, I'd rather add that special case to the beginning of that method, where COSObjects are processed. Simply write COSNull if the base value of a toplevel COSObject is null due to a broken reference, and add some debug logging.

This would handle orphaned indirect objects in a lenient way without swallowing other possible issues.

WDYT?

> Error when writing a document with OutlineItems containing null SE objects
> --------------------------------------------------------------------------
>
>                 Key: PDFBOX-5832
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5832
>             Project: PDFBox
>          Issue Type: Bug
>   Affects Versions: 3.0.2 PDFBox, 3.0.3 PDFBox
>            Reporter: Arthur Renard
>            Priority: Major
>         Attachments: Screenshot from 2024-05-30 15-49-52.png,
> image-2024-05-30-11-58-21-024.png, image-2024-05-30-12-00-33-290.png,
> image-2024-05-30-12-01-30-708.png, image-2024-05-30-14-14-49-237.png,
> screenshot-1.png, screenshot-2.png
>
>
> Hello,
> I'm reaching out to you because we encountered some errors when loading
> documents after updating PDFBox to v3.0.2.
> I cloned the project in my local environment and tried with v3.0.3-SNAPSHOT,
> but the same error appeared.
> When trying to save my document using the PDDocument save() method, the
> following exception occurs:
>
> {code:java}
> java.io.IOException: Error: Unknown type in object stream:COSObject{2240, 0}
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:238)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObjectsToStream(COSWriterObjectStream.java:119)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBodyCompressed(COSWriter.java:499)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1307)
> {code}
> I can't share the document used for testing because it contains sensitive
> information, but after debugging a bit I found that it contains OutlineItems
> with null SE objects, and that is apparently what's causing the error:
> !image-2024-05-30-11-58-21-024.png!
> !image-2024-05-30-12-00-33-290.png!
>
> The document was produced using Adobe Acrobat Pro 2020 20.5 30636
> !image-2024-05-30-12-01-30-708.png!
> Unfortunately I don't have access to this software and I couldn't recreate a
> similar document to reproduce the issue.
>
> I found a user with a similar issue in your mailing lists:
> [https://www.mail-archive.com/users@pdfbox.apache.org/msg13258.html]
>
> Let me know if you need more details regarding this problem.
>
> Also, if you are able to create a test document that would reproduce the
> issue, would you mind sharing it? It would be of great help.
> Or if you have a way to anonymize a document without altering its structure.
>
> Many thanks in advance!
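
To make the proposal in the comment above more concrete, here is a rough, self-contained sketch of the idea. It deliberately does not use the PDFBox classes: `LenientNullSketch`, `Ref` and `write` are hypothetical stand-ins invented for illustration, and the real `COSWriterObjectStream.writeObject` may be structured differently. The sketch also applies the check to every reference and does not model the toplevel distinction mentioned in the comment. It only shows the ordering: check for an unresolvable reference first, write null and log at debug level, and keep the final "unknown type" error for everything else.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;

public class LenientNullSketch {
    private static final Logger LOG = Logger.getLogger(LenientNullSketch.class.getName());

    /** Hypothetical stand-in for an indirect reference such as COSObject{2240, 0}. */
    public static final class Ref {
        final long number;
        final int generation;
        final Object target; // null models a reference that cannot be dereferenced

        public Ref(long number, int generation, Object target) {
            this.number = number;
            this.generation = generation;
            this.target = target;
        }

        @Override
        public String toString() {
            return "Ref{" + number + ", " + generation + "}";
        }
    }

    /** Serializes a value into a simplified PDF-like syntax. */
    public static String write(Object value) throws IOException {
        StringBuilder out = new StringBuilder();
        writeValue(out, value);
        return out.toString();
    }

    private static void writeValue(StringBuilder out, Object value) throws IOException {
        // Special case handled first: a reference whose target is missing.
        if (value instanceof Ref) {
            Ref ref = (Ref) value;
            if (ref.target == null) {
                LOG.fine("Writing null for unresolvable reference " + ref);
                out.append("null");
                return;
            }
            value = ref.target; // resolvable reference: continue with its target
        }
        if (value instanceof Long) {
            out.append(value);
        } else if (value instanceof String) {
            out.append('/').append(value); // strings model PDF names here
        } else if (value instanceof Map) {
            out.append("<<");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                out.append('/').append(e.getKey()).append(' ');
                writeValue(out, e.getValue());
                out.append(' ');
            }
            out.append(">>");
        } else {
            // Anything else is still reported, so other problems are not swallowed.
            throw new IOException("Error: Unknown type in object stream:" + value);
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("Count", 3L);
        item.put("SE", new Ref(2240, 0, null)); // orphaned reference, as in the report
        System.out.println(write(item)); // prints <</Count 3 /SE null >>
    }
}
```

Putting the check at the top keeps the lenient behaviour limited to broken references; a value of a genuinely unsupported type still ends in the IOException branch.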