---------- Forwarded message ---------
From: Joshua <[email protected]>
Date: Tue, May 19, 2026, 10:37
Subject: High object numbers trigger OOME during save operation
To: <[email protected]>


Hi there,

We recently encountered a PDF document that contains unusually high object
numbers in its source. Here is a non-contiguous excerpt:
>
> <</Info 2 0 R /Root 1 0 R /Encrypt 1151 0 R /Prev 213301232 0 obj
> 0021353438 0000087 0 obj
> 002135350785 0 obj
> 0021353501209 0 obj
> 11521216 0 obj
> 000001241 0 obj
> 0000000000 65531225 0 obj
> 00213543971214 0 obj


The PDF has the following restrictions:

> PDF Version: 1.7 extension level 8
> R = 6
> P = -1052
> User password =
> Supplied password is user password
> extract for accessibility: allowed
> extract for any purpose: not allowed
> print low resolution: allowed
> print high resolution: allowed
> modify document assembly: not allowed
> modify forms: allowed
> modify annotations: allowed
> modify other: not allowed
> modify anything: not allowed
> stream encryption method: AESv3
> string encryption method: AESv3
> file encryption method: AESv3
> File is not linearized
> No syntax or stream encoding errors found; the file may still contain
> errors that qpdf cannot detect


The PDF contains:

   - Several hundred pages
   - 1282 objects
   - Size: ~25MB


We are using PDFBox (currently version 3.0.3) to remove restrictions and
save the file as unrestricted:
document.setAllSecurityToBeRemoved(true);
document.save(unrestrictedFile, CompressParameters.NO_COMPRESSION);

For this type of document, saving consistently triggers an OutOfMemoryError
in the JVM, even with more than 100 GB of RAM. Here is the stack trace:

> java.lang.OutOfMemoryError: Java heap space
> at java.base/java.util.Arrays.copyOf(Arrays.java:3481)
> at java.base/java.util.ArrayList.grow(ArrayList.java:237)
> at java.base/java.util.ArrayList.grow(ArrayList.java:244)
> at java.base/java.util.ArrayList.add(ArrayList.java:454)
> at java.base/java.util.ArrayList.add(ArrayList.java:467)
> at
> org.apache.pdfbox.pdfwriter.COSWriter.fillGapsWithFreeEntries(COSWriter.java:820)
> at
> org.apache.pdfbox.pdfwriter.COSWriter.doWriteXRefTable(COSWriter.java:761)
> at
> org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1326)
> at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:429)
> at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1586)
> at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1462)
> at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1040)
> at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:990)


Due to the malformed object numbering in the PDF, the freeNumbers ArrayList
in COSWriter grows excessively, as it attempts to store every integer up to
the highest object number. This eventually causes memory allocation to
exceed available heap space.
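
To illustrate the growth described above, here is a minimal sketch. It is not
PDFBox's actual code: the class FreeEntrySketch, the method countFreeEntries
and the sample object numbers are hypothetical. It only assumes that one list
element is created for every unused object number below the highest one.

```java
import java.util.List;
import java.util.TreeSet;

public class FreeEntrySketch {

    // Returns how many numbers in 1..highest are NOT used by any object.
    // Under the assumption above, this is the number of list elements
    // that a gap-filling step would have to allocate.
    static long countFreeEntries(List<Long> usedObjectNumbers) {
        TreeSet<Long> used = new TreeSet<>();
        for (long n : usedObjectNumbers) {
            if (n >= 1) {
                used.add(n); // ignore object 0 and invalid negative numbers
            }
        }
        if (used.isEmpty()) {
            return 0L;
        }
        return used.last() - used.size();
    }

    public static void main(String[] args) {
        // Hypothetical inputs, not taken from the affected document.
        List<Long> contiguous = List.of(1L, 2L, 3L, 4L);
        List<Long> sparse = List.of(1L, 2L, 1151L, 2_000_000_000L);

        System.out.println(countFreeEntries(contiguous)); // 0
        System.out.println(countFreeEntries(sparse));     // 1999999996
    }
}
```

A single stray object number near two billion is therefore enough to request
roughly two billion list elements, regardless of how few objects the document
really contains.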

We understand that the PDF itself is malformed. However, we would like to
ask whether it would be possible to add a pre-check in PDFBox to prevent
implausible object-number ranges from causing uncontrolled OOM errors. From
our perspective, this behavior represents a potential attack surface:
specially crafted documents could be used to trigger a denial-of-service
condition and potentially disrupt an entire system.
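
The kind of pre-check we have in mind could look roughly like the sketch below.
Everything in it is an assumption on our side: the class ObjectNumberPreCheck,
the method isPlausible and the maxFreeEntries threshold are hypothetical and do
not exist in PDFBox. The idea is simply to compare the highest object number
with the number of objects actually present before any gaps are filled.

```java
public class ObjectNumberPreCheck {

    // Returns true if the gap between the highest object number and the
    // number of objects actually present stays within maxFreeEntries.
    // maxFreeEntries is an arbitrary, caller-chosen limit (an assumption).
    static boolean isPlausible(long highestObjectNumber, long objectCount,
                               long maxFreeEntries) {
        if (highestObjectNumber < 0 || objectCount < 0 || maxFreeEntries < 0) {
            return false; // negative values are never valid here
        }
        long gap = highestObjectNumber - objectCount;
        return gap <= maxFreeEntries;
    }

    public static void main(String[] args) {
        long objectCount = 1282L;          // object count of our document
        long maxFreeEntries = 1_000_000L;  // hypothetical threshold

        // Contiguous numbering: accepted.
        System.out.println(isPlausible(1282L, objectCount, maxFreeEntries));
        // Hypothetical outlier object number: rejected before allocating.
        System.out.println(isPlausible(2_000_000_000L, objectCount, maxFreeEntries));
    }
}
```

A writer could fail fast with a descriptive exception when such a check
returns false, or renumber the objects instead of filling the gaps. Which
reaction is appropriate is of course your call.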

Thank you for your work on PDFBox and for considering this request.

Best regards,
Joshua
