Kim Hage created PDFBOX-5781:
--------------------------------

             Summary: OutOfMemoryError when building FontsCache
                 Key: PDFBOX-5781
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5781
             Project: PDFBox
          Issue Type: Bug
          Components: FontBox
    Affects Versions: 2.0.30
         Environment: Running inside a JVM with -Xmx512

openjdk version "17.0.10" 2024-01-16 LTS
OpenJDK Runtime Environment (build 17.0.10+13-LTS)
OpenJDK 64-Bit Server VM (build 17.0.10+13-LTS, mixed mode, sharing)

macOS Sonoma 14.3.1 (23D60)
            Reporter: Kim Hage
         Attachments: FileSystemFontProvider_OutOfMemoryError.stacktrace, FileSystemFontProvider_use_DigestInputStream.patch

We experienced an OutOfMemoryError when calling PDAcroForm.getDefaultResources().getFont(COSName) with COSName{Helv}.

The cause appears to be that PDFBox initializes a FontCache when getFont is called, and this scans _all_ fonts on the system. That includes some large system fonts (AppleColorEmoji is 189.9 MB). Each font is copied into a single large byte array at the location below, and this is where the OutOfMemoryError is thrown.

{{org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773}}
{code:java}
InputStream is = ttf.getOriginalData();
byte[] ba = IOUtils.toByteArray(is);
is.close();
String hash = computeHash(ba);
{code}

I think this could be fixed easily by using a DigestInputStream instead of a byte array to compute the hash at this location. I have tested this locally and it seemed to work. Please see the attached .patch file.
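
For illustration, the DigestInputStream idea from the report could look roughly like the sketch below. This is not the attached patch. The class name StreamingHash, the choice of SHA-256, the hex encoding, and the 8 KiB buffer size are all assumptions made for the example, since the report does not say which hash algorithm computeHash uses. The point is only that the stream is consumed in small chunks, so memory use stays constant regardless of font size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of PDFBox.
final class StreamingHash {

    // Hashes the stream in fixed-size chunks instead of loading it fully into memory.
    static String computeHash(InputStream in) throws IOException, NoSuchAlgorithmException {
        // Assumption: SHA-256 is used here only as an example algorithm.
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            // Each read feeds the bytes it returns into the digest.
            while (dis.read(buffer) != -1) {
                // nothing else to do; the data itself is discarded
            }
        }
        // Encode the digest as lowercase hex.
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(computeHash(new ByteArrayInputStream(sample)));
    }
}
```

In the reported code path, the stream returned by ttf.getOriginalData() would presumably be passed to such a method in place of the IOUtils.toByteArray call.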