Kim Hage created PDFBOX-5781:
--------------------------------

             Summary: OutOfMemoryError when building FontsCache
                 Key: PDFBOX-5781
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5781
             Project: PDFBox
          Issue Type: Bug
          Components: FontBox
    Affects Versions: 2.0.30
         Environment: Running inside a JVM with -Xmx512m
openjdk version "17.0.10" 2024-01-16 LTS
OpenJDK Runtime Environment (build 17.0.10+13-LTS)
OpenJDK 64-Bit Server VM (build 17.0.10+13-LTS, mixed mode, sharing)
macOS Sonoma 14.3.1 (23D60)
            Reporter: Kim Hage
         Attachments: FileSystemFontProvider_OutOfMemoryError.stacktrace, 
FileSystemFontProvider_use_DigestInputStream.patch

We hit an OutOfMemoryError when calling
PDAcroForm.getDefaultResources().getFont(COSName) with COSName{Helv}.

The cause appears to be that PDFBox initializes a font cache when getFont is 
called, and building it scans _all_ fonts on the system. That includes some 
large system fonts (AppleColorEmoji is 189.9 MB). Each font is copied in full 
into a single byte array at the location below, and that allocation triggers 
the OutOfMemoryError.

{{org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773}}
{code:java}
InputStream is = ttf.getOriginalData();
byte[] ba = IOUtils.toByteArray(is);
is.close();
String hash = computeHash(ba);
{code}
I think this could be fixed easily by using a DigestInputStream instead of a 
byte array to compute the hash at this location. I have tested this locally 
and it appears to work. Please see the attached .patch file.
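
For readers without access to the attachment, the general idea can be sketched 
as follows. This is not the attached patch and not PDFBox code: the class name 
StreamingHash, the method hashStream, the 8 KB buffer and the choice of SHA-256 
are all assumptions made for illustration, since the report does not show which 
algorithm computeHash uses. The point is only that the stream is consumed in 
small chunks, so memory use stays constant regardless of the font size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only; not part of PDFBox.
final class StreamingHash {

    // Hashes the stream in fixed-size chunks instead of loading it into one array.
    // SHA-256 is an assumed algorithm; the real computeHash may use something else.
    static String hashStream(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            while (dis.read(buffer) != -1) {
                // Nothing to do: each read updates the digest as a side effect.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hashStream(new ByteArrayInputStream(sample)));
    }
}
```

In the snippet quoted above, a helper of this shape would take the stream 
returned by ttf.getOriginalData() directly, so the intermediate byte[] would 
no longer be needed.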



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
