Kim Hage created PDFBOX-5781:
--------------------------------
Summary: OutOfMemoryError when building FontsCache
Key: PDFBOX-5781
URL: https://issues.apache.org/jira/browse/PDFBOX-5781
Project: PDFBox
Issue Type: Bug
Components: FontBox
Affects Versions: 2.0.30
Environment: Running inside a JVM with -Xmx512
openjdk version "17.0.10" 2024-01-16 LTS
OpenJDK Runtime Environment (build 17.0.10+13-LTS)
OpenJDK 64-Bit Server VM (build 17.0.10+13-LTS, mixed mode, sharing)
macOS Sonoma 14.3.1 (23D60)
Reporter: Kim Hage
Attachments: FileSystemFontProvider_OutOfMemoryError.stacktrace,
FileSystemFontProvider_use_DigestInputStream.patch
We experienced an OutOfMemoryError when calling
PDAcroForm.getDefaultResources().getFont(COSName) with COSName{Helv}.
The reason seems to be that PDFBox initializes a FontCache when getFont is
called, and this scans _all_ fonts on the system, including some large
system fonts (AppleColorEmoji is 189.9 MB). Each font is copied into a single
large byte array at the location below, which causes the OutOfMemoryError at
this point in the code.
{{org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773}}
{code:java}
InputStream is = ttf.getOriginalData();
byte[] ba = IOUtils.toByteArray(is);
is.close();
String hash = computeHash(ba); {code}
I think this could be easily fixed by using a DigestInputStream instead of a
byte array to compute hashes at this location. I have tested this locally and
it seemed to work. Please see the attached .patch file.
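
For illustration, a minimal standalone sketch of the streaming idea follows. It is
not the attached patch and not PDFBox code: the class name StreamingHash, the
8 KB buffer size, and the SHA-256 algorithm are assumptions chosen for the
example, and the real computeHash may use a different digest. The point is only
that the stream is read in fixed-size chunks, so memory use stays constant
regardless of font size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of PDFBox.
class StreamingHash {

    // Hashes the stream chunk by chunk instead of copying it into one byte[].
    static String computeHash(InputStream is) throws IOException {
        MessageDigest md;
        try {
            // Assumption: SHA-256 stands in for whatever digest PDFBox uses.
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IOException(e);
        }

        byte[] buffer = new byte[8192]; // small, reusable read buffer
        try (DigestInputStream dis = new DigestInputStream(is, md)) {
            // Reading through the DigestInputStream updates the digest as a
            // side effect; the bytes themselves are discarded.
            while (dis.read(buffer) != -1) {
                // nothing to do
            }
        }

        // Convert the digest to a lowercase hex string.
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

With a helper like this, the call site could pass ttf.getOriginalData() straight
to computeHash rather than going through IOUtils.toByteArray first.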