[ 
https://issues.apache.org/jira/browse/PDFBOX-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824701#comment-17824701
 ] 

ASF subversion and git services commented on PDFBOX-5781:
---------------------------------------------------------

Commit 1916180 from Tilman Hausherr in branch 'pdfbox/branches/3.0'
[ https://svn.apache.org/r1916180 ]

PDFBOX-5781: avoid OOM that can happen with huge fonts, by Kim Hage

> OutOfMemoryError in FileSystemFontsProvider.scanFonts
> -----------------------------------------------------
>
>                 Key: PDFBOX-5781
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5781
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.30, 3.0.1 PDFBox
>         Environment: Running inside a JVM with -Xmx512
> openjdk version "17.0.10" 2024-01-16 LTS
> OpenJDK Runtime Environment (build 17.0.10+13-LTS)
> OpenJDK 64-Bit Server VM (build 17.0.10+13-LTS, mixed mode, sharing)
> macOS Sonoma 14.3.1 (23D60)
>            Reporter: Kim Hage
>            Priority: Minor
>             Fix For: 2.0.31, 3.0.2 PDFBox, 4.0.0
>
>         Attachments: 2.0.31.patch, 3.0.patch, 
> FileSystemFontProvider_OutOfMemoryError.stacktrace, 
> FileSystemFontProvider_use_DigestInputStream.patch
>
>
> We experienced an OutOfMemoryError when calling
> PDAcroForm.getDefaultResources().getFont(COSName) with COSName{Helv}.
> The cause appears to be that PDFBox initializes a FontCache when getFont is 
> called, and this scans _all_ fonts on the system, including some large 
> system fonts (AppleColorEmoji is 189.9 MB). Each font is copied into a 
> single large byte array at the location below, which triggers the 
> OutOfMemoryError at this point in the code:
> {{org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773}}
> {code:java}
> InputStream is = ttf.getOriginalData();
> byte[] ba = IOUtils.toByteArray(is);
> is.close();
> String hash = computeHash(ba);
> {code}
> I think this could be fixed easily by using a DigestInputStream instead of a 
> byte array to compute the hash at this location. I have tested this locally, 
> and it seemed to work. Please see the attached .patch file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
