[ https://issues.apache.org/jira/browse/FOP-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729746#comment-17729746 ]
Peter Radomski edited comment on FOP-2937 at 6/6/23 1:59 PM:
-------------------------------------------------------------

we have the same issues with FOP 2.8 and large files.


was (Author: pan4o):
we have the same issues with EOP 2.8 and large files.

> [PATCH]Post PDF generation, Soft reference of PDFObject in PDFReference are
> not immediately garbage collected leading to excessive memory usage.
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FOP-2937
>                 URL: https://issues.apache.org/jira/browse/FOP-2937
>             Project: FOP
>          Issue Type: Improvement
>    Affects Versions: 2.3, 2.4
>            Reporter: Piyush Khandelwal
>            Priority: Major
>         Attachments: PDFDictionary.patch, pdfreference.patch
>
>
> PDFReference object holds a SoftReference of PDFObject (PDFPage, PDFLabel,
> PDFName etc.).
> If we generate a huge PDF (*I tried with a PDF having around 150 thousand
> pages with 12 GB of RAM*), lots of these references linger around waiting for
> the garbage collector to collect them.
> But GC won't collect them as long as the JVM is able to recover enough memory
> without throwing an out-of-memory error.
> Here are a few statistics from my testing for further understanding of the
> issue -
> Stats for generating 1 PDF -
> *FO size:* 2.03GB
> *Generated PDF No. of Pages:* Around 150 K
> RAM: 12 GB
> Peak memory that reached while generation - 11.3GB
> Residual memory after forced GC: 9 GB
> The FO mainly contains tabular data with each page sequence having max of
> 500 rows.
> On analyzing the memory dump, I found lots of references to PDFPage, PDFName
> etc.
> *Question -* Is there any specific reason for using SoftReference in
> PDFReference class instead of WeakReference?
> Testing by changing SoftReference to WeakReference in PDFReference shows the
> following improvements without any issue in the generation whatsoever -
> Stats for generating 5 PDFs in parallel -
> *FO size:* 2.03GB
> *Generated PDF No. of Pages:* Around 150 K
> RAM: 12 GB
> Peak memory that reached while generation - 4GB
> Residual memory after forced GC: 300 MB
> So, by changing SoftReference to WeakReference, I was able to generate 5 PDFs
> having 150K pages in parallel with max 4GB RAM, without any generation
> issues.
> You can clearly see the performance benefits of changing to WeakReference.
> But as I don't understand the complete internal details of how FOP works, I
> would like to understand if we can target this change and if not what is the
> reason behind using SoftReference?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
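
For readers following the SoftReference versus WeakReference question in the quoted issue, here is a minimal standalone sketch of how the two reference types can be compared after an explicit GC request. It is not FOP code: `ReferenceDemo` and `clearedAfterGc` are hypothetical names used only for illustration, and it assumes a JVM that honors `System.gc()`, which is only a request and may be ignored.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of Apache FOP.
final class ReferenceDemo {

    // Requests a GC a few times and reports whether the referent was cleared.
    // System.gc() is only a hint, so the result depends on the JVM in use.
    static boolean clearedAfterGc(Reference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(20); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // Neither byte array has a strong reference once construction returns.
        Reference<byte[]> weak = new WeakReference<>(new byte[1024 * 1024]);
        Reference<byte[]> soft = new SoftReference<>(new byte[1024 * 1024]);

        System.out.println("weak cleared after GC: " + clearedAfterGc(weak));
        System.out.println("soft cleared after GC: " + clearedAfterGc(soft));
    }
}
```

On a typical JVM with free heap available, the weak referent is usually cleared by the first collection, while the soft referent tends to survive until memory becomes scarce. That matches the lingering PDFPage and PDFName objects described in the issue, but the exact behavior is collector-specific and not guaranteed.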