Ben Manes created PDFBOX-4668:
---------------------------------

             Summary: Add ResourceCacheFactory as global setting
                 Key: PDFBOX-4668
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4668
             Project: PDFBox
          Issue Type: Task
          Components: Rendering
            Reporter: Ben Manes
         Attachments: memory.png, threads.png

Image rendering is cached per-document by {{DefaultResourceCache}} using soft
references. As described in the [FAQ|https://pdfbox.apache.org/2.0/faq.html],
this can lead to an {{OutOfMemoryError}}, e.g. when processing many documents
in parallel. The cache is configured per-document, and each document is
initialized with the default implementation:

{code}
// document-wide cached resources
private ResourceCache resourceCache = new DefaultResourceCache();
{code}

Disabling the cache therefore requires modifying every call site, some of which
may be in 3rd party code. The ask is to add a static factory that configures
the default globally and returns a new {{DefaultResourceCache}} when called.
A user could then install their own static factory, e.g. one that returns a
custom cache, or {{null}} to disable caching.
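
A minimal sketch of what such a global setting could look like. The class name {{ResourceCacheFactory}} comes from the issue summary, but its methods ({{setFactory}}, {{create}}) are assumptions made for illustration and are not an existing PDFBox API. The {{ResourceCache}} and {{DefaultResourceCache}} types below are empty stand-ins so the sketch compiles on its own.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Stand-ins for the PDFBox types, so this sketch compiles without PDFBox.
interface ResourceCache { }

class DefaultResourceCache implements ResourceCache { }

// Hypothetical global setting; the method names are assumptions.
final class ResourceCacheFactory {
    // volatile so that a factory installed at startup is visible to all threads
    private static volatile Supplier<ResourceCache> factory = DefaultResourceCache::new;

    private ResourceCacheFactory() { }

    // Installs the factory used for every document created afterwards.
    static void setFactory(Supplier<ResourceCache> newFactory) {
        factory = Objects.requireNonNull(newFactory, "newFactory");
    }

    // Returns a new cache per call, or null if the installed factory disables caching.
    static ResourceCache create() {
        return factory.get();
    }
}
```

Under this assumption, the field initializer shown above would call {{ResourceCacheFactory.create()}} instead of {{new DefaultResourceCache()}}, and an application could disable caching once at startup with {{ResourceCacheFactory.setFactory(() -> null)}} without touching 3rd party call sites.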

Soft references are a problematic caching scheme that degrades poorly. It is
very likely that the many large image fragments get promoted by the GC
(eden=>young=>old), so that a full GC is required to collect them. Under
memory/cpu pressure, the GC can devolve into a death spiral of collecting only
the minimal heap space needed to meet its pause time constraints, leading to
repeated GCs due to soft reference pollution and an eventual OOME. If caching
is enabled, it might be preferable for it to be size-based (by rough byte-size)
and perhaps tied into the main memory configuration of {{MemoryUsageSetting}}.
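
To illustrate the size-based alternative, here is a rough sketch of a cache bounded by approximate byte size with least-recently-used eviction. The class {{ByteBoundedCache}} and its methods are hypothetical and not part of PDFBox. Values are plain byte arrays here, which is a simplification: real cached images would need their own size estimate. The byte budget is passed in directly; deriving it from {{MemoryUsageSetting}} is left as an assumption.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical byte-bounded LRU cache; not part of PDFBox.
final class ByteBoundedCache<K> {
    private final long maxBytes;
    private long currentBytes;
    // accessOrder=true: iteration runs from least to most recently used
    private final LinkedHashMap<K, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedCache(long maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must not be negative");
        }
        this.maxBytes = maxBytes;
    }

    // A successful lookup marks the entry as most recently used.
    synchronized byte[] get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, byte[] value) {
        byte[] previous = entries.remove(key);
        if (previous != null) {
            currentBytes -= previous.length;
        }
        if (value.length > maxBytes) {
            return; // larger than the whole budget, so never cached
        }
        entries.put(key, value);
        currentBytes += value.length;
        // Evict least recently used entries until the budget is met again.
        Iterator<Map.Entry<K, byte[]>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            currentBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    synchronized long sizeInBytes() {
        return currentBytes;
    }
}
```

Unlike soft references, a bound of this kind holds its entries strongly but keeps retained memory under an explicit limit, so eviction is decided by the cache and does not depend on GC pressure.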



