[ 
https://issues.apache.org/jira/browse/PDFBOX-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854124#comment-17854124
 ] 

Andreas Lehmkühler commented on PDFBOX-5824:
--------------------------------------------

I ran some more tests and couldn't find any substantial difference in memory 
consumption. I'm going to remove the SmallMap optimisation from the trunk and 
the 3.0 branch.

> Allow COSDictionary.MAP_THRESHOLD to be defined as System property
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-5824
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5824
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel
>    Affects Versions: 3.0.3 PDFBox, 4.0.0
>            Reporter: Jonathan Prates
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>             Fix For: 3.0.3 PDFBox, 4.0.0
>
>         Attachments: Screenshot 2024-05-21 at 11.00.25.jpg
>
>
> [COSDictionary.MAP_THRESHOLD|https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDictionary.java#L54]
>  controls which Map class is used to optimize memory usage. By default, a 
> SmallMap is used. However, if the number of items in a COSDictionary reaches 
> the MAP_THRESHOLD value (hardcoded to 1,000), the references [are copied 
> |https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDictionary.java#L208]to
>  a LinkedHashMap.
> For larger documents, where the COSDictionary is expected to be substantially 
> bigger than this limit, this copying occurs frequently. Additionally, 
> [SmallMap.keySet is not 
> efficient|https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/util/SmallMap.java#L281].
>  The attached screenshot shows pdfbox performance with SmallMap (in red) 
> versus using LinkedHashMap, ignoring the threshold (in green).
> *Would it be beneficial to allow MAP_THRESHOLD to be defined as a System 
> property?*
> If set to 0, LinkedHashMap would be used. If not set, it would default to the 
> current MAP_THRESHOLD value and SmallMap, not changing the current behaviour.
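
For reference, the proposal quoted above could look roughly like the sketch 
below. It is only an illustration, not the actual COSDictionary code: the 
class name, the property name "pdfbox.cos.mapThreshold" and both helper 
methods are assumptions made for this example. The default of 1000 is the 
hardcoded value mentioned in the description.

```java
// Hypothetical sketch; none of these names exist in PDFBox.
final class MapThresholdSketch {
    // Assumed property name, chosen for illustration only.
    static final String THRESHOLD_PROPERTY = "pdfbox.cos.mapThreshold";
    // The hardcoded threshold described in the issue.
    static final int DEFAULT_THRESHOLD = 1000;

    private MapThresholdSketch() {
    }

    // Reads the threshold from the system property. Falls back to the
    // default when the property is unset, not a valid integer, or negative.
    static int resolveThreshold() {
        int value = Integer.getInteger(THRESHOLD_PROPERTY, DEFAULT_THRESHOLD);
        return value < 0 ? DEFAULT_THRESHOLD : value;
    }

    // True when a dictionary holding `size` entries should be backed by a
    // LinkedHashMap instead of the compact map. With a threshold of 0 this
    // is true for every size, so the compact map is never used.
    static boolean useLinkedHashMap(int size, int threshold) {
        return size >= threshold;
    }
}
```

Starting the JVM with -Dpdfbox.cos.mapThreshold=0 would then select 
LinkedHashMap from the first entry on, while leaving the property unset keeps 
the current behaviour.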



