[ 
https://issues.apache.org/jira/browse/PDFBOX-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank Nestel updated PDFBOX-441:
--------------------------------

    Attachment: COSName.java

Remarks:
- The cleanResources approach is a hack in a larger environment, since it is 
not clear who should call it, or when. 
- We used a ConcurrentHashMap here at one point. This gave a major speed 
improvement at the time (with an older PDFBox, anyway). However, we realized 
we could not tolerate the leak.
- What would really be great is something like 
http://www.stacksmash.com/jsr166y/ This would allow a ConcurrentHashMap using 
weak references. One could simply put all the statics in; since they are 
strongly referenced, they would never get cleared.
- In the meantime, attached is the version we are currently relying upon, which 
is weak references done right (the PDFBox 1.1 version is still leaky, since 
each COSName keeps a strong reference to its key) and with (semi-)fast 
read/write locking.
- Note that we removed the hashCode field member. It is a deoptimization, since 
common Java implementations have a hashCode field in their String class anyway 
(this wasn't true in earlier times, so for old environments this field might 
still be an optimization).
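
To illustrate the weak-reference and read/write-locking remarks above, here is 
a minimal sketch. It is not the attached COSName.java: WeakNameCache, Name and 
Entry are hypothetical names, and it assumes a Java 8 or later runtime. The map 
holds each value only through a WeakReference, so a value that keeps a strong 
reference to its own key no longer pins the entry. Stale entries are removed 
under the write lock, while lookups that hit take only the read lock.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical interning cache; a sketch, not PDFBox code.
final class WeakNameCache {

    // Stand-in for COSName: it keeps a strong reference to its own key text.
    static final class Name {
        private final String text;
        Name(String text) { this.text = text; }
        String getText() { return text; }
    }

    // Weak reference that remembers its key so the stale entry can be removed.
    private static final class Entry extends WeakReference<Name> {
        final String key;
        Entry(String key, Name value, ReferenceQueue<Name> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<String, Entry> map = new HashMap<>();
    private final ReferenceQueue<Name> queue = new ReferenceQueue<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    Name get(String text) {
        // Fast path: concurrent readers share the read lock. A plain
        // HashMap.get does not modify the map, so this is safe.
        lock.readLock().lock();
        try {
            Entry e = map.get(text);
            Name n = (e == null) ? null : e.get();
            if (n != null) {
                return n;
            }
        } finally {
            lock.readLock().unlock();
        }
        // Slow path: re-check under the write lock, then create the value.
        lock.writeLock().lock();
        try {
            expungeStale();
            Entry e = map.get(text);
            Name n = (e == null) ? null : e.get();
            if (n == null) {
                n = new Name(text);
                map.put(text, new Entry(text, n, queue));
            }
            return n;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Caller must hold the write lock.
    private void expungeStale() {
        Entry stale;
        while ((stale = (Entry) queue.poll()) != null) {
            // Remove only if the key still maps to this cleared reference.
            map.remove(stale.key, stale);
        }
    }
}
```

Statically held names stay in such a cache for as long as their static fields 
reference them. Names created while parsing a document become collectable once 
the document is released. A plain HashMap is used rather than WeakHashMap 
because WeakHashMap may expunge entries during get, which would make lookups 
under a shared read lock unsafe.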

> remove CosName nameMap cache
> ----------------------------
>
>                 Key: PDFBOX-441
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-441
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 0.7.3
>            Reporter: Sean Bridges
>            Priority: Minor
>             Fix For: 1.2.0
>
>         Attachments: COSName.java
>
>
> The CosName class keeps a cache of all instances created in a static 
> synchronized map.  I am guessing this is for performance reasons to avoid 
> creating objects, but in our system it is causing performance problems.  We 
> are running 7 threads extracting text from PDFs, and we can see a large 
> number of conflicts reading from nameMap.
> The CosName map is also a potential memory leak, which forces users to 
> periodically clear it, as noted in PDFBOX-351
> Can nameMap be removed altogether?
> At the least, if PDSimpleFont replaced, 
>  COSName.getPDFName( "FontDescriptor" ) 
> with 
> COSName.FONT_DESC
> It would reduce contention.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
