[ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-1622:
---------------------------------------

Merged into 1.8-branch in revision 1542735

> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
>                 Key: PDFBOX-1622
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1622
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.0.0
>            Reporter: Florent Guillaume
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two different threads at the same time, then the HashMap may be written by two concurrent threads which may and will cause infinite loops.
> We see the CPU at 100% and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
>    java.lang.Thread.State: RUNNABLE
>       at java.util.HashMap.put(HashMap.java:391)
>       at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
>       at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
>       at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached, it just moves the initialization to a static block.
> Please apply to the 1.8.3 and 2.0.0 branches.
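
For readers following the issue, here is a minimal Java sketch of the static-block approach the report describes. It is not the attached patch and not the PDFBox source: the class names NormalizerSketch and StaticInitSketch, the field DIAC_TABLE, the helper methods, and the four table entries are all assumptions made for illustration. The idea is that the JVM runs a class's static initializer exactly once, under its own initialization lock, so the map is fully built before any constructor can run and no constructor ever writes to it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for TextNormalize (an assumption, not PDFBox code).
class NormalizerSketch {
    // Populated once during class initialization, never written afterwards.
    private static final Map<Integer, String> DIAC_TABLE;

    static {
        Map<Integer, String> table = new HashMap<>();
        // Illustrative entries only: spacing diacritic -> combining form.
        table.put(0x0060, "\u0300"); // grave accent
        table.put(0x00B4, "\u0301"); // acute accent
        table.put(0x005E, "\u0302"); // circumflex accent
        table.put(0x007E, "\u0303"); // tilde
        DIAC_TABLE = Collections.unmodifiableMap(table);
    }

    NormalizerSketch() {
        // Intentionally empty: the constructor no longer touches shared state.
    }

    String combiningFormOf(int codePoint) {
        return DIAC_TABLE.get(codePoint); // read-only lookup
    }

    static int tableSize() {
        return DIAC_TABLE.size();
    }
}

class StaticInitSketch {
    // Constructs one instance per worker thread and counts correct lookups.
    static int constructConcurrently(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger successes = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                NormalizerSketch n = new NormalizerSketch();
                if ("\u0300".equals(n.combiningFormOf(0x0060))) {
                    successes.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return successes.get();
    }

    public static void main(String[] args) {
        int ok = constructConcurrently(8);
        System.out.println(ok + " of 8 threads saw a fully built table");
    }
}
```

Wrapping the map with Collections.unmodifiableMap is a design choice of this sketch, not something the report mentions. It makes any later accidental write fail fast instead of silently corrupting the table.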