[ 
https://issues.apache.org/jira/browse/PDFBOX-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922975#action_12922975
 ] 

Jeremias Maerki commented on PDFBOX-874:
----------------------------------------

Hmm, I'm still seeing the problem in trunk. When I increase the heap to 
-Xmx96M, the OutOfMemoryError goes away. I think the cause could be allah2.pdf, 
which contains 10 or 11 subset fonts. The heap dump seems to indicate that each 
of those fonts takes up about 3MB of memory. All in all, that could be filling 
up the 64MB.

TotalSize (TotalSize/HeapSize%) [ObjectSize] NumberOfChildObject(6'105) ObjectName Address
 30'210'128 (83%) [16] 1 java/util/AbstractList$Itr 0x2e74848
  30'210'112 (83%) [12] 1 java/util/ArrayList 0x2f830b8
   30'210'100 (83%) [152] 32 [Ljava/lang/Object; 0x6884698
    3'019'039 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x2e745f8
     3'018'913 (8%) [47] 6 org/apache/pdfbox/pdmodel/font/PDType0Font 0x370e5d0
     40 (0%) [4] 1 org/apache/pdfbox/util/Matrix 0x2e745b8
     18 (0%) [16] 1 java/lang/String 0x370d400
     4 (0%) [4] 0 float[] 0x2e74640
    3'018'791 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x5ccf430
     3'018'665 (8%) [47] 6 org/apache/pdfbox/pdmodel/font/PDType0Font 0x370e598
     40 (0%) [4] 1 org/apache/pdfbox/util/Matrix 0x5d1a9f8
     18 (0%) [16] 1 java/lang/String 0x370d388
     4 (0%) [4] 0 float[] 0x5d1aa08
    3'018'541 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x6884970
    3'018'527 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x58405c8
    3'018'519 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x52b8190
    3'018'511 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x642e250
    3'018'497 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x4d35188
    3'018'471 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x6884780
    3'018'455 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x3cd0668
    3'018'391 (8%) [64] 4 org/apache/pdfbox/util/TextPosition 0x47c4e20
    21'938 (0%) [64] 4 org/apache/pdfbox/util/TextPosition 0x2e746f8
    108 (0%) [64] 4 org/apache/pdfbox/util/TextPosition 0x5840580
    108 (0%) [64] 4 org/apache/pdfbox/util/TextPosition 0x52b81d8
    108 (0%) [64] 4 org/apache/pdfbox/util/TextPosition 0x52b8148
    [..]
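
As a rough cross-check of the listing above, the retained sizes of the ten 
large TextPosition entries can be added up and compared with the size of the 
Object[] that backs the ArrayList. This is only a sketch: the class and method 
names are made up for illustration, and the numbers are copied by hand from the 
dump rather than read from it.

```java
public class RetainedSizeCheck {
    // Retained sizes in bytes of the ten large TextPosition entries,
    // copied by hand from the heap dump listing (hypothetical helper class).
    static final long[] LARGE_TEXT_POSITIONS = {
        3_019_039L, 3_018_791L, 3_018_541L, 3_018_527L, 3_018_519L,
        3_018_511L, 3_018_497L, 3_018_471L, 3_018_455L, 3_018_391L
    };

    // Retained size in bytes of the Object[] behind the ArrayList, from the dump.
    static final long BACKING_ARRAY = 30_210_100L;

    // Adds up a list of retained sizes.
    static long sum(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        long total = sum(LARGE_TEXT_POSITIONS);
        System.out.println("ten large entries: " + total + " bytes");
        System.out.println("backing array:     " + BACKING_ARRAY + " bytes");
    }
}
```

The ten entries add up to 30'185'742 bytes, so they account for nearly all of 
the 30'210'100 bytes held by the list. That is consistent with roughly 3MB 
being retained per font.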

> OutOfMemoryError in text extraction tests
> -----------------------------------------
>
>                 Key: PDFBOX-874
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-874
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 1.3.1
>
>
> As reported on dev@:
> TestTextStripper failed:
> testExtract(org.apache.pdfbox.util.TestTextStripper)  Time elapsed: 7.32 sec  <<< ERROR!
> java.lang.OutOfMemoryError: Java heap space
>        at com.ibm.icu.impl.UCharacterNameReader.read(UCharacterNameReader.java:90)
> I can reproduce it by adding a <argLine>-Xmx128m</argLine> option to the 
> surefire plugin configuration in pdfbox/pom.xml. The same problem doesn't 
> occur with 1.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
