[ 
https://issues.apache.org/jira/browse/PDFBOX-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-1743.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.3
         Assignee: Andreas Lehmkühler

I fixed the issue in revision 1530740 based on Tilmans proposal.

Obviously there are different kinds of CFF fonts. Some are supported other not. 
Two of the unsupported (true type only fonts and true type collection fonts) 
can be detected now.

This fix will be part of the next bugfix release.

Thanks for the report and the contribution!

> OutOfMemoryError in fontbox
> ---------------------------
>
>                 Key: PDFBOX-1743
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1743
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox, PDModel
>            Reporter: Stanea Paul
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: document1.pdf
>
>
> When trying to index a pdf document in solr, pdfbox (fontbox) throws 
> java.lang.OutOfMemoryError: Java heap space exception
> This is the stack trace:
> SEVERE: Full Import failed:java.lang.RuntimeException: 
> java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.OutOfMemoryError: Java heap space at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
> at 
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
>  at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448) 
> at 
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
> Caused by: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.OutOfMemoryError: Java heap space at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326) 
> at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
> ... 3 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> java.lang.OutOfMemoryError:
> Java heap space at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:542)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
>  ... 5
> more
> Caused by: java.lang.OutOfMemoryError: Java heap space at 
> org.apache.fontbox.cff.IndexData.initData(IndexData.java:95)
> at org.apache.fontbox.cff.CFFParser.readIndexData(CFFParser.java:152) at 
> org.apache.fontbox.cff.CFFParser.parse(CFFParser.java:103)
> at org.apache.pdfbox.pdmodel.font.PDType1CFont.load(PDType1CFont.java:322) at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:104)
> at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:162) at 
> org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
> at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:203) at 
> org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:604)
> at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54) 
> at 
> org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
> at 
> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
>  at
>  
> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
>  at 
> org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
> at 
> org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:455) 
> at 
> org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:379)
> at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:335) 
> at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:66)
> at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:153) at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242) at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at 
> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:127)
> at 
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
> at
>  
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:472)
>  ... 6 more



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to