[jira] [Commented] (PDFBOX-2607) Failed reading embedded Font

John Hewson (JIRA) Mon, 19 Jan 2015 12:18:59 -0800

    [ https://issues.apache.org/jira/browse/PDFBOX-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282952#comment-14282952 ]


John Hewson commented on PDFBOX-2607:
-------------------------------------

This file is not a valid PDF: instead of embedding only the relevant 
sections of the PFB file, the entire PFB file (headers and all) has been 
embedded. Acrobat handles this, so we will too, by detecting the header and 
unpacking the PFB before processing.
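
For illustration, the detect-and-unpack step could look roughly like the 
sketch below. It is not the actual FontBox change: the class and method names 
are hypothetical, and it assumes the standard PFB layout, where each segment 
starts with the byte 0x80, a type byte (1 = ASCII, 2 = binary, 3 = end of 
file) and a 4-byte little-endian payload length. It throws an unchecked 
exception only to keep the example short.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper for illustration only; not part of PDFBox or FontBox. */
final class PfbUnpackSketch {
    private static final int MARKER = 0x80; // first byte of every PFB segment header
    private static final int ASCII = 1;     // cleartext segment
    private static final int BINARY = 2;    // eexec-encrypted segment
    private static final int EOF = 3;       // end-of-file record, carries no payload

    private PfbUnpackSketch() {
    }

    /** Returns true if the stream starts with a PFB ASCII segment header. */
    public static boolean looksLikePfb(byte[] data) {
        return data.length >= 6
                && (data[0] & 0xFF) == MARKER
                && (data[1] & 0xFF) == ASCII;
    }

    /** Strips the 6-byte segment headers and concatenates the payloads. */
    public static byte[] unpack(byte[] pfb) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < pfb.length) {
            if ((pfb[pos] & 0xFF) != MARKER || pos + 1 >= pfb.length) {
                throw new IllegalArgumentException("No PFB segment marker at offset " + pos);
            }
            int type = pfb[pos + 1] & 0xFF;
            if (type == EOF) {
                break;
            }
            if (type != ASCII && type != BINARY) {
                throw new IllegalArgumentException("Unknown PFB segment type " + type);
            }
            if (pos + 6 > pfb.length) {
                throw new IllegalArgumentException("Truncated PFB segment header");
            }
            // The payload length is stored as a 4-byte little-endian integer.
            int length = (pfb[pos + 2] & 0xFF)
                    | ((pfb[pos + 3] & 0xFF) << 8)
                    | ((pfb[pos + 4] & 0xFF) << 16)
                    | ((pfb[pos + 5] & 0xFF) << 24);
            int start = pos + 6;
            if (length < 0 || length > pfb.length - start) {
                throw new IllegalArgumentException("PFB segment length out of range");
            }
            out.write(pfb, start, length);
            pos = start + length;
        }
        return out.toByteArray();
    }
}
```

A caller would check looksLikePfb() on the embedded font stream and, if it 
returns true, pass the result of unpack() to the Type 1 parser instead of the 
raw stream. The segment boundaries would then have to come from the PFB 
segment lengths, since the Length1/Length2 values in the PDF cannot be 
trusted for such a stream.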

> Failed reading embedded Font
> ----------------------------
>
>                 Key: PDFBOX-2607
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2607
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.0
>            Reporter: Holger Floerke
>            Assignee: John Hewson
>         Attachments: 0023-4834_t1_1.pdf
>
>
> Hi,
> I am trying to extract an image from the attached PDF. PDF viewers such as 
> "Acrobat Reader" or the Ubuntu "Document Viewer" display the PDF 
> correctly, but PDFBox throws an exception:
> {code}
> SCHWERWIEGEND: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
> java.io.IOException: Invalid start of ASCII segment
>       at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
>       at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
>       at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
>       at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
>       at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
>       at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
>       at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
>       at org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
>       at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
>       at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
>       at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
>       at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
> {code}
> Checked with the latest version from git.
> {code}
> java -jar pdfbox-app-2.0.0-SNAPSHOT.jar ExtractImages /home/hf/Downloads/0023-4834_t1_1.pdf
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
