[ https://issues.apache.org/jira/browse/PDFBOX-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144870#comment-14144870 ]
Daniel Scheibe commented on PDFBOX-2350:
----------------------------------------

I think I have narrowed it down, or at least found what potentially goes wrong. Inside the {code}org.apache.pdfbox.pdmodel.font.PDType1Font{code} constructor

{code}
public PDType1Font(COSDictionary fontDictionary)
{code}

the try/catch block consisting of

{code:java}
COSStream stream = fontFile.getStream();
int length1 = stream.getInt(COSName.LENGTH1);
int length2 = stream.getInt(COSName.LENGTH2);

// the PFB embedded as two segments back-to-back
byte[] bytes = fontFile.getByteArray();
byte[] segment1 = Arrays.copyOfRange(bytes, 0, length1);
byte[] segment2 = Arrays.copyOfRange(bytes, length1, length1 + length2);

t1 = Type1Font.createWithSegments(segment1, segment2);
{code}

either gets an incorrect value for length1 in my case or, more likely, takes length1 into account incorrectly. I guess the following should be correct instead:

{code}
byte[] segment1 = Arrays.copyOfRange(bytes, 0, length1 - 1);
byte[] segment2 = Arrays.copyOfRange(bytes, length1 - 1, length1 + length2);
{code}

I mean, isn't the range 0 .. length1 one byte too much? While debugging and dumping the arrays I noticed that the byte array segment2 is always missing its first byte. When I changed the code as shown above, it worked fine: all embedded fonts are rendered correctly and the image looks right.

I have no clue whether my change makes sense, or whether it has a larger impact and breaks something else, but it might help you come up with a suitable fix.
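To look at the off-by-one question in isolation, here is a minimal standalone sketch. It is not PDFBox code: the class `SegmentSplitDemo`, the method `split` and the byte values are made up for illustration. It relies only on the documented behaviour of `Arrays.copyOfRange(original, from, to)`, where `from` is inclusive and `to` is exclusive, so a range of 0 .. length1 copies exactly length1 bytes. If that reading is right, the symptom above (segment2 missing its first byte) would point to a Length1 value in this particular PDF that is one too large, rather than to the range arithmetic itself, though that is only a guess from the outside.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of PDFBox.
class SegmentSplitDemo {

    // Splits data into [0, length1) and [length1, length1 + length2),
    // mirroring the two copyOfRange calls quoted in the comment above.
    static byte[][] split(byte[] data, int length1, int length2) {
        byte[] segment1 = Arrays.copyOfRange(data, 0, length1);
        byte[] segment2 = Arrays.copyOfRange(data, length1, length1 + length2);
        return new byte[][] { segment1, segment2 };
    }

    public static void main(String[] args) {
        // Made-up data: 3 bytes of "segment 1" followed by 2 bytes of "segment 2".
        byte[] data = { 10, 11, 12, 20, 21 };

        // Correct length1: the upper bound is exclusive, so nothing is lost.
        byte[][] ok = split(data, 3, 2);
        System.out.println(Arrays.toString(ok[0]) + " " + Arrays.toString(ok[1]));
        // prints "[10, 11, 12] [20, 21]"

        // length1 one too large: segment2 loses its first byte, and
        // copyOfRange pads the end with zeros because 'to' exceeds the array length.
        byte[][] off = split(data, 4, 2);
        System.out.println(Arrays.toString(off[0]) + " " + Arrays.toString(off[1]));
        // prints "[10, 11, 12, 20] [21, 0]"
    }
}
```

Note that the second case does not throw: `copyOfRange` silently zero-pads when `to` runs past the end of the source array, so a wrong length can go unnoticed until the parser sees the data.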
> Type1 Parser hangs indefinitely
> -------------------------------
>
>                 Key: PDFBOX-2350
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2350
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.0
>         Environment: Windows 7, JDK 1.7.0_51-b13
>            Reporter: Daniel Scheibe
>         Attachments: PDFBOX-2350-289451-endless.pdf
>
>
> When rendering the first page of my PDF document, the Type1Parser
> (org.apache.fontbox.type1.Type1Parser) hangs in a loop in
> {{parseBinary(byte[] bytes) throws IOException}}
> and "kills" our rendering pipeline. Please find the loop that hangs below:
>     // find /Private dict
>     while (!lexer.peekToken().getText().equals("Private"))
>     {
>         lexer.nextToken();
>     }
> There is never a token named "Private" in the list of returned tokens
> (they are empty all the time).
> Furthermore, going deeper into the source code, it seems the class reading the
> tokens (Type1Lexer) never advances the buffer position and always
> returns an empty name token in the readToken(Token prevToken) method.
> Looking at the decrypted buffer, I cannot get anything useful out of it based
> on my current understanding.
> Unfortunately I cannot provide the PDF in question, as it contains confidential
> data.
> Acrobat Reader XI Version 11.0.08 renders the document just fine.
> In addition, it seems the PDF was encrypted (40-bit RC4) with an empty
> password, and it reports PDF version 1.5.
> Does this provide enough information, or can I do anything else to help
> nail this one down?
> I guess this might be a PDF document structure/feature that is not yet
> completely supported, but PDFBox should at least throw an exception instead of
> failing "silently"...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)