[ https://issues.apache.org/jira/browse/PDFBOX-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697733#comment-16697733 ]
Tilman Hausherr commented on PDFBOX-4385:
-----------------------------------------

Don't know about macOS, but Adobe Reader likely parses on demand, which we don't.

Wouldn't it be easier if you just took that client's file and overwrote "18446744073430152624" with "2 " (which is a guess), and also told him to complain to the creator of that PDF?

Alternatively, try this: in BaseParser replace the {{throw}} with {{return COSNull.NULL}}.

> IOException "expected number, actual=COSFloat{18446744073430152624}" when loading PDF
> --------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4385
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4385
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.12
>         Environment: Mac OS 10.14.1
>            Reporter: Kasper Schnack
>            Priority: Major
>
> On a PDF document, which opens fine with Adobe Reader and Preview on Mac OS,
> the PDDocument.load() method throws the following:
> java.io.IOException: expected number, actual=COSFloat{18446744073430152624} at offset 33182
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:166)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:862)
> at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:905)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:874)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:794)
> at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:754)
> at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:185)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:220)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1160)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1057)
> Sorry, the material is sensitive so I can't attach it :(
>
> However, if I cat the file it looks like this around the offset:
> 48 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 15 >>
> endobj
> 49 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 16 >>
> endobj
> 50 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 17 >>
> endobj
> 51 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 18 >>
> endobj
> 52 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 18446744073430152624 0 R /K [ 99 0 R 100 0 R ] >>
> endobj
> 99 0 obj
> << /Type /StructElem /S /Span /P 52 0 R /Pg 2 0 R /K 19 >>
> endobj
> 100 0 obj

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
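
The byte-level workaround suggested in the comment (overwrite the bogus object number with "2") could be sketched in Java roughly as below. This is only an illustration, not part of PDFBox: the class name `PatchBadReference` and its methods are hypothetical, it assumes the number appears uncompressed in the file (as the `cat` output in the report suggests), and it patches only the first occurrence. The replacement is padded with spaces to the original length so that no byte offsets in the file shift; the value "2" itself remains the guess made in the comment.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a PDFBox class.
public class PatchBadReference {

    /** Returns the index of the first occurrence of needle in haystack, or -1. */
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /**
     * Returns a copy of pdf in which the first occurrence of badNumber is
     * overwritten by replacement, right-padded with spaces to the same length.
     */
    static byte[] patch(byte[] pdf, String badNumber, String replacement) {
        byte[] bad = badNumber.getBytes(StandardCharsets.ISO_8859_1);
        byte[] good = replacement.getBytes(StandardCharsets.ISO_8859_1);
        if (good.length > bad.length) {
            throw new IllegalArgumentException("replacement is longer than the original");
        }
        int pos = indexOf(pdf, bad);
        if (pos < 0) {
            throw new IllegalArgumentException("number not found in file");
        }
        byte[] out = pdf.clone();
        // Blank out the old digits, then write the replacement at the same offset.
        Arrays.fill(out, pos, pos + bad.length, (byte) ' ');
        System.arraycopy(good, 0, out, pos, good.length);
        return out;
    }

    // Usage (file names are placeholders): java PatchBadReference broken.pdf patched.pdf
    public static void main(String[] args) throws Exception {
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] patched = patch(original, "18446744073430152624", "2");
        Files.write(Paths.get(args[1]), patched);
    }
}
```

The patched copy can then be passed to PDDocument.load() as usual. Whether page object 2 is really the intended /Pg target cannot be determined from the excerpt, so the creator of the PDF should still be asked to fix their output.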