[ 
https://issues.apache.org/jira/browse/PDFBOX-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159299#comment-13159299
 ] 

Timo Boehme commented on PDFBOX-1174:
-------------------------------------

This is a 'normal' problem with the current serial PDF parser. After parsing 
an object, it expects the start of the next one (it reads the object number). 
However, a large number of PDFs in the wild contain some garbage in between. 
For a conforming parser using the XREF table this is not a problem, since it 
only parses the content the XREF table refers to.

The current short-term solution is to specify 'forceParsing=true' in 
PDDocument.load( FILENAME, forceParsing ). If an error like the reported one 
occurs, the parser will then try to find the start of the next object.
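
To illustrate the difference between strict serial parsing and the 
forceParsing behaviour, here is a minimal, self-contained sketch. It is NOT 
the PDFBox implementation: the class ResyncSketch, the method 
parseObjectNumbers and the simplified "N G obj ... endobj" matching are 
hypothetical and exist only for this example. It assumes that garbage between 
objects never itself looks like an object header.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of a serial parser; hypothetical, not PDFBox code.
class ResyncSketch {

    // Simplified object header: object number, generation number, "obj".
    private static final Pattern OBJ_START =
            Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj\\b");

    // Returns the object numbers found while walking the data serially.
    static List<Integer> parseObjectNumbers(String data, boolean forceParsing)
            throws IOException {
        List<Integer> found = new ArrayList<>();
        Matcher m = OBJ_START.matcher(data);
        int pos = 0;
        while (true) {
            // Skip whitespace between objects.
            while (pos < data.length()
                    && Character.isWhitespace(data.charAt(pos))) {
                pos++;
            }
            if (pos >= data.length()) {
                break;
            }
            m.region(pos, data.length());
            if (m.lookingAt()) {
                // An object starts exactly where the parser expects it.
                found.add(Integer.parseInt(m.group(1)));
                int end = data.indexOf("endobj", m.end());
                if (end < 0) {
                    break; // truncated object, stop here
                }
                pos = end + "endobj".length();
            } else if (!forceParsing) {
                // Strict mode: garbage where an object number should be.
                int tokenEnd = pos;
                while (tokenEnd < data.length()
                        && !Character.isWhitespace(data.charAt(tokenEnd))) {
                    tokenEnd++;
                }
                throw new IOException("Error: Expected an integer type, actual='"
                        + data.substring(pos, tokenEnd) + "'");
            } else if (m.find(pos)) {
                // Lenient mode: jump to the next thing that looks like
                // an object header and continue from there.
                pos = m.start();
            } else {
                break; // no further objects
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String data = "1 0 obj\n<< >>\nendobj\n"
                + "Fatal error\n"
                + "2 0 obj\n<< >>\nendobj\n";
        System.out.println(parseObjectNumbers(data, true));
        try {
            parseObjectNumbers(data, false);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the sketch, strict mode fails on the stray 'Fatal' token in the same way 
as the reported stack trace, while the lenient mode skips ahead to the next 
object header. The real workaround remains the PDDocument.load( FILENAME, 
forceParsing ) call mentioned above.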

The long-term solution is a conforming parser (PDFBOX-1000) or a nearly 
conforming parser (PDFBOX-1104). I have reworked the initial code of 
PDFBOX-1104 so that it is now a valid replacement for the current parser. I 
will post it to PDFBOX-1104 shortly.
                
> i have problem in  BaseParser.readInt
> -------------------------------------
>
>                 Key: PDFBOX-1174
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1174
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, PDModel
>    Affects Versions: 1.6.0
>            Reporter: ahmad makram
>             Fix For: 1.6.0
>
>
> i can't load PDF to PDDocument.load( )
> it give me this exception
> java.io.IOException: Error: Expected an integer type, actual='Fatal'
>       at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1384)
>       at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:517)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1036)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1007)


        
