[ https://issues.apache.org/jira/browse/PDFBOX-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642972#comment-13642972 ]
Maruan Sahyoun commented on PDFBOX-1407:
----------------------------------------
The original exception occurred because the 'classic' parser invoked by
PDDocument.load() parses a PDF sequentially from top to bottom. As a result,
it may encounter references to PDF objects that are no longer valid but are
still present in the PDF file. The 'non-sequential parser' invoked by
PDDocument.loadNonSeq() parses a PDF in line with the PDF specification: it
uses the xref entries to determine which PDF objects are valid.
At the time of this writing both parsers coexist because some applications
depend on the 'classic' parser. This might change in the next major release.
In addition, PDFBOX-1560 addresses the infrastructure for the PDFBox website,
which will then form the basis for enhancing the documentation.
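
To illustrate the difference between the two parsing strategies, here is a
minimal, self-contained sketch. It is a toy model and not PDFBox code: the
class XrefVersusSequentialSketch, the StoredObject type, the offsets and the
object bodies are all hypothetical and exist only for illustration. The sample
"file" contains an outdated revision of object 2 that was superseded by a
later revision, which is roughly the situation that can mislead a
top-to-bottom scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are part of PDFBox.
public class XrefVersusSequentialSketch {

    /** Hypothetical stand-in for an indirect object stored at a byte offset. */
    static final class StoredObject {
        final int objectNumber;
        final long offset;
        final String body;

        StoredObject(int objectNumber, long offset, String body) {
            this.objectNumber = objectNumber;
            this.offset = offset;
            this.body = body;
        }
    }

    /** A made-up file: object 2 was rewritten, the old copy is still present. */
    static List<StoredObject> sampleFile() {
        return Arrays.asList(
                new StoredObject(1, 15L, "catalog"),
                new StoredObject(2, 80L, "page (old revision)"),
                new StoredObject(2, 300L, "page (new revision)"));
    }

    /** A made-up xref table: object number mapped to the offset that is valid. */
    static Map<Integer, Long> sampleXref() {
        Map<Integer, Long> xref = new LinkedHashMap<>();
        xref.put(1, 15L);
        xref.put(2, 300L);
        return xref;
    }

    /** Sequential strategy: collects every object met while scanning top to bottom. */
    static List<String> scanSequentially(List<StoredObject> file) {
        List<String> found = new ArrayList<>();
        for (StoredObject o : file) {
            found.add(o.body);
        }
        return found;
    }

    /** Xref strategy: keeps only objects located at the offset the xref points to. */
    static List<String> readViaXref(List<StoredObject> file, Map<Integer, Long> xref) {
        List<String> found = new ArrayList<>();
        for (StoredObject o : file) {
            Long validOffset = xref.get(o.objectNumber);
            if (validOffset != null && validOffset.longValue() == o.offset) {
                found.add(o.body);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<StoredObject> file = sampleFile();
        System.out.println("sequential: " + scanSequentially(file));
        System.out.println("via xref:   " + readViaXref(file, sampleXref()));
    }
}
```

In this sketch the sequential scan returns all three entries, including the
stale revision of object 2, while the xref-based read returns only the two
entries the xref table marks as valid.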
> ClassCastException: COSObject cannot be cast to COSName
> -------------------------------------------------------
>
> Key: PDFBOX-1407
> URL: https://issues.apache.org/jira/browse/PDFBOX-1407
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.7.0
> Reporter: Lau Brino
> Assignee: Andreas Lehmkühler
>
> Parsing PDF file
> java.lang.ClassCastException: org.apache.pdfbox.cos.COSObject cannot be cast to org.apache.pdfbox.cos.COSName
> at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:264)
> at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:571)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:225)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1090)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1055)
> at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:110)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)