[ 
https://issues.apache.org/jira/browse/PDFBOX-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452233#comment-13452233
 ] 

Andreas Lehmkühler commented on PDFBOX-1407:
--------------------------------------------

Lau sent me the PDF in question by private mail, and I can confirm that my 
first guess was correct: using the non-sequential parser solves the issue.

@Lau
You should ask the Tika developers to add some sort of switch so that users 
can choose which parser is used.
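
To illustrate what such a switch might look like, here is a minimal sketch. 
It is not Tika's API: the class name, the mode names and the "pdf.parser" 
property are all made up for illustration, and it assumes that 
PDDocument.loadNonSeq is the non-sequential counterpart of the 
PDDocument.load call shown in the stack trace. It only decides which entry 
point to use and does not touch PDFBox itself.

```java
import java.util.Locale;

/** Sketch of a parser switch; every name here is hypothetical, not Tika's API. */
class ParserSwitch {

    /** The two parsing strategies discussed in this issue. */
    enum Mode { SEQUENTIAL, NON_SEQUENTIAL }

    /** Maps a configuration value to a mode; anything unrecognised keeps the default. */
    static Mode modeFrom(String value) {
        if (value == null) {
            return Mode.SEQUENTIAL;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "nonseq":
            case "non-sequential":
                return Mode.NON_SEQUENTIAL;
            default:
                return Mode.SEQUENTIAL;
        }
    }

    /** Names the PDFBox entry point a caller would use for the given mode. */
    static String entryPointFor(Mode mode) {
        // Assumption: PDDocument.loadNonSeq is the non-sequential counterpart
        // of the PDDocument.load call seen in the stack trace.
        return mode == Mode.NON_SEQUENTIAL ? "PDDocument.loadNonSeq" : "PDDocument.load";
    }

    public static void main(String[] args) {
        // "pdf.parser" is a made-up property name used only for this sketch.
        Mode mode = modeFrom(System.getProperty("pdf.parser"));
        System.out.println(mode + " -> " + entryPointFor(mode));
    }
}
```

Keeping the sequential parser as the default means existing callers see no 
change unless they opt in, which seems the safer choice while the 
non-sequential parser is still new.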
                
> ClassCastException: COSObject cannot be cast to COSName
> -------------------------------------------------------
>
>                 Key: PDFBOX-1407
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1407
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.7.0
>            Reporter: Lau Brino
>
> Parsing PDF file
> java.lang.ClassCastException: org.apache.pdfbox.cos.COSObject cannot be cast to org.apache.pdfbox.cos.COSName
>         at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:264)
>         at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:571)
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:225)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1090)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1055)
>         at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:110)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)

