[ https://issues.apache.org/jira/browse/PDFBOX-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882282#comment-17882282 ]

ASF subversion and git services commented on PDFBOX-5879:
---------------------------------------------------------

Commit 1920728 from Tilman Hausherr in branch 'pdfbox/branches/3.0'
[ https://svn.apache.org/r1920728 ]

PDFBOX-5879: avoid ClassCastException

> Regression from PDFBOX-5841: Text extraction with rotation magic fails for PDF with multiple content streams in a page
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5879
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5879
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 3.0.3 PDFBox
>            Reporter: Gábor Stefanik
>            Priority: Major
>         Attachments: MVM_Aram_augusztus.pdf
>
>
> {code:java}
> java -jar pdfbox-app-3.0.3.jar export:text -console -rotationMagic -i="MVM_Aram_augusztus.pdf"
> {code}
> fails with the following error:
> {code:java}
> java.lang.ClassCastException: class org.apache.pdfbox.cos.COSObject cannot be cast to class org.apache.pdfbox.cos.COSArray (org.apache.pdfbox.cos.COSObject and org.apache.pdfbox.cos.COSArray are in unnamed module of loader 'app')
>         at org.apache.pdfbox.tools.ExtractText.extractPages(ExtractText.java:336)
>         at org.apache.pdfbox.tools.ExtractText.call(ExtractText.java:225)
>         at org.apache.pdfbox.tools.ExtractText.call(ExtractText.java:62)
>         at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
>         at picocli.CommandLine.access$1500(CommandLine.java:148)
>         at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
>         at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
>         at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
>         at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
>         at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
>         at picocli.CommandLine.execute(CommandLine.java:2174)
>         at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:76)
> {code}
> The same command succeeds in 3.0.2.
> The triggering PDF can be downloaded from 
> [https://nagykorosiallatmentok.hu/wp-content/uploads/2023/09/MVM_Aram_augusztus.pdf]
> and is also attached.
> The root cause appears to be this change: 
> [https://github.com/apache/pdfbox/commit/b03d12d56dd74e5c52d80cf0b80c5bfb1f3209b2]
> from PDFBOX-5841.
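
For readers unfamiliar with this failure mode, here is a minimal sketch of why a blind cast fails when a value is stored as an indirect reference, and how resolving the reference first avoids it. It is not the code from commit r1920728. The classes Base, Stream, Arr and Ref are hypothetical stand-ins that only mimic the relationship between COSBase, COSStream, COSArray and COSObject (an indirect reference that wraps the real object), and the helper names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ContentsCastSketch {
    // Hypothetical stand-ins, NOT the real PDFBox classes.
    static class Base {}                     // plays the role of COSBase
    static class Stream extends Base {}      // plays the role of COSStream
    static class Arr extends Base {          // plays the role of COSArray
        final List<Base> items = new ArrayList<>();
    }
    static class Ref extends Base {          // plays the role of COSObject
        private final Base target;
        Ref(Base target) { this.target = target; }
        Base resolve() { return target; }
    }

    // Follow indirect references until a direct object (or null) is reached.
    static Base dereference(Base value) {
        while (value instanceof Ref) {
            value = ((Ref) value).resolve();
        }
        return value;
    }

    // Count content streams without assuming the entry is a direct array.
    static int countContentStreams(Base contents) {
        Base resolved = dereference(contents);
        if (resolved instanceof Arr) {
            return ((Arr) resolved).items.size();
        }
        return resolved instanceof Stream ? 1 : 0;
    }

    public static void main(String[] args) {
        Arr array = new Arr();
        array.items.add(new Ref(new Stream()));
        array.items.add(new Ref(new Stream()));
        Base contents = new Ref(array);      // array stored behind a reference

        try {
            Arr unsafe = (Arr) contents;     // compiles, but fails at runtime
            System.out.println(unsafe.items.size());
        } catch (ClassCastException e) {
            System.out.println("blind cast failed");
        }
        System.out.println(countContentStreams(contents));
    }
}
```

The type check before the cast is the point of the sketch: a page's content entry may be a single stream, an array of streams, or an indirect reference to either, so code that handles it should resolve references and test the type rather than assume one shape.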



