[ 
https://issues.apache.org/jira/browse/TIKA-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052332#comment-17052332
 ] 

Tilman Hausherr edited comment on TIKA-3059 at 3/5/20, 4:57 PM:
----------------------------------------------------------------

And yes, it's (partly) a feature: getExtGStateNames() returns the resource names 
(the key set) without checking whether a value is attached to each one 😂

However, the PDFBox code in ImageGraphicsEngine is wrong; it should have a null check.
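
For illustration, here is a rough sketch of what such a check could look like. It is not the actual PDFBox or Tika patch. To keep it runnable without PDFBox on the classpath, Resources and GState are hypothetical stand-ins for PDResources and PDExtendedGraphicsState, names and soft masks are plain strings, and collectSoftMasks() is an invented helper. The only behaviour assumed is the one described in this issue: the name set can contain an entry whose lookup comes back null.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExtGStateNullCheckSketch {

    // Hypothetical stand-in for PDExtendedGraphicsState (assumption, not PDFBox code).
    static final class GState {
        private final String softMask;
        GState(String softMask) { this.softMask = softMask; }
        String getSoftMask() { return softMask; }
    }

    // Hypothetical stand-in for PDResources: the names are just the key set,
    // so a name can be listed even though no value is attached to it.
    static final class Resources {
        private final Map<String, GState> extGStates = new LinkedHashMap<>();
        void put(String name, GState state) { extGStates.put(name, state); }
        Iterable<String> getExtGStateNames() { return extGStates.keySet(); }
        GState getExtGState(String name) { return extGStates.get(name); }
    }

    // Invented helper: same loop shape as in the issue, plus the missing checks.
    static List<String> collectSoftMasks(Resources res) {
        List<String> masks = new ArrayList<>();
        for (String name : res.getExtGStateNames()) {
            GState state = res.getExtGState(name);
            if (state == null) {
                continue; // name is in the key set but has no value: skip it
            }
            String softMask = state.getSoftMask();
            if (softMask != null) {
                masks.add(softMask);
            }
        }
        return masks;
    }

    public static void main(String[] args) {
        Resources res = new Resources();
        res.put("GS0", new GState("mask-0"));
        res.put("GS1", null);             // the case that triggered the NPE
        res.put("GS2", new GState(null)); // state present, but no soft mask
        System.out.println(collectSoftMasks(res)); // prints [mask-0]
    }
}
```

The design point is only that the lookup result is held in a local variable and tested before getSoftMask() is called on it, instead of chaining the two calls.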


was (Author: tilman):
And yes it's a feature, getExtGStateNames() returns the resource names (keyset) 
without checking whether there is a value attached 😂

> New NPE in ImageGraphicsEngine
> ------------------------------
>
>                 Key: TIKA-3059
>                 URL: https://issues.apache.org/jira/browse/TIKA-3059
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: 
> 40f86f17048dbe5402a98b2c2c19161e813f99710137a96d8aca15d0c4603183, Screen Shot 
> 2020-03-05 at 9.40.27 AM.png
>
>
> When we run the new inline image extraction code on some PDFs, we're getting 
> a new NPE.
> {noformat}
> for (COSName name : res.getExtGStateNames()) {
>             PDSoftMask softMask = res.getExtGState(name).getSoftMask();
> {noformat}
> In some cases, res.getExtGStateNames() appears to return COSNames that cannot 
> then be found by res.getExtGState(name).
> We can add a null check to Tika for now.  Is this a bug or feature in PDFBox?



