[ https://issues.apache.org/jira/browse/TIKA-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052332#comment-17052332 ]
Tilman Hausherr edited comment on TIKA-3059 at 3/5/20, 5:03 PM:
----------------------------------------------------------------

And yes it's (partly) a feature, getExtGStateNames() returns the resource names (keyset) without checking whether there is a value attached 😂

However the PDFBox code in ImageGraphicsEngine is wrong, it should have a null check, I'll fix that later.


was (Author: tilman):
And yes it's (partly) a feature, getExtGStateNames() returns the resource names (keyset) without checking whether there is a value attached 😂

However the PDFBox code in ImageGraphicsEngine is wrong, it should have a check.

> New NPE in ImageGraphicsEngine
> ------------------------------
>
>                 Key: TIKA-3059
>                 URL: https://issues.apache.org/jira/browse/TIKA-3059
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: 40f86f17048dbe5402a98b2c2c19161e813f99710137a96d8aca15d0c4603183, Screen Shot 2020-03-05 at 9.40.27 AM.png
>
>
> When we run the new inline image extraction code on some PDFs, we're getting a new NPE.
> {noformat}
> for (COSName name : res.getExtGStateNames()) {
>     PDSoftMask softMask = res.getExtGState(name).getSoftMask();
> {noformat}
> In some cases, res.getExtGStateNames() appears to return COSNames that cannot then be found by res.getExtGState(name).
> We can add a null check to Tika for now. Is this a bug or feature in PDFBox?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
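
For illustration, here is a minimal, self-contained sketch of the pattern discussed in this thread. It uses plain Java collections rather than PDFBox itself: the class name ExtGStateNullCheckSketch, the map standing in for the ExtGState resource dictionary, and the String values standing in for soft masks are all hypothetical. Only the idea comes from the thread: the key set can contain names that have no value attached, so the lookup result needs a null check before it is dereferenced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExtGStateNullCheckSketch {

    // Hypothetical stand-in for the ExtGState resource dictionary:
    // a name can be present in the key set while its value is null.
    static Map<String, String> sampleResources() {
        Map<String, String> res = new LinkedHashMap<>();
        res.put("GS0", "softMask0");
        res.put("GS1", null); // name exists, but nothing usable is attached
        res.put("GS2", "softMask2");
        return res;
    }

    // Iterates over the names and skips entries whose lookup yields null,
    // instead of dereferencing the result unconditionally.
    static List<String> collectSoftMasks(Map<String, String> res) {
        List<String> found = new ArrayList<>();
        for (String name : res.keySet()) {
            String state = res.get(name);
            if (state == null) {
                continue; // the null check discussed in the thread
            }
            found.add(state);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(collectSoftMasks(sampleResources()));
    }
}
```

Without the null check, the entry for the name "GS1" would be dereferenced as null, which is the same shape of failure as the NPE reported in the issue.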