[ https://issues.apache.org/jira/browse/TIKA-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131249#comment-16131249 ]
Hudson commented on TIKA-2374:
------------------------------

ABORTED: Integrated in Jenkins build Tika-trunk #1344 (See [https://builds.apache.org/job/Tika-trunk/1344/])
TIKA-2374 and TIKA-2434 - roll back extracting inline images for pdfs in (tallison: [https://github.com/apache/tika/commit/10baddcc15501c196dccf956463e607d9973c403])
* (edit) tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java


> Tika App -z should extract PDF inline images by default
> -------------------------------------------------------
>
>                 Key: TIKA-2374
>                 URL: https://issues.apache.org/jira/browse/TIKA-2374
>             Project: Tika
>          Issue Type: Improvement
>          Components: cli
>    Affects Versions: 1.14
>            Reporter: Nick Burch
>             Fix For: 1.16
>
>
> As discussed on dev@ - If you use the Tika App with the default config and
> the {{-z}} extract option, it will extract embedded resources, except PDF
> inline images. This is unexpected for new users, who won't know that they'd
> need to pass in a custom config with the {{extractInlineImages}} PDF parser
> option set
> If the user passes in an explicit config to the app, we should respect that.
> However, if they don't pass one in and take the default, the -z option should
> (but only that one) enable whatever options are needed to make extraction
> work properly + fully (currently just {{extractInlineImages}})
> If possible/easy, the -z option should print out some info to let affected
> users know that the default config was tweaked to give extra embedded
> resources



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
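
For readers following the issue, the defaulting rule requested in the description can be sketched roughly as below. This is only an illustration of the rule as worded in the issue, not the code in TikaCLI.java (the commit referenced above rolls the behaviour back). The class InlineImageDefaulting, the method decide, its parameters and the notice text are all hypothetical names invented for this sketch, and it deliberately avoids calling any Tika API.

```java
import java.util.Optional;

// Hypothetical sketch: these names are illustrative and are not taken from TikaCLI.
final class InlineImageDefaulting {

    /** Effective extractInlineImages setting plus an optional notice for the user. */
    static final class Decision {
        final boolean extractInlineImages;
        final Optional<String> notice;

        Decision(boolean extractInlineImages, Optional<String> notice) {
            this.extractInlineImages = extractInlineImages;
            this.notice = notice;
        }
    }

    /**
     * @param extractMode        true if the -z extract option was given
     * @param userSuppliedConfig true if the user passed an explicit config
     * @param configuredValue    extractInlineImages as set in the config in effect
     */
    static Decision decide(boolean extractMode, boolean userSuppliedConfig, boolean configuredValue) {
        // An explicit config is always respected, whatever it says.
        if (userSuppliedConfig) {
            return new Decision(configuredValue, Optional.empty());
        }
        // Default config plus -z: switch inline image extraction on and tell the user.
        if (extractMode && !configuredValue) {
            return new Decision(true, Optional.of(
                    "Default config adjusted for -z: extractInlineImages enabled"));
        }
        // Any other case: leave the configured value alone.
        return new Decision(configuredValue, Optional.empty());
    }

    public static void main(String[] args) {
        // Assumed scenario: -z given, no explicit config, option off in the default config.
        Decision d = decide(true, false, false);
        d.notice.ifPresent(System.err::println);
        System.out.println("extractInlineImages=" + d.extractInlineImages);
    }
}
```

Keeping the decision in one small function makes the "but only that one" condition from the issue easy to check: the override happens only when -z is present and no explicit config was supplied.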