Re: Tika App, Extract (-z) and Inline PDF Images?

2017-05-22 Thread Nick Burch
On Thu, 18 May 2017, Timothy Allison wrote:
> I think this would be ok if we added a warning that -z is different and a pointer to changing the config?

Works for me. I've raised https://issues.apache.org/jira/browse/TIKA-2374 for us to track/implement once 1.15 is out of the way.

Nick

On 2017

Re: Tika App, Extract (-z) and Inline PDF Images?

2017-05-18 Thread Timothy Allison
I think this would be ok if we added a warning that -z is different and a pointer to changing the config?

On 2017-05-18 17:02 (-0400), Nick Burch wrote:
> Hi All
>
> I've just been caught out by the Tika App's -z on a PDF not extracting the
> embedded images. I think we probably shouldn't

Tika App, Extract (-z) and Inline PDF Images?

2017-05-18 Thread Nick Burch
Hi All

I've just been caught out by the Tika App's -z on a PDF not extracting the embedded images. I think we probably shouldn't tweak the default config for the other Tika App modes, but what about extract? Any reason why we shouldn't turn on the PDF Parser option "extractInlineImages" when -
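
For readers landing on this thread: the "changing the config" route mentioned above might look roughly like the sketch below. It assumes a Tika version whose PDFParser accepts parameters from a tika-config.xml file; the file name and the exact layout are illustrative, not something stated in the thread, so check them against the documentation for your release.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes PDFParser can be configured through <params>
     in this Tika version. -->
<properties>
  <parsers>
    <!-- Keep the default parsers for everything except PDF -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <parser-exclude class="org.apache.tika.parser.pdf.PDFParser"/>
    </parser>
    <!-- Configure the PDF parser to extract inline images -->
    <parser class="org.apache.tika.parser.pdf.PDFParser">
      <params>
        <param name="extractInlineImages" type="bool">true</param>
      </params>
    </parser>
  </parsers>
</properties>
```

With a file like this saved as tika-config.xml (a placeholder name), the app could then be run along the lines of `java -jar tika-app.jar --config=tika-config.xml -z input.pdf`, where the jar and input names are also placeholders.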