Tim Allison created TIKA-4012:
---------------------------------
Summary: Improve extraction of embedded documents in PDFs
Key: TIKA-4012
URL: https://issues.apache.org/jira/browse/TIKA-4012
Project: Tika
Issue Type: New Feature
Reporter: Tim Allison
We currently look for file specification dictionaries in only two places:
the EmbeddedFiles entry in the name tree, and annotations. Unfortunately, a
PDF may embed files in many other places. The newly free PDF 2.0 spec
(ISO 32000-2) makes this abundantly and painfully clear.
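
To make the gap concrete, here is a minimal sketch of one possible approach: walk the whole object graph and collect anything that looks like a file specification, instead of checking only the name tree and annotations. This is not Tika's or PDFBox's implementation. The toy model (a dictionary as a Map, an array as a List), the class FileSpecFinder, and the helpers dict and sampleCatalog are all hypothetical, and the /AF array in the sample is just one assumed example of an "other place".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model: a PDF dictionary is a Map, a PDF array is a List.
final class FileSpecFinder {

    // Assumption: a dictionary counts as a file spec if it has
    // /Type /Filespec or carries an /EF (embedded file) entry.
    static boolean isFileSpec(Map<?, ?> dict) {
        return "Filespec".equals(dict.get("Type")) || dict.containsKey("EF");
    }

    // Collects every file spec dictionary reachable from root.
    static List<Map<?, ?>> findAll(Object root) {
        List<Map<?, ?>> found = new ArrayList<>();
        // Identity-based set: guards against cycles such as /Parent links
        // without hashing the (possibly cyclic) containers themselves.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        walk(root, seen, found);
        return found;
    }

    private static void walk(Object node, Set<Object> seen, List<Map<?, ?>> found) {
        if (node instanceof Map) {
            if (!seen.add(node)) return; // already visited
            Map<?, ?> dict = (Map<?, ?>) node;
            if (isFileSpec(dict)) found.add(dict);
            for (Object value : dict.values()) walk(value, seen, found);
        } else if (node instanceof List) {
            if (!seen.add(node)) return;
            for (Object item : (List<?>) node) walk(item, seen, found);
        }
    }

    // Builds a dictionary from alternating key/value arguments.
    static Map<String, Object> dict(Object... keyValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            d.put((String) keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    // Made-up catalog with file specs in the name tree, in an annotation,
    // and in an assumed third location (an /AF array on the catalog).
    static Map<String, Object> sampleCatalog() {
        Map<String, Object> inNameTree = dict("Type", "Filespec", "F", "a.txt");
        Map<String, Object> inAnnotation = dict("Type", "Filespec", "F", "b.txt");
        Map<String, Object> elsewhere =
                dict("F", "c.txt", "EF", dict("F", "stream-placeholder"));

        Map<String, Object> page = dict(
                "Annots", Arrays.asList(dict("Subtype", "FileAttachment", "FS", inAnnotation)));
        Map<String, Object> pages = dict("Kids", Arrays.asList(page));
        page.put("Parent", pages); // back-pointer, creates a cycle

        return dict(
                "Names", dict("EmbeddedFiles", dict("Names", Arrays.asList("a.txt", inNameTree))),
                "Pages", pages,
                "AF", Arrays.asList(elsewhere));
    }

    public static void main(String[] args) {
        // Prints how many file spec dictionaries the walk found.
        System.out.println(findAll(sampleCatalog()).size());
    }
}
```

A lookup limited to the name tree and annotations would miss the third file spec in this sample. An exhaustive walk of a real PDF could be expensive, so whether it is practical is an open question.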