Tim Allison created TIKA-2563:
---------------------------------

    Summary: Extract embedded files in HTML
        Key: TIKA-2563
        URL: https://issues.apache.org/jira/browse/TIKA-2563
    Project: Tika
 Issue Type: Improvement
   Reporter: Tim Allison

Files (especially images) can be base64-encoded inside HTML files. We should extract those like any other embedded file.
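
To make the request concrete, here is a minimal sketch of the idea in Python. It is not Tika's implementation or API. It assumes the embedded files appear as RFC 2397 `data:` URIs in `src` or `href` attributes, which is the usual way images are inlined in HTML. The names `decode_data_uri`, `EmbeddedDataCollector`, and `extract_embedded` are hypothetical and exist only for this illustration.

```python
import base64
from html.parser import HTMLParser
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri):
    """Return (mime_type, payload_bytes) for a data: URI, or None otherwise."""
    if not uri.lower().startswith("data:"):
        return None
    # Layout per RFC 2397: data:[<mediatype>][;base64],<data>
    header, comma, payload = uri[len("data:"):].partition(",")
    if not comma:
        return None  # no comma means there is no data section
    parts = header.split(";")
    mime = parts[0].strip() or "text/plain"  # RFC 2397 default media type
    is_base64 = any(p.strip().lower() == "base64" for p in parts[1:])
    try:
        if is_base64:
            # Drop whitespace that may have been added by line wrapping.
            data = base64.b64decode("".join(payload.split()))
        else:
            data = unquote_to_bytes(payload)  # percent-encoded payload
    except ValueError:
        return None  # malformed base64: skip rather than fail the parse
    return mime, data


class EmbeddedDataCollector(HTMLParser):
    """Collects decoded data: URIs found in src/href attributes."""

    def __init__(self):
        super().__init__()
        self.embedded = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only src and href are inspected in this sketch.
        for name, value in attrs:
            if name in ("src", "href") and value:
                decoded = decode_data_uri(value.strip())
                if decoded is not None:
                    self.embedded.append(decoded)


def extract_embedded(html):
    """Return a list of (mime_type, bytes) for every inline file found."""
    collector = EmbeddedDataCollector()
    collector.feed(html)
    collector.close()
    return collector.embedded


if __name__ == "__main__":
    page = '<p><img src="data:image/png;base64,aGVsbG8="></p>'
    for mime, data in extract_embedded(page):
        print(mime, len(data), "bytes")
```

In a real parser each decoded payload would then be passed to the same embedded-document handling used for attachments in other formats, so that type detection and recursive parsing apply to it as well. This sketch stops at decoding and ignores other places a `data:` URI can occur, such as CSS `url(...)` values.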