Tim Allison created TIKA-3962:
---------------------------------

             Summary: Set RFC822 parser to noRecurse
                 Key: TIKA-3962
                 URL: https://issues.apache.org/jira/browse/TIKA-3962
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison

On our test file {{testGroupWiseEml.eml}}, there's an embedded rfc822 attachment that is currently not treated as an attachment but is inlined. The relevant section of the test file is:

{noformat}
Content-Type: message/rfc822
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="test.eml"
{noformat}

When I open the email in several email clients, it shows this {{test.eml}} correctly as an attachment.

It turns out there's a setting on mime4j's parser "setNoRecurse" that yields the correct behavior on this test file. Given that Tika handles files recursively already by default, I _think_ we should be safe to set no recurse in the mime4j parser and rely on Tika's own recursive parsing.
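For illustration only, here is a minimal sketch of how the three headers quoted above mark the part as an attached message rather than inline content. The {{PartHeaders}} class and its methods are hypothetical names invented for this example; they are not part of Tika or mime4j, and the sketch ignores folded headers and RFC 2231 parameter encoding.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not Tika or mime4j code.
final class PartHeaders {

    // Parse "Name: value" lines into a map keyed by lower-cased header name.
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip blank or malformed lines (folded headers are not handled)
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }

    // True when the part is an embedded message explicitly marked as an attachment.
    static boolean isAttachedMessage(Map<String, String> headers) {
        String type = headers.getOrDefault("content-type", "").toLowerCase(Locale.ROOT);
        String disposition =
                headers.getOrDefault("content-disposition", "").toLowerCase(Locale.ROOT);
        return type.startsWith("message/rfc822") && disposition.startsWith("attachment");
    }

    // Extract the filename parameter from Content-Disposition, or null if absent.
    static String filename(Map<String, String> headers) {
        String disposition = headers.getOrDefault("content-disposition", "");
        for (String param : disposition.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("filename=")) {
                String v = p.substring("filename=".length()).trim();
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1); // strip surrounding quotes
                }
                return v;
            }
        }
        return null;
    }
}
```

Fed the header block from the test file, {{isAttachedMessage}} returns true and {{filename}} returns {{test.eml}}, which matches what the email clients display. Whether the mime4j parser surfaces the part this way is what the no-recurse setting controls, so this sketch only shows the header-level signal, not the parser behavior.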