[ https://issues.apache.org/jira/browse/TIKA-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842432#comment-17842432 ]
ASF GitHub Bot commented on TIKA-4248:
--------------------------------------

tballison merged PR #1738:
URL: https://github.com/apache/tika/pull/1738

> Improve PST handling of attachments
> -----------------------------------
>
>                 Key: TIKA-4248
>                 URL: https://issues.apache.org/jira/browse/TIKA-4248
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> The PST parser doesn't handle attachments in quite the same way as other
> parsers which hinders analysis of attachments.
> The problem is that the PST parser handles the text content of an email and
> the embedded attachments. And, the PST parser processes attachments before
> the main body. These two features make the normal patterns for embedded
> attachments break down in the RecursiveParserWrapper. For example, when the
> attachments are being processed, the RecursiveParserWrapper can't figure out
> what the path will be through the "body" because that hasn't been parsed yet.
> We should probably create a PSTMailItemParser that handles the content and
> the attachments like other parsers so that embedded paths can be maintained.
> This will be a breaking change, and I'm targeting it only to the 3.x branch.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)