Tim Allison created TIKA-4116:
---------------------------------
Summary: Duplicate macros extracted from some embedded OLE2 containers
Key: TIKA-4116
URL: https://issues.apache.org/jira/browse/TIKA-4116
Project: Tika
Issue Type: Bug
Reporter: Tim Allison
In some OLE2 containers with embedded objects, we're calling extractMacros,
potentially several times, on the same POIFSFileSystem.
An example file is here:
https://corpora.tika.apache.org/base/docs/govdocs1/527/527356.doc
The embedded {{_1152432709.xls}} has several attachments, including
{{MBD000000B4.unknown}} and {{MBD0049C388.unknown}}. Each time we parse one of
these embedded files, we're calling extractMacros on the same file system,
{{root.getFileSystem()}}, which takes us back to {{_1152432709.xls}}, so the
same macros are extracted again.
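One possible way to avoid the repeated calls is to remember which file system instances have already been processed during a parse and skip the ones seen before. The sketch below is only an illustration of that idea, not the actual fix: {{MacroExtractionGuard}} and {{shouldExtract}} are hypothetical names, and the file system is typed as a plain Object so the example stays self-contained. Identity comparison is an assumption that fits the case described above, where every attachment resolves to the very same file system instance.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical helper (not part of Tika): tracks which file system
// instances have already had their macros extracted in this parse.
class MacroExtractionGuard {

    // Identity-based set: two references count as the same entry only
    // if they point at the same object, regardless of equals()/hashCode().
    private final Set<Object> seen =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // Returns true the first time a given instance is offered and
    // false on every later call with that same instance.
    boolean shouldExtract(Object fileSystem) {
        return seen.add(fileSystem);
    }
}
```

A caller would create one guard per top-level parse and wrap the existing call, for example {{if (guard.shouldExtract(root.getFileSystem())) { ... }}}, so that each attachment of {{_1152432709.xls}} after the first skips the macro extraction.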