On Tue, 13 Oct 2020, Tim Allison wrote:
Ha, yes, this file exercises those bits of code:
https://github.com/apache/tika/blob/main/tika-parser-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testPPT_oleWorkbook.ppt
Nick, does this match the features of the SO question?
Yup,
On Tue, Oct 13, 2020 at 10:58 AM Tim Allison wrote:
Based on
https://github.com/apache/tika/blob/main/tika-parser-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java#L518
and
https://github.com/apache/tika/blob/main/tika-parser-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/pa
Thank you, Nick!
IIUC the XLSX raw bytes are in the Package entry of an OLE2 wrapper. What
is the key for the OLE2 wrapper in the PPT? Sorry for missing this...
Have you put your hands on an example that you could share privately?
Happy to look through our regression corpus if I know what exact
On Fri, 9 Oct 2020, Tim Allison wrote:
Do you think we should follow up on the Tika side? Do we know if we can
handle this?
I thought we did, but checking POIFSContainerDetector, I can't actually see
that case covered.
I think we (Tika) can handle it in a similar way to CompObj.
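To illustrate the wrapping discussed in this thread (XLSX bytes stored in a
"Package" entry of an OLE2 container), here is a minimal byte-level sketch. It
is not Tika or POI code: the class ContainerSniffer and its methods are
hypothetical names made up for illustration. It assumes the caller has already
read the first bytes of the outer file and of the "Package" entry by some other
means (for example with POI's POIFSFileSystem), and it only compares the
well-known OLE2 and ZIP signatures.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Tika or POI.
final class ContainerSniffer {

    enum Kind { OLE2, ZIP, UNKNOWN }

    // OLE2 compound document signature (first 8 bytes of the file).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature "PK\003\004"; XLSX files are ZIP archives.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Classify a buffer by its leading bytes.
    static Kind sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    // True when the outer file is OLE2 and the bytes taken from its
    // "Package" entry start like a ZIP archive, i.e. possibly wrapped OOXML.
    static boolean looksLikeWrappedZip(byte[] outerHead, byte[] packageHead) {
        return sniff(outerHead) == Kind.OLE2 && sniff(packageHead) == Kind.ZIP;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data != null
            && data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] outer = { (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                         (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1 };
        byte[] inner = { 0x50, 0x4B, 0x03, 0x04 };
        System.out.println(sniff(outer));
        System.out.println(sniff(inner));
        System.out.println(looksLikeWrappedZip(outer, inner));
    }
}
```

A ZIP signature alone does not prove the payload is an XLSX; a real detector
would still need to inspect the package contents before settling on a type.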
Over on
Nick,
Do you think we should follow up on the Tika side? Do we know if we can
handle this?
---------- Forwarded message ---------
From: Nick Burch
Date: Fri, Oct 9, 2020 at 4:43 PM
Subject: XLSX wrapped in an OLE2 CompObj/Package - should WorkbookFactory
handle it?
To:
Hi All
Over on Stack