Current reports are here: https://corpora.tika.apache.org/base/reports/tika-2.8.1-rand1m-xyz.tgz
I expect a bunch of OLE2 files will have fewer attachments because we're no longer duplicating/triplicating macros. I haven't had a chance to look, but will look tomorrow.

On Tue, Aug 15, 2023 at 11:29 AM Tim Allison <[email protected]> wrote:

> All,
>
> I'm back from vacation. I had really hoped to run this release before I
> left, but TIKA-4091 and TIKA-4048 left some surprises without quick fixes
> available.
>
> I'd like to fix small regressions left behind in TIKA-4091 (case
> insensitive object names in OLE2), the new TIKA-4116 (duplicate macros in
> some OLE2) and TIKA-4048 (the regression caused by setting extract all in
> compressor parsers).
>
> With those changes, I think we should increment the minor version -> 2.9.0.
>
> Any blockers left for the next release? Any objections to the version
> choice?
>
> Best,
>
> Tim
