The govdocs file has 1290 MACRO (javascript) "attachments" with Tika
1.26-SNAPSHOT and 930 with Tika 1.25.  I have no idea why the more
recent version of Tika finds more macros, but they do count as
"attachments", broadly speaking.
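
For anyone who wants to check counts like these on a single file, here
is a rough sketch, not the code behind the reports. It assumes you have
saved tika-app's recursive JSON output (the -J option) to a file, and
that embedded entries carry a metadata key named "embeddedResourceType"
with values such as MACRO. That key name is my assumption about the
JSON, and load_report and count_embedded_types are just illustrative
helper names.

```python
import json
from collections import Counter

# Assumption: this is the metadata key Tika uses to label the type of an
# embedded resource in its recursive JSON output; adjust if yours differs.
TYPE_KEY = "embeddedResourceType"


def load_report(path):
    """Load a saved recursive-JSON report (a list of metadata dicts)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_embedded_types(metadata_list):
    """Tally embedded entries by type, skipping the container (first entry)."""
    counts = Counter()
    for entry in metadata_list[1:]:
        value = entry.get(TYPE_KEY, "UNSPECIFIED")
        # Multi-valued fields may be serialized as lists; take the first value.
        if isinstance(value, list):
            value = value[0] if value else "UNSPECIFIED"
        counts[value] += 1
    return counts
```

Calling count_embedded_types(load_report("966679.json")) on such a
report (the file name is hypothetical) would show how many of the
embedded entries are macros versus ordinary attachments.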

I'll look into the NPEs.  If those are a Java bug, I don't think they
are a blocker.

Still working on the open office document issues...
LIBRE_OFFICE-45041-0.ods is showing some weird behavior.

On Tue, Mar 23, 2021 at 2:58 PM Tilman Hausherr <[email protected]> wrote:
>
> On 23.03.2021 at 17:31, Tim Allison wrote:
> > Reports are available here:
> > https://corpora.tika.apache.org/base/reports/1_25_v_1_26.tgz
>
>
> govdocs1/966/966679.pdf
>
> claims to have 360 attachments more than last time. I don't see a single
> attachment, and when I run tika-app with "--extract" I get nothing???
>
>
> There are also some NPEs for BMP files, seems to be a java bug.
>
>
> Tilman
>
