On 26.04.2022 at 13:07, Tim Allison wrote:
Reports are here:
https://corpora.tika.apache.org/base/reports/reports-tika-1.28.2-SNAPSHOT.tgz

I found two issues that should be fixed (TIKA-3733 and TIKA-3734).  I
think both are related to the underlying parsers being stricter (which
is good), but we need to change our code to handle these cases more
robustly.

Let me know if you see anything else.

What about commoncrawl3/3X/3X4JRZZ4TQ2GK4QQDQEXMFCVLM3FM5I4? This is also a rar file, and it is the last entry in content_diffs_no_exceptions.xlsx. Is that related to TIKA-3734?

Tilman
