Reports are here: https://corpora.tika.apache.org/base/reports/reports-tika-1.28.2-SNAPSHOT.tgz
I found two issues that should be fixed (TIKA-3733 and TIKA-3734). I think both are related to the underlying parsers being stricter (which is good), but we need to change our code to handle these cases more robustly. Let me know if you see anything else.