> Is it reasonable to assume that detect-only is much safer?

Safer, yes. Safe, no.
At one point, a user reported a feature of the Compressor/Package detector
that read a 5 (?) byte file and allocated 2GB (?):
https://issues.apache.org/jira/browse/TIKA-2330

If you're only doing detection with the Mime detector (e.g., you do not have
tika-parsers on your classpath), you're probably quite safe. However, if you
do have tika-parsers on your classpath, some parsers will be run to determine
subtypes of container file formats such as zip.

On Fri, May 28, 2021 at 10:12 AM John Ulric <[email protected]> wrote:
>
> > For now, the general advice is documented at:
> > https://cwiki.apache.org/confluence/display/TIKA/The+Robustness+of+Apache+Tika
>
> General question here: How would you compare the robustness in that respect
> between calling the full parsers vs. detect-only? Is it reasonable to assume
> that detect-only is much safer?
>
> John
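To make the "tiny file, huge allocation" risk concrete, here is a rough sketch of that general class of problem. It is not the actual code path from TIKA-2330 and it uses no Tika or commons-compress APIs. The class name, the 4-byte length header, and the 1 MB cap are all made up for illustration. The point is only that a detector which trusts a length field in the first few bytes can be told to allocate far more than the file contains, unless the declared length is checked first.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only: not Tika code, not the TIKA-2330 code path.
final class BoundedHeaderRead {

    // Assumed cap for this sketch; a real limit depends on the format.
    static final int MAX_DECLARED_LENGTH = 1 << 20; // 1 MB

    // Reads a 4-byte big-endian length, then that many bytes.
    // The declared length is validated BEFORE the array is allocated.
    static byte[] readDeclaredBlock(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt();
        if (declared < 0 || declared > MAX_DECLARED_LENGTH) {
            throw new IOException("declared length " + declared
                    + " exceeds limit " + MAX_DECLARED_LENGTH);
        }
        byte[] block = new byte[declared]; // safe: bounded by the cap above
        data.readFully(block);             // throws EOFException if the file is short
        return block;
    }

    public static void main(String[] args) {
        // A 5-byte "file": length field 0x7FFFFFFF (about 2GB) plus one payload byte.
        byte[] tiny = {0x7F, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, 0x00};
        try {
            readDeclaredBlock(new ByteArrayInputStream(tiny));
            System.out.println("accepted");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the length check, the `new byte[declared]` line would try to allocate roughly 2GB for a 5-byte input. A guard like this only helps in code you control, which is why detection that reaches into parsers or third-party detectors is safer than full parsing but still not safe.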
