Sebastian Nagel created NUTCH-3006:
--------------------------------------

             Summary: Downgrade Tika dependency to 2.2.1 (core and parse-tika)
                 Key: NUTCH-3006
                 URL: https://issues.apache.org/jira/browse/NUTCH-3006
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.20
            Reporter: Sebastian Nagel
             Fix For: 1.20

Tika 2.3.0 and later depend on commons-io 2.11.0 (or even higher), which is not available when Nutch is run on Hadoop. Only Hadoop 3.4.0 is expected to ship with commons-io 2.11.0 (HADOOP-18301); all currently released versions provide commons-io 2.8.0. Because Hadoop-required dependencies are enforced in (pseudo-)distributed mode, using Tika may cause issues, see NUTCH-2937 and NUTCH-2959.

[~lewismc] suggested in the discussion of [GitHub PR #776|https://github.com/apache/nutch/pull/776] to downgrade to Tika 2.2.1 to resolve these issues for now, until Hadoop 3.4.0 becomes available.
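To check which commons-io version a given runtime classpath actually provides, a small diagnostic along the following lines may help. It is only a sketch and not part of Nutch: the class name `CommonsIoVersionCheck` and the helper `compareVersions` are hypothetical, and it assumes the commons-io jar declares an `Implementation-Version` in its manifest (otherwise the reported version is `null`). Versions are compared numerically because a plain string comparison would order "2.8.0" after "2.11.0".

```java
import java.security.CodeSource;

public class CommonsIoVersionCheck {

    /** Compares two dotted numeric versions component by component. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing components count as 0, so "2.11" equals "2.11.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Minimum commons-io version named in this issue for Tika 2.3.0 and later.
        String required = "2.11.0";
        try {
            // Looked up reflectively so the check also runs without commons-io present.
            Class<?> c = Class.forName("org.apache.commons.io.IOUtils");
            Package p = c.getPackage();
            String version = (p == null) ? null : p.getImplementationVersion();
            CodeSource src = c.getProtectionDomain().getCodeSource();
            System.out.println("commons-io loaded from: "
                + (src == null ? "unknown" : String.valueOf(src.getLocation())));
            System.out.println("commons-io version: " + version);
            // Only compare when the manifest version is purely numeric.
            if (version != null && version.matches("\\d+(\\.\\d+)*")
                && compareVersions(version, required) < 0) {
                System.out.println("older than " + required
                    + ", Tika 2.3.0 and later may fail on this classpath");
            }
        } catch (ClassNotFoundException e) {
            System.out.println("commons-io is not on the classpath");
        }
    }
}
```

Run with the same classpath as the job in question, the output shows which jar wins and whether it is older than the version Tika expects.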