[ https://issues.apache.org/jira/browse/NUTCH-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770320#comment-17770320 ]
Sebastian Nagel commented on NUTCH-3006:
> revert CloseShieldInputStream.wrap(), which I think was the only conflict
Yes, it looks like that was the only conflict. If reverting it is an option,
then yes, why not.
The idea behind the downgrade was mainly to keep this issue from blocking a
release. And downgrading from 2.3.0 (current master) to 2.2.1 sounds less
dramatic.
> how far out Hadoop 3.4.0 is
Even once it's released, it will take some time (a couple of months) until
Hadoop distributions (for example Apache Bigtop) pick up the release and/or
users deploy it.
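
For anyone who wants to check whether a given runtime classpath is affected by the CloseShieldInputStream.wrap() conflict mentioned above, here is a minimal sketch. It assumes, to the best of my knowledge, that the static wrap(InputStream) method was only added in commons-io 2.9.0, so it would be missing with the commons-io 2.8.0 that Hadoop provides. The class name CommonsIoCheck and the helper hasWrapMethod are made up for illustration and are not part of Nutch or Tika. The check uses reflection only, so it compiles and runs even when commons-io is not on the classpath.

```java
import java.io.InputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical diagnostic, not part of Nutch, Tika or commons-io.
public class CommonsIoCheck {

    /**
     * Returns true if the named class can be loaded and declares a
     * public static method wrap(InputStream); false otherwise.
     */
    static boolean hasWrapMethod(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Method m = cls.getMethod("wrap", InputStream.class);
            return Modifier.isStatic(m.getModifiers());
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Class not on the classpath, or too old to have wrap().
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.commons.io.input.CloseShieldInputStream";
        System.out.println(name + ".wrap(InputStream) available: "
                + hasWrapMethod(name));
    }
}
```

Running it with the same classpath as the Hadoop job would show whether code calling wrap() can be expected to fail there with a NoSuchMethodError.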
> Downgrade Tika dependency to 2.2.1 (core and parse-tika)
>
>
> Key: NUTCH-3006
> URL: https://issues.apache.org/jira/browse/NUTCH-3006
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.20
> Reporter: Sebastian Nagel
> Priority: Major
> Fix For: 1.20
>
>
> Tika 2.3.0 and upwards depend on commons-io 2.11.0 (or even higher), which
> is not available when Nutch is used on Hadoop. Only Hadoop 3.4.0 is expected
> to ship with commons-io 2.11.0 (HADOOP-18301); all currently released
> versions provide commons-io 2.8.0. Because Hadoop-required dependencies are
> enforced in (pseudo)distributed mode, using Tika may cause issues; see
> NUTCH-2937 and NUTCH-2959.
> [~lewismc] suggested in the discussion of [GitHub PR
> #776|https://github.com/apache/nutch/pull/776] to downgrade to Tika 2.2.1 to
> resolve these issues for now, until Hadoop 3.4.0 becomes available.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)