Hello,

I'm currently running Nutch under Amazon EMR 5.12.0 with Hadoop 2.8.3 using
S3 (EMRFS) as the filesystem.  If I build the latest version from the
master branch and run a crawl in distributed mode, I get a fetcher error
like:

  fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException: Wrong FS: s3:..., expected: hdfs://...

This problem was reported in NUTCH-2494 and fixed in PR-274, and indeed,
when I run the same crawl using a build of commit 87c7a2e, it works without
error.  So my question is: has a regression been introduced, or am I
missing something?

Regards,

John
