Hi!

I've been using Nutch for a while, but I'm new to Hadoop. I have a cluster
with Hadoop 3.2.3 installed.

Do I have to install Nutch on the Hadoop filesystem, or can I run it
"local"? The clients shouldn't need anything more from Nutch than what I
pass on the command line on the master:

    hadoop jar /home/debian/nutch40/lib/apache-nutch-1.19.jar \
      org.apache.nutch.tools.FreeGenerator \
      -conf /home/debian/nutch40/conf/nutch-default.xml \
      -Dplugin.folder=/home/debian/nutch40/plugins/ \
      /crawl/urls//tranco-top350k-20221007.txt \
      /home/debian/crawl/segments/

The command fails with this error:

Exception in thread "main" java.lang.RuntimeException: FreeGenerator job
did not succeed, job id: job_1665751705815_0007, job status: FAILED,
reason: Task failed task_1665751705815_0007_m_000000


Since I'm new to Hadoop, I haven't yet worked out where to find its logs.

Is there a guide on how to install Nutch (1.19) on Hadoop that I haven't found?

Thanks
Mike
