Hi Micah,

What is your configuration? Do you have multiple nodes, or is it a
single machine? Which version of the Hadoop library are you using?
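
Also, the "Job failed!" from JobClient.runJob is generic; the real
cause is usually buried in the task logs. Assuming you are on the
default local setup with the stock conf/log4j.properties, the dedup
job's underlying exception should end up in logs/hadoop.log, so
something like

    grep -B 2 -A 20 -i exception logs/hadoop.log | tail -60

run from your Nutch directory should show what the job actually
died on.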

des

On 7/30/07, Micah Vivion <[EMAIL PROTECTED]> wrote:
> Greetings,
>
> So this one has me stumped a little bit. I am running a fairly simple
> Nutch crawl on our local intranet site or on our partners' intranet
> sites. Every now and then, when doing 'bin/nutch crawl urlfile -dir
> webindex/ -depth 5', I get this exception:
> Optimizing index.
> Indexer: done
> Dedup: starting
> Dedup: adding indexes in: /home/mvivion/webindex/target.com/indexes
> Exception in thread "main" java.io.IOException: Job failed!
>          at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>          at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
>          at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
>
> Has anyone seen this before? Any suggestions for resolving this crash?
>
> Thanks!!!
>

