Greetings,
So this one has me stumped a little bit. I am running a fairly simple
Nutch crawl against our local intranet site or one of our partners'
intranet sites. Every now and then, when running 'bin/nutch crawl
urlfile -dir webindex/ -depth 5', I get the following exception:
Optimizing index.
Indexer: done
Dedup: starting
Dedup: adding indexes in: /home/mvivion/webindex/target.com/indexes
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
Has anyone seen this before? Any suggestions for resolving this crash?
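
If it helps to narrow things down, I am guessing the failing step can be
re-run on its own against the existing indexes with something like the
command below (assuming the stock bin/nutch script still maps the dedup
command to DeleteDuplicates; the path is the one from the log above):

    # re-run only the dedup step against the indexes the crawl produced
    bin/nutch dedup /home/mvivion/webindex/target.com/indexes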
Thanks!!!