Hi all,
I deployed two computers for a Nutch crawl as the 'NutchHadoop Tutorial'
describes. But when I run nutch crawl, the fetch job doesn't get any data at all.
This is a part of crawl.log:

 rootUrlDir = urls
threads = 6
depth = 7
Injector: starting
Injector: crawlDb: crawled/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: starting
Generator: segment: crawled/segments/20070912141603
Generator: Selecting best-scoring urls due for fetch.
Generator: Partitioning selected urls by host, for politeness.
Generator: done.
Fetcher: starting
Fetcher: segment: crawled/segments/20070912141603
Fetcher: done
CrawlDb update: starting
CrawlDb update: db: crawled/crawldb
CrawlDb update: segment: crawled/segments/20070912141603
CrawlDb update: Merging segment data into db.
CrawlDb update: done
Generator: starting
……

The day before yesterday the fetch job got some data, but the job failed quickly
at my first fetch. It printed this:
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:562)
        at org.apache.nutch.crawl.Crawl.Crawler(Crawl.java:135)
        at org.apache.nutch.crawl.Crawl.ReplyPNo1Command(Crawl.java:325)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:436)
Can you tell me what's wrong with my setup? Thanks a lot.

Oh, we use Red Hat 9.0 and Nutch 0.8.1.
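
In case it helps, this is roughly how I started the crawl. I reconstructed the command from the values in the log above (rootUrlDir = urls, threads = 6, depth = 7, output directory "crawled"), so the exact flags are my assumption:

```
# Assumed invocation, matching the log values above
bin/nutch crawl urls -dir crawled -threads 6 -depth 7
```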
