Hi, I've been using the latest trunk version on 3 machines with 3 tasktrackers and 80,000 starting URLs, trying to run a depth-2 crawl. The first loop always goes well. But the second loop (which has about 800,000 URLs in the fetch list) always fails in the middle or at the end of the Fetcher reduce phase with this error:

060124 073930 reduce 48%
060124 074000 reduce 49%
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:308)
    at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:347)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:111)

Sometimes it happens when reduce is at 100%.

These are my settings:

mapred.map.tasks=20
mapred.reduce.tasks=10

It seems this exception happens when the fetchlist grows and the mapred folder gets large. Could it be because the number of reduce tasks is greater than the number of tasktrackers? It also happens on a single machine with one tasktracker.

Thanks,
Mike