KuroSaka TeruHiko (JIRA) wrote:

[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12416945 ]
KuroSaka TeruHiko commented on NUTCH-266:
-----------------------------------------

I am experiencing pretty much the same symptom with the nightly builds from 
5/31/2006 through 6/14/2006, the most recent one I tested.
Here's the result of my "nutch crawl" run with DEBUG level log turned on.

2006-06-16 17:04:05,932 INFO  mapred.LocalJobRunner (LocalJobRunner.java:progress(140)) - C:/opt/nutch-060614/test/index/segments/20060616170358/crawl_parse/part-00000:0+62
2006-06-16 17:04:05,948 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(119)) - job_4wsxze
java.io.IOException: Couldn't rename /tmp/hadoop/mapred/local/map_5n5aid/part-0.out
       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:102)
Exception in thread "main" java.io.IOException: Job failed!
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:342)
       at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:55)
       at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

Prior to this fatal exception, I've seen many occurrences of this exception:
2006-06-16 17:04:05,854 INFO  conf.Configuration (Configuration.java:loadResource(397)) - parsing file:/C:/opt/nutch-060614/conf/hadoop-site.xml
<snip>

This isn't really an exception; it's there just to print the stack trace (so one 
can track who is calling it).



I do not intend to run Hadoop at all, so this hadoop-site.xml is empty.
It just has this empty element:
<configuration>
</configuration>

You should at least set values for 'mapred.system.dir' and 'mapred.local.dir' and point them to a directory that has enough space available. (I think they default
to somewhere under /tmp, at least on my system, which is far too small for larger jobs.)
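
Something along these lines in hadoop-site.xml should do it. This is only a minimal sketch: the two property names are the ones mentioned above, but the directory paths are placeholders I made up for illustration, so substitute locations that exist on your machine and have enough free space.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Placeholder path (assumption): pick a directory on a volume with enough free space. -->
  <property>
    <name>mapred.system.dir</name>
    <value>C:/opt/nutch-060614/hadoop-tmp/mapred/system</value>
  </property>
  <!-- Placeholder path (assumption): intermediate map output is written here,
       so this is the directory most likely to fill up on larger jobs. -->
  <property>
    <name>mapred.local.dir</name>
    <value>C:/opt/nutch-060614/hadoop-tmp/mapred/local</value>
  </property>
</configuration>
```

Values set in hadoop-site.xml override the shipped defaults, so these two entries are enough to move the job files out of /tmp.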

--
Sami Siren
