Hello,

I downloaded the latest release of Nutch 1.0 from
http://apache.seekmeup.com/lucene/nutch/

I extracted it and did just the basic configuration of nutch-site.xml and
crawl-urlfilter.txt.
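By "basic configuration" I mean I only set the agent name in nutch-site.xml,
roughly like this (NutchTest is just a placeholder value I picked):

<configuration>
  <property>
    <name>http.agent.name</name>
    <value>NutchTest</value>
  </property>
</configuration>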
I have only one domain in my urls.txt file, and the same domain in
crawl-urlfilter.txt.
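Concretely, with example.com standing in for my real domain, urls.txt
contains a single seed URL:

http://www.example.com/

and crawl-urlfilter.txt has the matching accept rule (the stock pattern
from the default file with my domain filled in):

+^http://([a-z0-9]*\.)*example.com/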

When I try to run either of the following two commands:
# bin/nutch inject crawl/crawldb ./urls.txt
OR
# bin/nutch crawl ./urls.txt -dir crawl -depth 3 -topN 50 

I get an error.

Here is the error in my hadoop.log:
2009-03-29 10:43:12,391 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)


Can anyone help me with this please?

Thanks in advance,
Norton