Hi all, I am crawling a list of 300K URLs, and after fetching around 200K of them I get a "java.io.IOException: Spill failed" error. The stack trace is below.
Would anyone have some insight into what I am running into and how I can overcome this issue?

Thanks in advance,
Bhawna

Stack Trace:

2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - java.io.IOException: Spill failed
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:899)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:647)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill26.out
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
2011-03-09 23:59:59,752 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed

