Hi, I am currently using Nutch 1.8. While re-crawling a URL, I keep getting errors of this kind: "fetcher caught:java.io.IOException: Spill failed". The error shows up repeatedly during fetching, as shown below, but it does not stop the fetching process. I am not sure how to handle it.
People who reported this error on various forums were running a Hadoop cluster, whereas in my case I am *not* using any cluster. All I have is *Nutch 1.8* with the local filesystem on a single server.

2015-04-06 14:57:39,160 INFO fetcher.Fetcher - fetching http://examplenutch/pics/pages/p0000108.shtml (queue crawl delay=5000ms)
2015-04-06 14:57:39,340 INFO fetcher.Fetcher - fetching http://examplenutch/pages/forest%20underburn.htm (queue crawl delay=5000ms)
2015-04-06 14:57:39,368 INFO fetcher.Fetcher - fetching http://examplenutch/elevation_contours_20.e00 (queue crawl delay=5000ms)
2015-04-06 14:57:39,488 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed
2015-04-06 14:57:39,591 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed
2015-04-06 14:57:39,592 INFO fetcher.Fetcher - fetching http://examplenutch/newsletters/2010-mar.shtml (queue crawl delay=5000ms)
2015-04-06 14:57:39,840 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed
2015-04-06 14:57:39,841 INFO fetcher.Fetcher - fetching http://examplenutch/scifi83.pdf (queue crawl delay=5000ms)
2015-04-06 14:57:39,979 INFO fetcher.Fetcher - fetching http://example.com/playground.pdf (queue crawl delay=5000ms)
2015-04-06 14:57:40,033 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=46, fetchQueues.totalSize=2498
2015-04-06 14:57:40,172 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed
2015-04-06 14:57:40,722 ERROR fetcher.Fetcher - fetcher caught:java.io.IOException: Spill failed
2015-04-06 14:57:40,722 INFO fetcher.Fetcher - fetching http://examplenutch/watershed_scale.pdf (queue crawl delay=5000ms)

Please advise.

Thanks a bunch!!

