Hi,

As far as I know, you cannot recover the mapper output. In any case, a single fetcher should never run for three days. It is far better to generate a large number of smaller segments and fetch them sequentially, so that if an error occurs only a small portion of the crawl is affected. We never run a fetcher for more than one hour; instead we run many in a row, and sometimes concurrently.
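
A rough sketch of what such a loop of short cycles could look like with the Nutch 1.x command-line tools. The paths, the -topN value, and the number of rounds are placeholders I made up for illustration, not values from our setup; tune -topN so that one fetch finishes in about an hour on your hardware.

```shell
#!/bin/sh
# Sketch only: many small generate/fetch cycles instead of one multi-day fetch.
# All defaults below are illustrative assumptions; adjust to your installation.
NUTCH=${NUTCH:-bin/nutch}            # path to the nutch launcher (assumed)
CRAWLDB=${CRAWLDB:-crawl/crawldb}    # crawl database (assumed location)
SEGMENTS=${SEGMENTS:-crawl/segments} # parent directory of all segments (assumed)
TOPN=${TOPN:-50000}                  # URLs per segment; keeps each fetch short
ROUNDS=${ROUNDS:-10}                 # number of cycles to run in a row
THREADS=${THREADS:-25}               # fetcher threads, as in the original command

crawl_rounds() {
  i=1
  while [ "$i" -le "$ROUNDS" ]; do
    # Generate a small fetch list in a new timestamped segment.
    "$NUTCH" generate "$CRAWLDB" "$SEGMENTS" -topN "$TOPN" || return 1
    # Segment names are timestamps, so the last one in sort order is the newest.
    segment=$(ls -d "$SEGMENTS"/2* | tail -n 1)
    # Fetch, parse, and fold the results back into the crawldb.
    "$NUTCH" fetch "$segment" -threads "$THREADS" || return 1
    "$NUTCH" parse "$segment" || return 1
    "$NUTCH" updatedb "$CRAWLDB" "$segment" || return 1
    i=$((i + 1))
  done
}
```

Calling crawl_rounds at the end of the script runs the cycles one after another. If a round fails, the loop stops and only that one small segment has to be regenerated; the segments from earlier rounds are already fetched, parsed, and merged into the crawldb.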

Cheers,

-----Original message-----
> From:Mohammad wrk <mhd...@yahoo.com>
> Sent: Fri 26-Oct-2012 00:47
> To: user@nutch.apache.org
> Subject: How to recover data from /tmp/hadoop-myuser
>
> Hi,
>
> My fetch cycle (nutch fetch ./segments/20121021205343/ -threads 25) failed,
> after 3 days, with the error below. Under the segment folder
> (./segments/20121021205343/) there is only generated fetch list
> (crawl_generate) and no content. However /tmp/hadoop-myuser/ has 96G of data.
> I was wondering if there is a way to recover this data and parse the segment?
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/file.out
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
>     at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:69)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1640)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1323)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> 2012-10-24 14:43:29,671 ERROR fetcher.Fetcher - Fetcher: java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
>     at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1318)
>     at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1354)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1327)
>
> Thanks,
> Mohammad