Hi,

The path /tmp/hadoop-pat/mapred/local/archive/-4686065962599733460_1587570556_150738331/<snip> is a location used by the TaskTracker process for the DistributedCache - a mechanism for distributing files to all tasks running in a MapReduce job ( http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache ).
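Conceptually, the localized path is just the original absolute path re-rooted under a per-cache archive directory. A rough illustrative sketch follows; the archive subdirectory name is copied from Pat's log, Hadoop computes it internally, and this helper is not a real Hadoop API:

```java
import java.nio.file.Paths;

public class CacheLocalization {
    // Illustrative only: mimics how a file:// DistributedCache entry ends up
    // under mapred.local.dir. Hadoop strips the URI scheme and re-roots the
    // path as archive/<generated id>/file/<original path>. The id generation
    // is internal to Hadoop and is NOT reproduced here.
    static String localizedPath(String localDir, String cacheSubdir, String originalPath) {
        return Paths.get(localDir, "archive", cacheSubdir, "file",
                originalPath.replaceFirst("^/", "")).toString();
    }

    public static void main(String[] args) {
        System.out.println(localizedPath(
                "/tmp/hadoop-pat/mapred/local",
                "6590995089539988730_1587570556_37122331",
                "/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000"));
    }
}
```

This is why a perfectly valid input path can appear "rewritten" in the error: the job is reading the localized copy, not the original file, and in local (non-HDFS) mode that copy may never be created.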
You have mentioned Mahout, so I am assuming that the specific analysis job you are running uses this feature to distribute the file /Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 to the job that is failing. I have also found reports that the DistributedCache does not work in local (non-HDFS) mode ( http://stackoverflow.com/questions/9148724/multiple-input-into-a-mapper-in-hadoop ). Look at the second answer.

Thanks
hemanth

On Tue, Sep 4, 2012 at 10:33 PM, Pat Ferrel <pat.fer...@gmail.com> wrote:

> The job is creating several output and intermediate files, all under the
> location Users/pat/Projects/big-data/b/ssvd/. Several output directories
> and files are created correctly, and the file
> Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 is created and exists
> at the time of the error. We seem to be passing in
> Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 as the input file.
>
> Under what circumstances would an input path passed in as
> "Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000" be turned into
> "pat/mapred/local/archive/6590995089539988730_1587570556_37122331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000"
> ???
>
>
> On Sep 4, 2012, at 1:14 AM, Narasingu Ramesh <ramesh.narasi...@gmail.com> wrote:
>
> Hi Pat,
> Please specify the correct input file location.
>
> Thanks & Regards,
> Ramesh.Narasingu
>
> On Mon, Sep 3, 2012 at 9:28 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
>> I am using Hadoop with Mahout in a local-filesystem (non-HDFS)
>> configuration for debugging purposes inside IntelliJ IDEA. When I run one
>> particular part of the analysis I get the error below. I didn't write the
>> code, but we are looking for a hint about what might cause it. This job
>> completes without error in a single-node pseudo-clustered configuration
>> outside of IDEA.
>>
>> Several jobs in the pipeline complete without error, creating part files
>> just fine in the local file system.
>>
>> The file
>> /tmp/hadoop-pat/mapred/local/archive/6590995089539988730_1587570556_37122331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>> which is the subject of the error, does not exist.
>>
>> Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>> does exist at the time of the error. So the code is looking for the data
>> in the wrong place?
>>
>> ….
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 WARN mapred.LocalJobRunner: job_local_0002
>> java.io.FileNotFoundException: File
>> /tmp/hadoop-pat/mapred/local/archive/-4686065962599733460_1587570556_150738331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>> does not exist.
>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
>>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:92)
>>     at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.setup(BtJob.java:219)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> Exception in thread "main" java.io.IOException: Bt job unsuccessful.
>>     at org.apache.mahout.math.hadoop.stochasticsvd.BtJob.run(BtJob.java:609)
>>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:397)
>>     at com.finderbots.analysis.AnalysisPipeline.SSVDTransformAndBack(AnalysisPipeline.java:257)
>>     at com.finderbots.analysis.AnalysisJob.run(AnalysisJob.java:20)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>     at com.finderbots.analysis.AnalysisJob.main(AnalysisJob.java:34)
>> Disconnected from the target VM, address: '127.0.0.1:63483', transport: 'socket'
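If the DistributedCache limitation in local mode is indeed the cause, one workaround is to run the job against HDFS rather than the local filesystem. Assuming a default single-node pseudo-distributed setup like the one Pat says works outside of IDEA, this is a matter of pointing core-site.xml at the NameNode (host and port below are assumptions for a stock Hadoop 1.x install):

```xml
<!-- core-site.xml: use a (pseudo-distributed) HDFS instead of the local
     filesystem so DistributedCache localization behaves as on a cluster.
     hdfs://localhost:9000 is an assumed default, not taken from the thread. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

The input and output paths would then also need to refer to HDFS locations rather than paths under /Users/pat.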