I would not run both jobs under the same root directory / key prefix. Give each job its own namespace, so that the temporary directories they write cannot collide.
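
A rough sketch of what that could look like in your JobFlow definition. This assumes your Mahout build accepts a --tempDir argument; I believe the 0.4 jobs do, and a default of "temp" would match the temp/itemIDIndex path in your stack trace, but please check the job's --help output before relying on it. The directory names temp/itemSimilarity and temp/recommender are placeholders I made up; everything else is copied from your steps.

```json
[
  {
    "Name": "MR Step 2: Find similiar items",
    "HadoopJarStep": {
      "MainClass": "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob",
      "Jar": "s3n://recommendertest/mahout-core/mahout-core-0.4-job.jar",
      "Args": [
        "--input", "s3n://recommendertest/data/<jobid>/aggregateWatched/",
        "--output", "s3n://recommendertest/data/<jobid>/similiarItems/",
        "--tempDir", "temp/itemSimilarity",
        "--similarityClassname", "SIMILARITY_PEARSON_CORRELATION",
        "--maxSimilaritiesPerItem", "100"
      ]
    }
  },
  {
    "Name": "MR Step 3: Find items for user",
    "HadoopJarStep": {
      "MainClass": "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob",
      "Jar": "s3n://recommendertest/mahout-core/mahout-core-0.4-job.jar",
      "Args": [
        "--input", "s3n://recommendertest/data/<jobid>/aggregateWatched/",
        "--output", "s3n://recommendertest/data/<jobid>/userRecommendations/",
        "--tempDir", "temp/recommender",
        "--similarityClassname", "SIMILARITY_PEARSON_CORRELATION",
        "--numRecommendations", "100"
      ]
    }
  }
]
```

With separate temp directories the second step should no longer find a non-empty temp/itemIDIndex left behind by the first one. If --tempDir turns out not to be available in your version, clearing the temp directory between the two steps would be the fallback.
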
On Tue, Feb 8, 2011 at 4:34 PM, Thomas Söhngen <[email protected]> wrote:
> Hi fellow data crunchers,
>
> I am running a JobFlow with a step using
> "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob" and a
> following step using
> "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob". The first step
> works without problems, but the second one is throwing an Exception:
>
> Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory temp/itemIDIndex already exists and is not empty
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:124)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:818)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>     at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:165)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:328)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> It looks like the second job is using the same temporal output directories
> as the first job. How can I avoid this? Or even better: If some of the tasks
> are already done and cached in the first step, how could I use them so that
> they don't have to be recomputed in the second step?
>
> Best regards,
> Thomas
>
> PS: This is the actual JobFlow definition in JSON:
>
> [
>   [......],
>   {
>     "Name": "MR Step 2: Find similiar items",
>     "HadoopJarStep": {
>       "MainClass": "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob",
>       "Jar": "s3n://recommendertest/mahout-core/mahout-core-0.4-job.jar",
>       "Args": [
>         "--input", "s3n://recommendertest/data/<jobid>/aggregateWatched/",
>         "--output", "s3n://recommendertest/data/<jobid>/similiarItems/",
>         "--similarityClassname", "SIMILARITY_PEARSON_CORRELATION",
>         "--maxSimilaritiesPerItem", "100"
>       ]
>     }
>   },
>   {
>     "Name": "MR Step 3: Find items for user",
>     "HadoopJarStep": {
>       "MainClass": "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob",
>       "Jar": "s3n://recommendertest/mahout-core/mahout-core-0.4-job.jar",
>       "Args": [
>         "--input", "s3n://recommendertest/data/<jobid>/aggregateWatched/",
>         "--output", "s3n://recommendertest/data/<jobid>/userRecommendations/",
>         "--similarityClassname", "SIMILARITY_PEARSON_CORRELATION",
>         "--numRecommendations", "100"
>       ]
>     }
>   }
> ]
