Hi, I'm new to Hadoop and have been having difficulty adapting map and reduce to my needs. This is what I want to do:

1. Run multiple jobs chained to each other, so that the output of one map/reduce job is the input to the one after it.

2. Get multiple outputs from each map/reduce job (I've mostly figured this one out; I'll be using MultipleOutputFormat). For reduce outputs with a certain key, I want them appended to a single file, so that the successive map/reduce jobs in the chain always append to the output of the previous ones.
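To make goal 1 concrete, this is roughly the shape I'm trying to get to (just a sketch on my part: the ChainDriver class, the job names, and the paths are placeholders, the mapper/reducer setup is omitted, and I'm assuming the 0.20 org.apache.hadoop.mapreduce API, where each chained job gets its own Job object and its own output directory):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path("/user/h/in"); // placeholder input directory

        for (int i = 0; i < 2; i++) {
            // Each iteration builds and runs a *fresh* Job with its own output dir
            Job job = new Job(conf, "chain-" + i);
            // job.setMapperClass(...), job.setReducerClass(...), etc. omitted

            FileInputFormat.setInputPaths(job, in);
            Path out = new Path("/user/h/out" + i);
            FileOutputFormat.setOutputPath(job, out);

            // Actually run this job before constructing the next one
            if (!job.waitForCompletion(true)) {
                System.exit(1); // stop the chain if a job fails
            }

            in = out; // this job's output becomes the next job's input
        }
    }
}
```

The key points I'm aiming for are that every job in the loop is both configured and executed (waitForCompletion is called on the job created in that iteration), and that the output path of iteration i is fed back in as the input path of iteration i+1.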
For number 1, I did the following (in this case running two jobs, one after the other):

---
FileOutputFormat.setOutputPath(job, new Path("/user/h/out"));
for (int i = 0; i <= 1; i++) {
    System.out.println("Job number " + i);
    System.out.println((job.getConfiguration()).get("cus"));
    job.waitForCompletion(true);
    j = new Job(c);
    FileOutputFormat.setOutputPath(j, new Path("/user/h/out1"));
}
---

The job ran without displaying any errors, but when I check the output files for the jobs, I can see only the output directory for the first job and not the second one:

$ hadoop dfs -ls /user/h/out
Found 2 items
drwxr-xr-x   - h supergroup   0 2009-09-02 13:40 /user/h/out/_logs
-rw-r--r--   1 h supergroup  18 2009-09-02 13:41 /user/h/out/part-r-00000
$ hadoop dfs -ls /user/h/out1
ls: Cannot access /user/h/out1: No such file or directory.

That is my first problem. As for problem 2, I simply don't have any idea how to get the successive map and reduce jobs to append to a file. I've been breaking my head over this for the last two days. It would be really great if someone could help. Thanks!

H

--
View this message in context: http://www.nabble.com/Map-and-reduce-%3ARunning-multiple-jobs-and-multiple-outputs-tp25263907p25263907.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.