output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Kevin Burton
Is it possible to connect the output of one map reduce job so that it is the input to another map reduce job? Basically… then reduce() outputs a key, that will be passed to another map() function without having to store intermediate data to the filesystem. Kevin -- Founder/CEO Spinn3r.com Loc

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Arko Provo Mukherjee
Hi, I am not sure how you can avoid the filesystem, however, I did it as follows:

// For Job 1
FileInputFormat.addInputPath(job1, new Path(args[0]));
FileOutputFormat.setOutputPath(job1, new Path(args[1]));

// For job 2
FileInputFormat.addInputPath(job2, new Path(args[1]));
FileOutputFormat.setO
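The pattern in this reply (job 1's output path doubles as job 2's input path) can be tried out without a cluster. What follows is only a rough sketch in plain Java using local files, not the Hadoop API: `runJob1`, `runJob2`, `runChain` and the file names are hypothetical stand-ins, and the word-count logic is just an assumed example workload. In a real driver each step would presumably be a Hadoop job, and the driver would wait for the first to complete before starting the second.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local-file stand-in for two chained jobs; this is not Hadoop code.
class ChainedJobsSketch {

    // Stand-in for job 1: counts words in `input`, writes "word<TAB>count" lines to `output`.
    static boolean runJob1(Path input, Path output) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(input)) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        List<String> records = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            records.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(output, records);
        return true;
    }

    // Stand-in for job 2: reads job 1's records and counts how many words share each count.
    static boolean runJob2(Path input, Path output) throws IOException {
        Map<Integer, Integer> histogram = new TreeMap<>();
        for (String line : Files.readAllLines(input)) {
            int count = Integer.parseInt(line.split("\t")[1]);
            histogram.merge(count, 1, Integer::sum);
        }
        List<String> records = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : histogram.entrySet()) {
            records.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(output, records);
        return true;
    }

    // Driver: job 2 starts only after job 1 succeeds, and reads the path job 1 wrote.
    static List<String> runChain(List<String> lines) {
        try {
            Path dir = Files.createTempDirectory("chain-sketch");
            Path input = dir.resolve("input.txt");
            Path intermediate = dir.resolve("job1-out.txt"); // plays the role of args[1]
            Path result = dir.resolve("job2-out.txt");
            Files.write(input, lines);
            if (!runJob1(input, intermediate)) {
                throw new IllegalStateException("job 1 failed");
            }
            if (!runJob2(intermediate, result)) {
                throw new IllegalStateException("job 2 failed");
            }
            return Files.readAllLines(result);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String record : runChain(Arrays.asList("a b a", "b a c"))) {
            System.out.println(record);
        }
    }
}
```

Note that the intermediate data still goes through the filesystem here, as in the reply above; the sketch only shows the sequencing and the shared path, not a way to avoid the intermediate write.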

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Marcos Luis Ortiz Valmaseda
Have you considered Oozie for this? It's a workflow engine developed by Yahoo! engineers. Yahoo/oozie at GitHub https://github.com/yahoo/oozie Oozie at InfoQ http://www.infoq.com/articles/introductionOozie Oozie's examples: http://www.infoq.com/articles/oozieexample http://yahoo.github.com/oo

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Arun C Murthy
On Sep 27, 2011, at 12:09 PM, Kevin Burton wrote:
> Is it possible to connect the output of one map reduce job so that it is the
> input to another map reduce job.
>
> Basically… then reduce() outputs a key, that will be passed to another map()
> function without having to store intermediate d

Re: output from one map reduce job as the input to another map reduce job?

2011-09-27 Thread Mike Spreitzer
It looks to me like Oozie will not do what was asked. In http://yahoo.github.com/oozie/releases/3.0.0/WorkflowFunctionalSpec.html#a0_Definitions I see: 3.2.2 Map-Reduce Action ... The workflow job will wait until the Hadoop map/reduce job completes before continuing to the next action in the

Re: output from one map reduce job as the input to another map reduce job?

2011-09-28 Thread Niels Basjes
To me it sounds like the asker should check out tools like Storm and S4 instead of Hadoop. http://www.infoq.com/news/2011/09/twitter-storm-real-time-hadoop -- Kind regards, Niels Basjes On 27 Sep 2011 at 22:38, "Mike Spreitzer" wrote: > It looks to me like Oozie will not do