Within the Hadoop core project, there is a JobControl class you can use for this. Its API is documented at http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/jobcontrol/package-summary.html and it is fairly simple to use: create your jobs with the regular Java API, then build a dependency flow with JobControl on top of those JobConf objects.
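For illustration, here is a minimal sketch of that approach. The two JobConf objects are placeholders for your own fully configured jobs (mapper, reducer, input/output paths, etc.), and the class/chain names are made up:

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        // Assumed to be fully configured elsewhere (mapper, reducer, paths).
        JobConf conf1 = new JobConf();
        JobConf conf2 = new JobConf();

        // Wrap each JobConf in a jobcontrol Job so dependencies can be declared.
        Job first = new Job(conf1);
        Job second = new Job(conf2);
        second.addDependingJob(first); // second runs only if first succeeds

        JobControl control = new JobControl("two-step-chain");
        control.addJob(first);
        control.addJob(second);

        // JobControl is a Runnable; run it in its own thread and poll until done.
        Thread runner = new Thread(control);
        runner.start();
        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        control.stop();
    }
}
```

Note that JobControl itself runs on the client, so this driver process has to stay alive until the whole chain completes.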
Apache Oozie and other such tools offer higher-level abstractions for controlling a workflow, and are worth considering when your needs get more complex than a simple series of jobs (handling failure scenarios between dependent jobs cleanly, performing minor FS operations in pre/post processing, etc.).

On Thu, Sep 29, 2011 at 5:26 AM, Aaron Baff <aaron.b...@telescope.tv> wrote:
> Is it possible to submit a series of MR Jobs to the JobTracker to run in
> sequence (one finishes, take the output of that if successful and feed it
> into the next, etc), or does it need to run client side by using the
> JobControl or something like Oozie, or rolling our own? What I'm looking for
> is a fire & forget, and occasionally check back to see if it's done. So
> client-side doesn't need to really know anything or keep track of anything.
> Does something like that exist within the Hadoop framework?
>
> --Aaron

--
Harsh J
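For the Oozie route mentioned above, a workflow is an XML definition of actions and transitions that the Oozie server runs for you, which gives the fire-and-forget behavior asked about. A minimal two-step sketch follows; the action names, properties, and the `${jobTracker}`/`${nameNode}` parameters are illustrative placeholders, not a drop-in config:

```xml
<workflow-app name="two-step-chain" xmlns="uri:oozie:workflow:0.2">
    <start to="first-job"/>
    <action name="first-job">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- mapper/reducer classes, input/output dirs go here -->
        </map-reduce>
        <ok to="second-job"/>
        <error to="fail"/>
    </action>
    <action name="second-job">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- consumes the output of first-job -->
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Chain failed at [${wf:errorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Once submitted, the Oozie server owns the execution, so the client can disconnect and check back on status later.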