Within the Hadoop core project, there is a JobControl class you can use
for this. Its API is documented at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/jobcontrol/package-summary.html
and it is fairly simple to use: create the jobs with the regular Java
API, then build a dependency flow with JobControl on top of those
JobConf objects.
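A minimal sketch of that flow, assuming confA and confB are two already-configured JobConf objects where the second job reads the output path the first one writes (note JobControl runs client-side, in its own thread):

```java
import java.io.IOException;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class ChainedJobs {
  public static void main(String[] args)
      throws IOException, InterruptedException {
    // Placeholder configs -- in practice set mapper/reducer classes,
    // input/output paths, etc., with confB reading confA's output.
    JobConf confA = new JobConf();
    JobConf confB = new JobConf();

    Job jobA = new Job(confA);
    Job jobB = new Job(confB);
    jobB.addDependingJob(jobA); // jobB starts only if jobA succeeds

    JobControl control = new JobControl("my-chain");
    control.addJob(jobA);
    control.addJob(jobB);

    // JobControl is a Runnable; run it in a thread and poll for completion.
    Thread runner = new Thread(control);
    runner.start();
    while (!control.allFinished()) {
      Thread.sleep(1000);
    }
    control.stop();
  }
}
```

The class and path names above are illustrative; only the jobcontrol classes themselves come from the Hadoop API linked above.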

Apache Oozie and similar tools offer higher-level abstractions for
controlling a workflow, and are worth considering when your needs grow
more complex than a simple sequence (handling failure scenarios
between dependent jobs, performing minor FS operations in pre/post
processing, etc.).
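For comparison, an Oozie workflow expressing the same two-step chain is declared as XML and runs server-side, so the client can submit it and walk away. A bare-bones sketch (the ${jobTracker}/${nameNode} parameters and the second action's body are placeholders you would fill in):

```xml
<workflow-app name="two-step-chain" xmlns="uri:oozie:workflow:0.2">
  <start to="first-job"/>
  <action name="first-job">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <!-- mapper/reducer classes, input/output paths, etc. -->
      </configuration>
    </map-reduce>
    <ok to="second-job"/>
    <error to="fail"/>
  </action>
  <!-- second-job: same shape, reading first-job's output, ok -> end -->
  <kill name="fail">
    <message>First job failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```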

On Thu, Sep 29, 2011 at 5:26 AM, Aaron Baff <aaron.b...@telescope.tv> wrote:
> Is it possible to submit a series of MR Jobs to the JobTracker to run in 
> sequence (one finishes, take the output of that if successful and feed it 
> into the next, etc), or does it need to run client side by using the 
> JobControl or something like Oozie, or rolling our own? What I'm looking for 
> is a fire & forget, and occasionally check back to see if it's done. So 
> client-side doesn't need to really know anything or keep track of anything. 
> Does something like that exist within the Hadoop framework?
>
> --Aaron
>



-- 
Harsh J