Re: Cascading jobs in hadoop

2009-10-17 Thread Kevin Weil
Bharath, the mapred package is largely deprecated, as Hadoop is moving towards the mapreduce package. Use mapreduce for any new jobs you write, because mapred will go away in some future release. For now, both are there to give developers time to rewrite existing older jobs. Kevin On Sat, Oct 3

Re: Cascading jobs in hadoop

2009-10-03 Thread bharath vissapragada
Tom and Chris, thanks for your replies. I have seen the o.a.h.mapred.jobcontrol.Job and o.a.h.mapreduce.Job classes. Only one of them has the above option of adding dependent jobs. Can anyone tell me the difference between the "mapred" and "mapreduce" packages? Thanks in advance. On 10/2/09, Chris

Re: Cascading jobs in hadoop

2009-10-02 Thread Chris K Wensel
You might find the Cascading project quite useful in this regard: http://www.cascading.org/ Using the MapReduceFlow and CascadeConnector classes, you can chain arbitrary MR jobs together. Cascading will determine the dependencies, if any, and run the jobs in topological order (independent jobs w
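To make "run the jobs in topological order" concrete, here is a rough sketch in plain Java. It is not Cascading's or Hadoop's code: the JobOrdering class, the topologicalOrder method, and the job names are all made up for illustration, and it only computes an order in which dependent jobs could be launched (Kahn's algorithm), without running anything.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hadoop or Cascading.
class JobOrdering {

    // deps maps each job name to the names of the jobs whose output it reads.
    // Returns the job names ordered so that every job comes after its inputs.
    static List<String> topologicalOrder(Map<String, List<String>> deps) {
        Map<String, Integer> unmet = new LinkedHashMap<>();          // unfinished inputs per job
        Map<String, List<String>> dependents = new LinkedHashMap<>(); // jobs waiting on each job
        for (String job : deps.keySet()) {
            unmet.put(job, 0);
            dependents.put(job, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : deps.entrySet()) {
            for (String upstream : entry.getValue()) {
                if (!deps.containsKey(upstream)) {
                    throw new IllegalArgumentException("Unknown job: " + upstream);
                }
                unmet.merge(entry.getKey(), 1, Integer::sum);
                dependents.get(upstream).add(entry.getKey());
            }
        }

        // Jobs with no unfinished inputs can start right away.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : unmet.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            // "Finishing" a job releases the jobs that were waiting on it.
            for (String waiting : dependents.get(job)) {
                if (unmet.merge(waiting, -1, Integer::sum) == 0) {
                    ready.add(waiting);
                }
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalStateException("Job dependencies contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("job3", Arrays.asList("job2"));       // job3 reads the output of job2
        deps.put("job2", Arrays.asList("job1"));       // job2 reads the output of job1
        deps.put("job1", Collections.<String>emptyList());
        System.out.println(topologicalOrder(deps));    // prints [job1, job2, job3]
    }
}
```

Jobs that do not depend on each other end up in the ready queue at the same time, which is the point at which a real scheduler could choose to launch them together.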

Re: Cascading jobs in hadoop

2009-10-02 Thread Tom White
Have a look at the JobControl class - this allows you to set up chains of job dependencies. Tom On Fri, Oct 2, 2009 at 11:29 AM, bharath v wrote: > Hi all, > > I have a set of MapReduce jobs which need to be cascaded, i.e., the output of MR > job1 is the input of MR job2, etc. > > Can anyone point me

Cascading jobs in hadoop

2009-10-02 Thread bharath v
Hi all, I have a set of MapReduce jobs which need to be cascaded, i.e., the output of MR job1 is the input of MR job2, etc. Can anyone point me to the corresponding classes in the Hadoop 0.20.0 API? I have seen the "x.addDependingJob(y)" function in Yahoo's Hadoop tutorial, but that is for the older versio