Thanks Matt.

Arko, if you plan to use Oozie, a simple coordinator job can do this. For example, the following schedules a workflow every 5 minutes that consumes the output produced by the previous run; you just need to have the initial data in place.
Thxs.

Alejandro

----
<coordinator-app name="coord-1" frequency="${coord:minutes(5)}" start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.1">
  <controls>
    <concurrency>1</concurrency>
  </controls>
  <datasets>
    <dataset name="data" frequency="${coord:minutes(5)}" initial-instance="${start}" timezone="UTC">
      <uri-template>${nameNode}/user/${coord:user()}/examples/${dataRoot}/${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="input" dataset="data">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <output-events>
    <data-out name="output" dataset="data">
      <instance>${coord:current(1)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>${nameNode}/user/${coord:user()}/examples/apps/subwf-1</app-path>
      <configuration>
        <property>
          <name>jobTracker</name>
          <value>${jobTracker}</value>
        </property>
        <property>
          <name>nameNode</name>
          <value>${nameNode}</value>
        </property>
        <property>
          <name>queueName</name>
          <value>${queueName}</value>
        </property>
        <property>
          <name>examplesRoot</name>
          <value>${examplesRoot}</value>
        </property>
        <property>
          <name>inputDir</name>
          <value>${coord:dataIn('input')}</value>
        </property>
        <property>
          <name>outputDir</name>
          <value>${coord:dataOut('output')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
------

On Mon, Jun 13, 2011 at 3:01 PM, GOEKE, MATTHEW (AG/1000) <matthew.go...@monsanto.com> wrote:

> If you know for certain that it needs to be split into multiple work units
> I would suggest looking into Oozie. Easy to install, light weight, low
> learning curve... for my purposes it's been very helpful so far. I am also
> fairly certain you can chain multiple job confs into the same run but I have
> not actually tried that therefore I can't promise it is easy or possible.
>
> http://www.cloudera.com/blog/2010/07/whats-new-in-cdh3-b2-oozie/
>
> If you are not running CDH3u0 then you can also get the tarball and
> documentation directly here:
> https://ccp.cloudera.com/display/SUPPORT/CDH3+Downloadable+Tarballs
>
> Matt
>
> -----Original Message-----
> From: Marcos Ortiz [mailto:mlor...@uci.cu]
> Sent: Monday, June 13, 2011 4:57 PM
> To: mapreduce-user@hadoop.apache.org
> Cc: Arko Provo Mukherjee
> Subject: Re: Programming Multiple rounds of mapreduce
>
> Well, you can define a job for each round and then, you can define the
> running workflow based in your implementation and to chain your jobs
>
> On 6/13/2011 5:46 PM, Arko Provo Mukherjee wrote:
> > Hello,
> >
> > I am trying to write a program where I need to write multiple rounds
> > of map and reduce.
> >
> > The output of the last round of map-reduce must be fed into the input
> > of the next round.
> >
> > Can anyone please guide me to any link / material that can teach me as
> > to how I can achieve this.
> >
> > Thanks a lot in advance!
> >
> > Thanks & regards
> > Arko
>
> --
> Marcos Luís Ortíz Valmaseda
> Software Engineer (UCI)
> http://marcosluis2186.posterous.com
> http://twitter.com/marcosluis2186
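
To make the chaining in the coordinator above easier to see, here is a rough Python sketch of how the input and output directories would line up from one run to the next. It is not Oozie code. It assumes that, with the coordinator and the dataset sharing the same start time and 5-minute frequency, coord:current(0) resolves to the dataset instance at the run's nominal time and coord:current(1) to the instance one frequency step later. The base path and the function names (current, instance_uri, plan_runs) are made up for illustration.

```python
from datetime import datetime, timedelta

# Mirrors frequency="${coord:minutes(5)}" on both the coordinator and the dataset.
FREQUENCY = timedelta(minutes=5)

# Hypothetical stand-in for ${nameNode}/user/${coord:user()}/examples/${dataRoot}.
BASE_PATH = "/user/alice/examples/data"


def current(nominal_time, n):
    """Assumed behaviour of coord:current(n): n dataset steps from the nominal time."""
    return nominal_time + n * FREQUENCY


def instance_uri(instant):
    """Fill in the ${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE} part of the uri-template."""
    return BASE_PATH + "/" + instant.strftime("%Y-%m-%d-%H-%M")


def plan_runs(start, count):
    """Return (inputDir, outputDir) pairs for the first `count` coordinator runs."""
    runs = []
    for k in range(count):
        nominal = start + k * FREQUENCY
        input_dir = instance_uri(current(nominal, 0))   # data-in: current(0)
        output_dir = instance_uri(current(nominal, 1))  # data-out: current(1)
        runs.append((input_dir, output_dir))
    return runs


if __name__ == "__main__":
    for input_dir, output_dir in plan_runs(datetime(2011, 6, 13, 15, 0), 3):
        print(input_dir, "->", output_dir)
```

Under these assumptions each run writes to the directory that the next run reads from, which is why only the first input instance (the one at the start time) has to exist before the coordinator is started.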