Hi,

Try Oozie for job coordination and workflows.
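
For comparison, the do-it-yourself "lock" idea from the quoted message below could look roughly like this. It is only a minimal sketch under stated assumptions: every machine runs its own local timer, and all of them try to atomically create one marker per 5-minute slot, so at most one of them starts the job. The names `slot_id`, `try_claim_slot` and `on_timer_tick` are hypothetical, and a local directory stands in for the shared store. In a real cluster the atomic create would have to go against something all machines can see, such as HDFS or ZooKeeper.

```python
import os
import tempfile

SLOT_SECONDS = 5 * 60  # one slot per 5-minute interval


def slot_id(now, slot_seconds=SLOT_SECONDS):
    """Map a Unix timestamp to the index of the slot it falls in."""
    return int(now // slot_seconds)


def try_claim_slot(lock_dir, slot):
    """Atomically create a marker file for the slot.

    O_CREAT | O_EXCL makes the create fail if the file already exists,
    so exactly one caller per slot gets True. Assumption: lock_dir is a
    local directory used as a stand-in for a shared store.
    """
    path = os.path.join(lock_dir, "slot-%d.lock" % slot)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def on_timer_tick(lock_dir, now, run_job):
    """Called by each machine's local timer; runs the job at most once per slot."""
    if try_claim_slot(lock_dir, slot_id(now)):
        run_job()
        return True
    return False


if __name__ == "__main__":
    # Three "machines" ticking in the same slot: only the first starts the job.
    with tempfile.TemporaryDirectory() as lock_dir:
        now = 1_000_000.0
        print([on_timer_tick(lock_dir, now, lambda: None) for _ in range(3)])
```

The sketch leaves out cleanup of old markers and what happens if the winning machine dies before it submits the job, which is part of why a ready-made coordinator is attractive.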
On Thu, Sep 1, 2011 at 12:30 PM, Per Steffensen <st...@designware.dk> wrote:

> Hi
>
> I use Hadoop for a MapReduce job in my system. I would like to have the job
> run every 5th minute. Is there any "distributed" timer job facility in Hadoop?
> Of course I could set up a timer in an external timer framework (cron or
> something like that) that invokes the MapReduce job. But cron only runs on
> one particular machine, so if that machine goes down my job will not be
> triggered. I could then set up the timer on all or many machines, but I
> would not like the job to run in more than one instance every 5th minute,
> so the timer jobs would need to coordinate who actually starts the job
> "this time", and all the rest would just have to do nothing.
> I guess I could come up with a solution to that, e.g. writing some "lock"
> logic using HDFS files or ZooKeeper. But I would really like it if someone
> had already solved the problem and provided some kind of "distributed timer
> framework" running in a "cluster", so that I could just register a timer
> job with the cluster and then be sure that it is invoked every 5th minute,
> even if one or two particular machines in the cluster are down.
>
> Any suggestions are very welcome.
>
> Regards, Per Steffensen
>

--
Ronen Itkin
Taykey | www.taykey.com