
Try Oozie for job coordination and workflows; its coordinator jobs can trigger a workflow on a time-based schedule.

On Thu, Sep 1, 2011 at 12:30 PM, Per Steffensen <st...@designware.dk> wrote:

> Hi
> I use Hadoop for a MapReduce job in my system. I would like to have the job
> run every 5th minute. Is there any "distributed" timer job facility in Hadoop?
> Of course I could set up a timer in an external timer framework (cron or
> something like that) that invokes the MapReduce job. But cron only runs
> on one particular machine, so if that machine goes down my job will
> not be triggered. I could then set up the timer on all or many machines, but
> I do not want the job to run in more than one instance every 5th
> minute, so the timer jobs would need to coordinate who actually
> starts the job "this time", and all the rest would have to do nothing.
> I guess I could come up with a solution to that, e.g. by writing some "lock"
> logic using HDFS files or ZooKeeper. But I would really like it if
> someone had already solved the problem and provided some kind of
> "distributed timer framework" running in a "cluster", so that I could just
> register a timer job with the cluster and then be sure that it is invoked
> every 5th minute, even if one or two particular machines in the cluster
> are down.
> Any suggestions are very welcome.
> Regards, Per Steffensen
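
If you do end up rolling the "lock" idea yourself, here is a rough sketch of
how it could look. It is only an illustration under assumptions: the class
name SlotLock, the methods slotFor and tryAcquire, and the lock file naming
are all made up for this example, and a local temp directory stands in for
the shared store. On a real cluster you would need an atomic create on
something every node can see (an HDFS file or a ZooKeeper node), which this
sketch does not cover. Every node runs the same timer, derives the same slot
number from the clock, and only the node that manages to create the slot's
lock file starts the job.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: one lock file per time slot, first creator wins.
public class SlotLock {

    // Maps a timestamp to the index of its period, so all nodes firing
    // within the same 5-minute window compute the same slot number.
    static long slotFor(long epochMillis, int periodMinutes) {
        return epochMillis / (periodMinutes * 60_000L);
    }

    // Tries to claim the slot. Files.createFile checks for existence and
    // creates the file as one atomic operation, so exactly one caller
    // gets true for a given slot.
    static boolean tryAcquire(Path lockDir, long slot) throws IOException {
        try {
            Files.createFile(lockDir.resolve("job-" + slot + ".lock"));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another node already claimed this slot
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a local temp directory simulates the shared lock store.
        Path lockDir = Files.createTempDirectory("job-locks");
        long slot = slotFor(System.currentTimeMillis(), 5);
        for (String node : new String[] {"node-a", "node-b", "node-c"}) {
            boolean won = tryAcquire(lockDir, slot);
            System.out.println(node + (won ? " starts the job" : " skips this slot"));
        }
    }
}
```

Note that this relies on the nodes having roughly synchronized clocks, and
old lock files would have to be cleaned up at some point.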

Ronen Itkin
Taykey | www.taykey.com
