You can write a script to parse the output of hadoop job -list and send an alert. The trick of building a retry into your workflow system is a nice one: if your program won't allow multiple copies to run at the same time, and you re-invoke it every, say, hour, then 5 retries implies that the previous invocation has been running for 5 hours.
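As a rough sketch of that approach, the Python below parses a job listing and flags jobs that have been running too long. The sample listing, the tab-separated column positions, and the 2-hour threshold are all assumptions for illustration; the real output format of hadoop job -list varies between Hadoop versions, so check yours before relying on the field indices.

```python
# Hypothetical sample of `hadoop job -list` output. The column layout
# (JobId, State, StartTime, UserName, ...) is an assumption and differs
# across Hadoop versions -- verify against your cluster's actual output.
SAMPLE = """\
2 jobs currently running
JobId\tState\tStartTime\tUserName\tPriority\tSchedulingInfo
job_201212220001_0001\t1\t1356200000000\tmohit\tNORMAL\tNA
job_201212220001_0002\t1\t1356207000000\tmohit\tNORMAL\tNA
"""

def long_running_jobs(listing, now_ms, max_runtime_ms):
    """Return ids of jobs whose runtime exceeds max_runtime_ms."""
    alerts = []
    for line in listing.splitlines():
        if not line.startswith("job_"):
            continue  # skip the summary and header lines
        fields = line.split("\t")
        job_id, start_ms = fields[0], int(fields[2])  # StartTime is epoch ms
        if now_ms - start_ms > max_runtime_ms:
            alerts.append(job_id)
    return alerts

# From cron you would feed it live output instead, e.g.:
#   listing = subprocess.check_output(["hadoop", "job", "-list"], text=True)
# and then mail/page on a non-empty result.
now = 1356210000000  # pretend "now", in epoch milliseconds
print(long_running_jobs(SAMPLE, now, 2 * 3600 * 1000))
# -> ['job_201212220001_0001']  (started ~2.8h before `now`)
```

Run it from cron every few minutes and pipe a non-empty result into whatever alerting channel you use (mail, Nagios, etc.).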
On Sat, Dec 22, 2012 at 12:49 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> Need alerting
>
> On Sat, Dec 22, 2012 at 12:44 PM, Mohammad Tariq <donta...@gmail.com> wrote:
>> MR web UI? Although we can't trigger anything, it provides all the info
>> related to the jobs. I mean it would be easier to just go there and
>> have a look at everything rather than opening the shell and typing the
>> command.
>>
>> I'm a bit lazy ;)
>>
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>>
>> On Sun, Dec 23, 2012 at 2:09 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>> Best I can find is hadoop job list so far
>>>
>>> On Sat, Dec 22, 2012 at 12:30 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>>> What's the best way to trigger alert when jobs run for too long or have
>>>> many failures? Is there a hadoop command that can be used to perform this
>>>> activity?